summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: clean up CP DMA emit codeMarek Olšák2016-09-131-84/+60
| | | | | | | | Unify the clear and copy paths, clean up the definitions. It looks more like a rework. It's a preparation for GDS support, which might or might not come. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: print the IB and buffer list in VM fault reportsMarek Olšák2016-09-131-1/+2
| | | | | | This is a fallout from reworking the debug flags. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add sampler view BOs to the BO list lastMarek Olšák2016-09-131-7/+10
| | | | | | | | | If si_sampler_view_add_buffer ends up flushing, then the code in begin_new_cs would previously have added the buffer(s) for whatever was previously bound to that slot. Now it would add only the new buffer. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: export SampleMask from pixel shaders at full rateMarek Olšák2016-09-133-16/+56
| | | | | | | Heaven and Valley write gl_SampleMask and not Z. Use 16_ABGR instead of 32_ABGR if Z isn't written. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: set new r600_resource fields correctly in other places tooMarek Olšák2016-09-131-0/+11
| | | | | | | | | | | | | This was missed in: commit 0d2e43fcb1198a6e67c85feadb1ca8c360ddc284 Author: Marek Olšák <[email protected]> Date: Thu Aug 18 16:30:00 2016 +0200 gallium/radeon: derive buffer placement and flags only at initialization Tested-by: Michel Dänzer <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug: dump shader buffers and imagesMarek Olšák2016-09-133-3/+49
| | | | | | this was unimplemented Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug: fix a crash in resource_get_handleMarek Olšák2016-09-131-1/+1
| | | | | | broken recently Reviewed-by: Nicolai Hähnle <[email protected]>
* radeon: Don't check DCC on pipe buffersJan Vesely2016-09-131-3/+4
| | | | | | | | | Fixes segfaults in EG compute since: commit 21de3be8e62b2b093569a99550e6356ed2f106b4 radeonsi: fix texture format reinterpretation with DCC Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* vl/util: Fix YV12/I420 convert to NV12 U/V reversalAndy Furniss2016-09-131-1/+1
| | | | | | | | Fix VAAPI YV12/I420 convert to NV12 U/V reversal. Input order is YVU when this is called. Signed-off-by: Andy Furniss <[email protected]> Reviewed-by: Boyuan Zhang <[email protected]>
* anv/allocator: Use VG_NOACCESS_WRITE in anv_bo_pool_freeJason Ekstrand2016-09-131-2/+4
| | | | | | | | | | | | Previously, we were relying on the fact that VALGRIND_MEMPOOL_FREE came later on in the function to prevent "link->bo = bo" from causing an invalid write. However, in the case where the size requested by the user is very small (less than sizeof(struct anv_bo)), this isn't sufficient. Instead, we should call VALGRIND_MEMPOOL_FREE early and then use VG_NOACCESS_WRITE. We do, however, have to call VALGRIND_MEMPOOL_FREE after reading bo_in because it may be stored in the bo itself. Signed-off-by: Jason Ekstrand <[email protected]>
* intel/isl: Ignore base_array_layer and array_len for 3D storage surfacesJason Ekstrand2016-09-131-2/+6
| | | | | | | | | | | | | | The time we want to restrict the Z range of a 3-D surface is when rendering to it. For storage surfaces, we always want he full range. However, we still need to set MinimumArrayElement and RenderTargetViewExtent to sensible values so we'll just set them to the reasonable defaults we used before we started respecting the base_array_layer and array_len. This fixes a bunch of Vulkan CTS regressions caused by 48f195d7c6483ed. Signed-off-by: Jason Ekstrand <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97790 Reviewed-by: Chad Versace <[email protected]>
* i965: Use blorp_copy for all copy_image operations on gen6+Jason Ekstrand2016-09-121-22/+6
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965/blorp: Add a copy_miptrees helperJason Ekstrand2016-09-122-0/+79
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* intel/isl: Add support for RGB formats in X and Y-tiled memoryJason Ekstrand2016-09-122-14/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Normally, using a non-linear tiling format helps improve cache locality by ensuring that neighboring pixels are usually close-by in memory. For RGB formats, this still sort-of holds, but it can also lead to rather terrible memory access patterns where a single RGB pixel value crosses a tile boundary and gets split into two pieces in different 4K pages. It also makes for some rather awkward calculations because your tile size is no longer an even multiple of surface element size. For these reasons, we chose to simply never create tiled RGB images in the Vulkan driver. The GL driver, however, is not so kind so we need to support it somehow. I briefly toyed with a couple of different schemes but this is the best one I could come up with. The fundamental problem is that a tile no longer contains an integer number of surface elements. I briefly considered a couple other options but found them wanting: 1) Using floats for the logical tile size. This leads to potential rounding error problems. 2) When presented with a RGB format, just make the tile 3-times as wide. This isn't so nice because now our tiles are no longer power-of-two size. Also, it can force the row_pitch to be larger than needed which, while not strictly a problem for ISL, causes incompatibility problems with the way the GL driver chooses surface pitches. The chosen method requires that you pay attention and not just assume that your tile_info is in the units you think it is. However, it's nice because it provides a nice "these are the units" declaration in isl_tile_info itself. Previously, the tile_info wasn't usable as a stand-alone structure because you had to also know the format. It also forces figuring out how to deal with inconsistencies between tiling and format back to the caller which is good because the two different consumers of isl_tile_info really want to deal with it differently: Computation of the surface size wants the fewest number of horizontal tiles possible while get_intratile_offset is far more concerned with things aligning nicely. Signed-off-by: Jason Ekstrand <[email protected]> Acked-by: Chad Versace <[email protected]>
* intel/isl: Allow valign2 for texture-only Y-tiled surfaces on gen7Jason Ekstrand2016-09-121-1/+2
| | | | | | | | | | The restriction that Y-tiled surfaces must have valign == 4 only aplies to render targets but we were applying it universally. This causes problems if ISL_FORMAT_R32G32B32_FLOAT is used because it requires valign == 2; this should be okay because you can't render to that format. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* intel/blorp: Work in terms of logical array layersJason Ekstrand2016-09-122-16/+29
| | | | | | | | | | | | | | | | | | When Ivy Bridge introduced array multisampling, someone made the decision to do lots of stuff throughout the driver in terms of physical array layers rather than logical array layers. In ISL, we use logical array layers most of the time and it really makes no sense to use physical array layers in the blorp API. Every time someone passes physical array layers into blorp for an array multisampled surface, they're always divisible by the number of samples and we divide right away. Eventually, I'd like to rework most of the GL driver internals to use logical array layers but that's going to be a big project and will probably happen as part of the ISL conversion. For now, we'll do the conversion in brw_blorp and let blorp just use the logical layers. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Increase the presision of coordinate transform calculationsJason Ekstrand2016-09-121-3/+3
| | | | | | | | | | | | The result of this calculation goes into an fma() in the shader and we would like it to be as precise as possible. The division in particular was a source of imprecision whenever dst1 - dst0 was not a power of two. This prevents regressions in some of the new Vulkan CTS tests for blitting using a filtering of NEAREST. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/blorp: Add a swizzle parameter to blorp_clearJason Ekstrand2016-09-123-4/+9
| | | | | | | | While we're here, we also re-arrange the parameters to better match the parameter order of blorp_blit. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Make color_write_disable const and optionalJason Ekstrand2016-09-122-6/+8
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Add support for clearing R9G9B9E5 surfacesJason Ekstrand2016-09-121-0/+8
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Add support for RGB destinations in copiesJason Ekstrand2016-09-122-0/+69
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Add an entrypoint for doing bit-for-bit copiesJason Ekstrand2016-09-122-0/+143
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Pull the guts of blorp_blit into a helperJason Ekstrand2016-09-121-130/+147
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Stop using the X/YOffset field of RENDER_SURFACE_STATEJason Ekstrand2016-09-123-9/+89
| | | | | | | | | | | | | While it can be useful, the field has substantial limtations. In particular, the bittom 2 or 3 bits is missing so your offset always has to be a multiple of 4 or 8. While surface alignments usually work out to make this ok, when you start trying to fake compressed surfaces as uncompressed (which we will want to do) this falls apart. The easiest solution is to simply align all offsets to a tile boundary and munge the regions we're copying to account for the intratile offset. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Use fake_interleaved_msaa in retile_w_to_yJason Ekstrand2016-09-121-3/+1
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Use isl_get_interleaved_msaa_px_size_saJason Ekstrand2016-09-121-28/+6
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Add a helper for getting the size of an interleaved pixelJason Ekstrand2016-09-122-5/+20
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Handle 3D surfaces in convert_to_single_sliceJason Ekstrand2016-09-121-5/+11
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Fix an assert in get_intratile_offset_saJason Ekstrand2016-09-121-1/+1
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Fix the early return condition in convert_to_single_sliceJason Ekstrand2016-09-121-1/+6
| | | | | | | | | | | | | | The convert_to_single_slice operation is *mostly* idempotent. The only non-repeatable thing it does is that, when it sets the intratile offset fields, it just overwrites them instead of doing a += operation. This is supposed to be ok because we have an early return at the top that should make it bail of the surface is already a single slice. Unfortunately, the if condition has been broken ever since it was first added in 96fa98c18. This commit fixes the condition and adds an assert to ensure we don't stomp any non-zero intratile offsets. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Use the surface format for computing offsetsJason Ekstrand2016-09-121-1/+1
| | | | | | | | | If we use the view format, it may be an uncompressed view of a compressed image which throws things off. Since we're computing offsets of images, we want the actual surface offset anyway. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Don't assume R8_UINT in convert_to_single_sliceJason Ekstrand2016-09-121-1/+1
| | | | | | | We're going to use it for more than just stencil textures Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Take a destination swizzle in blorp_blitJason Ekstrand2016-09-123-2/+4
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Take an isl_swizzle instead of a SWIZZLEJason Ekstrand2016-09-123-29/+30
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Add an isl_swizzle structure and use it for isl_view swizzlesJason Ekstrand2016-09-128-64/+48
| | | | | | | | | This should be more compact than the enum isl_channel_select[4] that we were using before. It's also very convenient because we already had such a structure in the Vulkan driver we just needed to pull it over. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* st/mesa: fix is_scissor_enabled when X/Y are negativeIlia Mirkin2016-09-121-4/+6
| | | | | | | | | | | | | | Similar to commit 49c24d8a24 ("i965: fix noop_scissor range issue on width/height") - take the X/Y into account to determine whether the scissor covers the whole area or not. Fixes the recently-added gl-1.0-scissor-depth-clear-negative-xy piglit test. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Cc: <[email protected]>
* android: add support for libmesa_amdgpu_addrlibMauro Rossi2016-09-136-6/+83
| | | | | | | | | | | Android porting of the following commits: f1f1ba3 "radeonsi: move sid.h/r600d_common.h to a common place." 69fca64 "amd/addrlib: move addrlib from amdgpu winsys to common code" This patch fixes android building errors Reviewed-by: Dave Airlie <[email protected]>
* u_endian: add android to glibc clauseDave Airlie2016-09-131-1/+1
| | | | Tested-by: Mauro Rossi <[email protected]>
* Revert "i965: Drop the maximum 3D texture size to 512 on Sandy Bridge"Jason Ekstrand2016-09-121-10/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 6ba88bce64b343761aabe3a6c7ee285c6020a959. The commit was erroneous because GL has a separate limit, GL_MAX_FRAMEBUFFER_LAYERS which guards the number of layers you are allowed to render into. The GL 4.5 spec says: "The framebuffer attachment point attachment is said to be framebuffer attachment complete if [...] all of the following conditions are true: [...] If image is a three-dimensional, one- or two-dimensional array, or cube map array texture and the attachment is layered, the depth or layer count of the texture is less than or equal to the value of the implementation-dependent limit MAX_FRAMEBUFFER_LAYERS." and goes on to say that "framebuffer complete" requires all attachments to be "framebuffer attachment complete". On Sandy Bridge, we set GL_MAX_FRAMEBUFFER_LAYERS to 512 so creating a 3D texture bigger than 512 is fine; you just can't render into all of the slices at once. Fixes ES3-CTS.gtf.GL3Tests.npot_textures.npot_tex_image on Sandy Bridge Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* intel/blorp: Handle the 512 layers restriction on Sandy BridgeJason Ekstrand2016-09-122-4/+19
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/isl: Treat 3-D textures as 2-D arrays for renderingJason Ekstrand2016-09-122-4/+13
| | | | | | | | | | In particular, this means that isl_view::base_array_layer and isl_view::array_len get applied to 3-D textures but only when rendering. We were already applying isl_view::base_array_layer for rendering into 3-D textures so this isn't a huge deviation. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* aubinator: Simplify gen_disasm_create()'s devinfo handlingSirisha Gandikota2016-09-121-7/+1
| | | | | | | | | | Copy the whole devinfo structure instead of just few fields (Ken) Earlier, copied only couple of fields which added more code. So, simplify code by copying the whole structure. Signed-off-by: Sirisha Gandikota <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* aubinator: Fix compiler warningSirisha Gandikota2016-09-121-1/+1
| | | | | | | Add 'const' qualifier to gen_field_iterator::p pointer (Ken) Signed-off-by: Sirisha Gandikota <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* st/va: also honors interlaced preference when providing a video formatJulien Isorce2016-09-121-17/+19
| | | | | | | | | | | | | | | This fixes a crash when using the prefered video format with vaapisink on Nvidia hardwares. Also caught by the following assert: nouveau_vp3_video.c:91: Assertion `templat->interlaced' failed. TEST= gst-launch-1.0 videotestsrc ! video/x-raw, format=NV12 ! vaapisink Cc: <[email protected]> Signed-off-by: Julien Isorce <[email protected]> Tested-by: Víctor Manuel Jáquez Leal <[email protected]> Tested-by: Boyuan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]>
* tgsi: document semantics for compute shadersSamuel Pitoiset2016-09-121-0/+28
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: Enable OES/EXT_tessellation_shader for ES 3.1 + ARB_tess drivers.Kenneth Graunke2016-09-121-4/+4
| | | | | | | | | Drivers which support ARB_tessellation_shader and ES 3.1 now will expose OES_tessellation_shader and EXT_tessellation_shader as well. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* radeonsi: don't preload constants at the beginning of shadersMarek Olšák2016-09-121-20/+11
| | | | | | | | | | | | | | | | | | | | | | LLVM can CSE the loads, thus we can always re-load constants before each use. The decrease in SGPR spilling is huge. The best improvements are the dumbest ones. 26011 shaders in 14651 tests Totals: SGPRS: 1453346 -> 1251920 (-13.86 %) VGPRS: 742576 -> 728421 (-1.91 %) Spilled SGPRs: 52298 -> 16644 (-68.17 %) Spilled VGPRs: 397 -> 369 (-7.05 %) Scratch VGPRs: 1372 -> 1344 (-2.04 %) dwords per thread Code Size: 36136488 -> 36001064 (-0.37 %) bytes LDS: 767 -> 767 (0.00 %) blocks Max Waves: 219315 -> 222221 (1.33 %) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* intel/blorp: Add a TODO fileJason Ekstrand2016-09-121-0/+16
| | | | | | | This provides a nice little place to share notes on what still needs to be done and/or would be nice to have in BLORP. Signed-off-by: Jason Ekstrand <[email protected]>
* i965: check for GL_TEXTURE_EXTERNAL_OES at miptree_create_for_teximageAlejandro Piñeiro2016-09-121-0/+1
| | | | | | | | | | | | | | | Forgotten on commit "i965: Fix calculation of the image height at start level". Thanks to Ilia Mirkin for point it. Fixes the following regressions on Haswell and Broadwell: ES2-CTS.gtf.GL2ExtensionTests.egl_image_external.TestSimpleUnassociated (crash back to pass) ES2-CTS.gtf.GL2ExtensionTests.egl_image_external.TestSimple (crash back to fail) ES2-CTS.gtf.GL2ExtensionTests.egl_image_external.TestVertexShader (crash back to fail) https://bugs.freedesktop.org/show_bug.cgi?id=97761 Reviewed-by: Jason Ekstrand <[email protected]>
* gbm: fix potential NULL deref of mapImage/unmapImage.Chuanbo Weng2016-09-121-2/+3
| | | | | | | | | The mapImage/unmapImage functions of DRIimage extension can be NULL, so we should add additional check for them. Cc: <[email protected]> Signed-off-by: Chuanbo Weng <[email protected]> Reviewed-by: Emil Velikov <[email protected]>