path: root/src
Commit message | Author | Age | Files | Lines
* v3d: Merge the V3D 4.1 and 4.2 XML into V3D 3.3's XML. | Eric Anholt | 2018-06-29 | 7 | -2162/+607
    The XML ends up noisier if you're only looking at one version, but from the diffstat there are obvious wins in terms of deduplication. This will get even more significant if we ever support 3.2 or 4.0.
* v3d: Switch v3d_decoder.c to the XML's top min_ver/max_ver fields. | Eric Anholt | 2018-06-29 | 5 | -6/+14
    The XML zipper wants one XML per version for filling out its tables, but we want to do more than one GPU version per XML now. Assume that the "gen" field will be the same as min_ver and look up our XML text assuming that they're listed in increasing min_ver.
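The lookup described in this commit message can be sketched as follows. This is a minimal illustration, not the code in v3d_decoder.c: the table contents, the version numbers, and the name `xml_for_version` are all hypothetical. It only assumes what the message states, namely that entries are listed in increasing min_ver.

```python
# Hypothetical table of (min_ver, xml_text) pairs, sorted by increasing
# min_ver. The real decoder's table and contents are not shown in this log.
GENXML_FILES = [
    (21, "<vcxml gen='2.1'/>"),
    (33, "<vcxml gen='3.3'/>"),
]


def xml_for_version(files, ver):
    """Return the XML text whose min_ver is the largest one <= ver.

    Relies on `files` being sorted by increasing min_ver, so the scan can
    stop at the first entry that is too new. Returns None if no entry fits.
    """
    chosen = None
    for min_ver, text in files:
        if min_ver > ver:
            break
        chosen = text
    return chosen
```

With this table, a request for version 42 falls back to the entry whose min_ver is 33, which is the behaviour one merged XML covering several versions would need.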
* v3d: Create XML fields for min_ver and max_ver of a packet/struct/enum. | Eric Anholt | 2018-06-29 | 2 | -2/+82
    This will be used to merge together the V3D 3.3-4.1 XML with the variants disabled based on the version.
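As a rough sketch of how min_ver/max_ver could disable variants for a given version: the XML fragment below, the packet names, and the helper names are invented for illustration. Only the min_ver and max_ver attribute names come from the commit title, and the encoding of versions as integers such as 33 or 41 is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; only the min_ver/max_ver attribute names are taken
# from the commit title. Everything else is made up for this example.
SAMPLE_XML = """
<vcxml>
  <packet name="Flush" code="4"/>
  <packet name="Example Cfg" code="120" max_ver="33"/>
  <packet name="Example Cfg" code="120" min_ver="41"/>
</vcxml>
"""


def in_version(elem, ver):
    """True if elem applies to GPU version `ver` (assumed encoding: 33, 41, ...)."""
    min_ver = elem.get("min_ver")
    max_ver = elem.get("max_ver")
    if min_ver is not None and ver < int(min_ver):
        return False
    if max_ver is not None and ver > int(max_ver):
        return False
    return True


def packets_for_version(xml_text, ver):
    """Return the packet elements that are enabled for `ver`."""
    root = ET.fromstring(xml_text)
    return [p for p in root.iter("packet") if in_version(p, ver)]
```

An element with neither attribute is treated as valid for every version, so unchanged packets need no annotation.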
* v3d: Pass the version being generated to the pack generator script. | Eric Anholt | 2018-06-29 | 4 | -20/+22
    It turns out that most V3D versions change very few packets, so keeping separate copies of the XML per version makes changing the XML a pain as you have to replicate your changes to each one. This is the start of changing it so that one XML can generate headers for multiple versions.
* anv: finish the binding_table_pool on destroyDevice when use_softpin | Jose Maria Casanova Crespo | 2018-06-29 | 1 | -0/+3
    Running VK-CTS in batch execution mode was raising the VK_ERROR_INITIALIZATION_FAILED error in multiple tests. But when the same failing tests were run in isolation they always passed.

    createDevice and destroyDevice were called before and after every test. Because the binding_table_pool was never closed, we reached the maximum number of open file descriptors (ulimit -n), and once that happened every call to createDevice failed with VK_ERROR_INITIALIZATION_FAILED.

    Fixes: c7db0ed4e94dce563d722e1b098684fbd7315d51 ("anv: Use a separate pool for binding tables when soft pinning")
    Reviewed-by: Jason Ekstrand <[email protected]>
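The leak pattern described above can be modelled with a toy example. This is not the anv code: `ToyPool` and `ToyDevice` are hypothetical stand-ins, and the only assumption carried over from the message is that each pool holds a file descriptor that must be released when the device is destroyed.

```python
import os


class ToyPool:
    """Hypothetical stand-in for a pool that holds one file descriptor."""

    def __init__(self):
        self.fd = os.open(os.devnull, os.O_RDONLY)

    def finish(self):
        # Release the descriptor exactly once.
        if self.fd is not None:
            os.close(self.fd)
            self.fd = None


class ToyDevice:
    """Hypothetical device with an extra pool when soft pinning is used."""

    def __init__(self, use_softpin):
        self.use_softpin = use_softpin
        self.state_pool = ToyPool()
        self.binding_table_pool = ToyPool() if use_softpin else None

    def destroy(self):
        # The conditional pool must be finished too; skipping this branch
        # leaks one descriptor per create/destroy cycle.
        if self.use_softpin:
            self.binding_table_pool.finish()
        self.state_pool.finish()
```

Creating and destroying a `ToyDevice(True)` in a long loop stays within the descriptor limit only because `destroy()` finishes both pools, which mirrors why the failures appeared in batch runs and not in isolated ones.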
* gallium/util: remove dummy function util_format_is_supported | Marek Olšák | 2018-06-29 | 16 | -57/+6
    Reviewed-by: Eric Engestrom <[email protected]>
* nv50/ir: improve maintainability of Target*::initOpInfo() | Rhys Perry | 2018-06-29 | 2 | -23/+28
    This is mainly useful for when one needs to add new opcodes in a painless and reliable way.

    Signed-off-by: Rhys Perry <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Karol Herbst <[email protected]>
    Signed-off-by: Karol Herbst <[email protected]>
* nv50/ir: fix image stores with indirect handles | Rhys Perry | 2018-06-29 | 1 | -4/+5
    Having this if statement here prevented the next if statement from being reached in the case of image stores, which is needed for instructions with indirect bindless handles like "STORE TEMP[ADDR[2].x+1](1) ...".

    Signed-off-by: Rhys Perry <[email protected]>
    Reviewed-by: Karol Herbst <[email protected]>
    Signed-off-by: Karol Herbst <[email protected]>
* egl: fix build race in automake | Ross Burton | 2018-06-29 | 1 | -0/+1
    There is a parallel make build issue in src/egl/drivers/dri2/ for wayland builds. Can be reproduced with:

    $ rm src/egl/drivers/dri2/*.h src/egl/drivers/dri2/platform_wayland.lo
    $ make -C src/egl/ drivers/dri2/platform_wayland.lo
    ../../../mesa-18.1.2/src/egl/drivers/dri2/platform_wayland.c:50:10: fatal error: linux-dmabuf-unstable-v1-client-protocol.h: No such file or directory

    This patch adds the missing dependency.

    Fixes: 02cc359372773800de817 "egl/wayland: Use linux-dmabuf interface for buffers"
    Reviewed-by: Eric Engestrom <[email protected]>
    [Eric: fixed up the commit title]
    Signed-off-by: Eric Engestrom <[email protected]>
* radeonsi: implement vertex color clamping for tess and GS | Marek Olšák | 2018-06-28 | 4 | -33/+87
* radeonsi: move VS_STATE_SGPR before draw SGPRs | Marek Olšák | 2018-06-28 | 2 | -10/+13
    for vertex color clamping.
* radeonsi: don't use malloc in si_generate_gs_copy_shader | Marek Olšák | 2018-06-28 | 1 | -10/+2
* radeonsi: disable DCC statistics gathering on everything but Stoney | Marek Olšák | 2018-06-28 | 1 | -3/+2
    I think we don't need it on other chips.
* radeonsi: don't enable DCC statistics gathering for small surfaces | Marek Olšák | 2018-06-28 | 1 | -14/+16
* radeonsi: simplify logic around vi_separate_dcc_try_enable | Marek Olšák | 2018-06-28 | 2 | -14/+15
* radeonsi: fix memory exhaustion issue with DCC statistics gathering with DRI2 | Marek Olšák | 2018-06-28 | 1 | -3/+27
    Cc: 18.1 <[email protected]>
* radeonsi: remove references to Evergreen | Marek Olšák | 2018-06-28 | 4 | -4/+2
* radeonsi: enable shader caching for compute shaders | Marek Olšák | 2018-06-28 | 3 | -15/+50
    Compute shaders were not using the shader cache.
* radeonsi: store compute local_size into tgsi_shader_info | Marek Olšák | 2018-06-28 | 4 | -6/+10
    This is kinda a hack, but it's enough for the shader cache.
* radeonsi: unify duplicated code for initial shader compilation | Marek Olšák | 2018-06-28 | 3 | -43/+39
* ac: set +auto-waitcnt-before-barrier when needed | Marek Olšák | 2018-06-28 | 1 | -2/+4
    This removes useless s_waitcnt before barriers. Only radeonsi uses this function.
* radeonsi/gfx9: insert the barrier between merged shaders inside the if block | Marek Olšák | 2018-06-28 | 1 | -5/+13
* gallium: plumb invariant output attrib thru TGSI | Joe M. Kniss | 2018-06-29 | 6 | -15/+45
    Add support for the glsl 'invariant' modifier for output data declarations. Gallium drivers that use TGSI serialization currently lose invariant modifiers in glsl shaders.

    v2: use boolean for invariant instead of unsigned.

    Tested: chromiumos on qemu with virglrenderer.
    Reviewed-by: Marek Olšák <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* intel/fs: Build 32-wide FS shaders. | Francisco Jerez | 2018-06-28 | 1 | -11/+43
    Co-authored-by: Jason Ekstrand <[email protected]>
* intel/anv,blorp,i965: Implement the SKL 16x MSAA SIMD32 workaround | Jason Ekstrand | 2018-06-28 | 3 | -2/+49
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/fs: Add fields to wm_prog_data for SIMD32 dispatch | Jason Ekstrand | 2018-06-28 | 6 | -1/+15
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Fix nir_intrinsic_load_helper_invocation for SIMD32. | Francisco Jerez | 2018-06-28 | 1 | -5/+9
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Fix fs_builder::sample_mask_reg() for 32-wide FS dispatch. | Francisco Jerez | 2018-06-28 | 1 | -3/+3
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Fix Gen6+ interpolation setup for SIMD32 | Francisco Jerez | 2018-06-28 | 1 | -56/+60
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Get rid of MOV_DISPATCH_TO_FLAGS | Jason Ekstrand | 2018-06-28 | 5 | -35/+8
    We can just emit the MOV in the two places where we use this.

    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Emit MOV_DISPATCH_TO_FLAGS once for the centroid workaround | Jason Ekstrand | 2018-06-28 | 2 | -50/+16
    There's no reason for us to emit it a pile of times and then have a whole pass to clean it up. Just emit it once like we really want.

    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Generalize the unlit centroid workaround | Francisco Jerez | 2018-06-28 | 1 | -14/+8
    This generalizes the unlit centroid workaround so it's less code and now supports SIMD32.

    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Fix sample id setup for SIMD32. | Francisco Jerez | 2018-06-28 | 1 | -9/+25
    v2 (Jason Ekstrand):
     - Disallow gl_SampleId in SIMD32 on gen7

    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Fix Gen7 compressed source region alignment restriction for SIMD32 | Francisco Jerez | 2018-06-28 | 1 | -1/+7
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Implement 32-wide FS payload setup on Gen6+ | Francisco Jerez | 2018-06-28 | 1 | -67/+57
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Extend thread payload layout to SIMD32 | Francisco Jerez | 2018-06-28 | 3 | -22/+45
    And handle 32-wide payload register reads in fetch_payload_reg().

    v2 (Jason Ekstrand):
     - Fix some whitespace and brace placement

    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Wrap FS payload register look-up in a helper function. | Francisco Jerez | 2018-06-28 | 3 | -12/+23
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Use fs_regs instead of brw_regs in the unlit centroid workaround | Francisco Jerez | 2018-06-28 | 1 | -12/+12
    While we're here, we change to using horiz_offset() instead of abusing half().

    v2 (Jason Ekstrand):
     - Use horiz_offset() instead of half()

    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Simplify fs_visitor::emit_samplepos_setup | Francisco Jerez | 2018-06-28 | 1 | -21/+7
    The original code manually handled splitting the MOVs to 8-wide to handle various regioning restrictions. Now that we have a SIMD width splitting pass that handles these things, we can just emit everything at the full width and let the SIMD splitting pass handle it.

    We also now have a useful "subscript" helper which is designed exactly for the case where you want to take a W type and read it as a vector of Bs so we may as well use that too.

    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Add plumbing for shader time in 32-wide FS dispatch mode. | Francisco Jerez | 2018-06-28 | 7 | -5/+15
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Disable opt_sampler_eot() in 32-wide dispatch. | Francisco Jerez | 2018-06-28 | 2 | -1/+6
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Emit LINE+MAC for LINTERP with unaligned coordinates | Jason Ekstrand | 2018-06-28 | 2 | -10/+56
    On g4x through Sandy Bridge, src1 (the coordinates) of the PLN instruction is required to be an even register number. When it's odd (which can happen with SIMD32), we have to emit a LINE+MAC combination instead. Unfortunately, we can't just fall through to the gen4 case because the input registers are still set up for PLN which lays out the four src1 registers differently in SIMD16 than LINE.

    v2 (Jason Ekstrand):
     - Take advantage of both accumulators and emit LINE LINE MAC MAC (Based on a patch from Francisco Jerez)
     - Unify the gen4 and gen4x-6 cases using a loop

    v3 (Jason Ekstrand):
     - Don't unify gen4 with gen4x-6 as this turns out to be more fragile than first thought without reworking the gen4 barycentric coordinate layout.

    Reviewed-by: Matt Turner <[email protected]>
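The choice between PLN and LINE+MAC can be illustrated with a simplified scalar model. This is not the generator code from the commit: the coefficient layout (a, b, unused, c), the function names, and the reduction of each instruction to plain arithmetic are assumptions made for the example. It only shows the selection on register parity and that the two-step form accumulates to the same plane equation.

```python
def pln(coef, x, y):
    """Toy plane evaluation: a*x + b*y + c in a single step."""
    a, b, _unused, c = coef
    return a * x + b * y + c


def line(coef, x):
    """Toy LINE step: a*x + c, whose result is kept in the accumulator."""
    a, _b, _unused, c = coef
    return a * x + c


def mac(acc, coef, y):
    """Toy MAC step: accumulator + b*y."""
    return acc + coef[1] * y


def linterp(coef, x, y, coord_reg_nr):
    """Use the single-step form only when the coordinate register is even."""
    if coord_reg_nr % 2 == 0:
        return pln(coef, x, y)
    acc = line(coef, x)      # first instruction writes the accumulator
    return mac(acc, coef, y)  # second instruction consumes it
```

In this model the odd-register path needs two dependent steps linked through the accumulator, which is why the fallback costs more instructions than PLN.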
* intel/fs: Mark LINTERP opcode as writing accumulator on platforms without PLN | Jason Ekstrand | 2018-06-28 | 1 | -1/+2
    When we don't have PLN (gen4 and gen11+), we implement LINTERP as either LINE+MAC or a pair of MADs. In both cases, the accumulator is written by the first of the two instructions and read by the second. Even though the accumulator value isn't actually ever used from a logical instruction perspective, it is trashed so we need to make the scheduler aware. Otherwise, the scheduler could end up re-ordering instructions and putting a LINTERP between an instruction which writes the accumulator and another which tries to use that result.

    Cc: [email protected]
    Reviewed-by: Matt Turner <[email protected]>
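The hazard described above can be sketched with a toy dependency check. Nothing here is the actual scheduler: the `Inst` type, the opcode names other than LINTERP, and the `has_pln` flag are hypothetical, and the model tracks only the accumulator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    """Hypothetical instruction record; only accumulator effects are modelled."""
    name: str
    writes_acc: bool = False
    reads_acc: bool = False


def clobbers_acc(inst, has_pln):
    """LINTERP implicitly trashes the accumulator when PLN is unavailable."""
    if inst.name == "LINTERP" and not has_pln:
        return True
    return inst.writes_acc


def can_place_between(producer, consumer, candidate, has_pln):
    """May `candidate` be scheduled between an accumulator writer and its reader?"""
    if clobbers_acc(producer, has_pln) and consumer.reads_acc:
        return not clobbers_acc(candidate, has_pln)
    return True
```

For example, with a hypothetical accumulator writer and reader pair, a LINTERP may sit between them on hardware with PLN but not on hardware where it expands to a sequence that goes through the accumulator.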
* intel/fs: Rework INTERPOLATE_AT_PER_SLOT_OFFSET | Francisco Jerez | 2018-06-28 | 3 | -19/+9
    This reworks INTERPOLATE_AT_PER_SLOT_OFFSET to work more like an ALU operation and less like a send. This is less code over-all and, as a side-effect, it now properly handles execution groups and lowering so SIMD32 support just falls out.

    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Add the group to the flag subreg number on SNB and older | Jason Ekstrand | 2018-06-28 | 1 | -1/+7
    We want consistent behavior in the meaning of the flag_subreg field between SNB and IVB+.

    v2 (Jason Ekstrand):
     - Add some extra commentary

    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Fix FB read header setup for SIMD32. | Francisco Jerez | 2018-06-28 | 1 | -4/+13
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Fix logical FB write lowering for SIMD32 | Francisco Jerez | 2018-06-28 | 1 | -5/+20
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Fix FB write message control codegen for SIMD32. | Francisco Jerez | 2018-06-28 | 1 | -18/+34
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Don't enable dual source blend if no outputs are written | Francisco Jerez | 2018-06-28 | 1 | -1/+2
    This prevents a crash in some arb_enhanced_layouts tests that would be caused by the next commit.

    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Fix codegen of FS_OPCODE_SET_SAMPLE_ID for SIMD32. | Francisco Jerez | 2018-06-28 | 1 | -11/+13
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>