path: root/src
Commit message | Author | Age | Files | Lines
* radeon/uvd: fail to create a decoder if RUVD_MSG_CREATE submission fails | Marek Olšák | 2016-07-14 | 1 | -6/+9
| | | | | | This is the bare minimum for reporting the error to the user. Reviewed-by: Christian König <[email protected]>
* winsys/amdgpu: return an error on IB submission failures | Marek Olšák | 2016-07-14 | 2 | -1/+9
| | | | Reviewed-by: Christian König <[email protected]>
* gallium/radeon: add a return value to cs_flush | Marek Olšák | 2016-07-14 | 3 | -9/+13
| | | | | | Required by our UVD code. Reviewed-by: Christian König <[email protected]>
* glsl/types: Use _mesa_hash_data for hashing function types | Jason Ekstrand | 2016-07-14 | 1 | -14/+2
| | | | | | | | | | This is way better than the stupid string approach especially since you could overflow the string. Again, I thought I had something better at one point but it obviously got lost. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Cc: "12.0" <[email protected]>
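As a rough illustration of hashing a function type from its raw data instead of from a formatted string, the sketch below hashes the bytes of an array of type IDs. The `fn_sig` struct, `fn_sig_hash`, and the choice of FNV-1a are assumptions made for this example; the commit itself relies on Mesa's `_mesa_hash_data`, which is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified function signature: the return type followed by
 * the parameter types, each identified by a small integer ID. */
struct fn_sig {
    const uint32_t *type_ids; /* type_ids[0] is the return type */
    size_t count;             /* number of entries in type_ids */
};

/* FNV-1a over an arbitrary byte range (a stand-in for a real data hash). */
static uint32_t hash_bytes(const void *data, size_t size)
{
    const unsigned char *p = data;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < size; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Hash the signature directly from its bytes: no intermediate string is
 * built, so there is no fixed-size buffer that could overflow. */
static uint32_t fn_sig_hash(const struct fn_sig *sig)
{
    return hash_bytes(sig->type_ids, sig->count * sizeof(sig->type_ids[0]));
}
```

Because the hash consumes exactly `count` entries, its cost and memory use scale with the signature length rather than with a buffer size picked in advance.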
* glsl/types: Fix function type comparison function | Jason Ekstrand | 2016-07-14 | 1 | -1/+1
| | | | | | | | | | It was returning true if the function types have different lengths rather than false. This was new with the SPIR-V to NIR pass and I thought I'd fixed it a while ago but it may have gotten lost in rebasing somewhere. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Cc: "12.0" <[email protected]>
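The fix described in this message can be sketched as follows. The `fn_type` struct and `fn_type_equal` are hypothetical simplifications, not the actual code in glsl/types; the point is only that a length mismatch must yield false.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical function type: return type plus parameters as integer IDs. */
struct fn_type {
    const uint32_t *type_ids;
    size_t length;
};

static bool fn_type_equal(const struct fn_type *a, const struct fn_type *b)
{
    /* Types of different lengths can never match. The bug was returning
     * true at this point instead of false. */
    if (a->length != b->length)
        return false;

    for (size_t i = 0; i < a->length; i++) {
        if (a->type_ids[i] != b->type_ids[i])
            return false;
    }
    return true;
}
```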
* freedreno/a4xx: Fix sign compare warnings | [email protected] | 2016-07-14 | 1 | -7/+7
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: Fix sign compare warnings | [email protected] | 2016-07-14 | 1 | -7/+7
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a2xx: Fix sign compare warnings | [email protected] | 2016-07-14 | 1 | -4/+4
| | | | Signed-off-by: Rob Clark <[email protected]>
* radeon/vce: handle newly added parameters | Boyuan Zhang | 2016-07-14 | 1 | -13/+20
| | | | | | | Replace the previous hardcoded value with newly defined parameters Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]>
* st/omx: assign previous values to new structure | Boyuan Zhang | 2016-07-14 | 1 | -0/+10
| | | | | | | | Assign previously hardcoded values for OMX to newly defined structure. As a result, OMX behaviour will not change at all. Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]>
* vl: add parameters for VAAPI encode | Boyuan Zhang | 2016-07-14 | 1 | -0/+33
| | | | | | | | Allow specifying more parameters in the encoding interface; these were previously just hardcoded in the encoder. Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]>
* st/mesa: fix reference counting bug in st_vdpau | Christian König | 2016-07-14 | 1 | -2/+8
| | | | | | | | | Otherwise we leak the resources created for the DMA-buf descriptors. Signed-off-by: Christian König <[email protected]> Cc: 12.0 <[email protected]> Tested-and-Reviewed by: Leo Liu <[email protected]> Ack-by: Tom St Denis <[email protected]>
* vc4: Emit resets of the uniform stream at the starts of blocks. | Eric Anholt | 2016-07-13 | 9 | -0/+167
| | | | | | | | If a block might be entered from multiple locations, then the uniform stream will (probably) be at different points, and we need to make sure that it's pointing where we expect it to be. The kernel also enforces that any block reading a uniform resets uniforms, to prevent reading outside of the uniform stream by using looping.
* vc4: Add support for scheduling of branch instructions. | Eric Anholt | 2016-07-13 | 2 | -17/+114
| | | | For now we don't fill the delay slots, and instead just drop in NOPs.
* vc4: Move the QPU instructions to schedule into each block. | Eric Anholt | 2016-07-13 | 4 | -141/+180
| | | | We'll want to schedule them individually, to handle delay slots.
* vc4: Disable vc4_opt_vpm in the presence of control flow. | Eric Anholt | 2016-07-13 | 1 | -0/+5
| | | | | | It's a really valuable pass currently, but it will be a mess to rewrite for control flow. For now, just disable it if we have multiple blocks present.
* vc4: Convert vc4_opt_dead_code to work in the presence of control flow. | Eric Anholt | 2016-07-13 | 1 | -18/+29
| | | | | | | | | | | | With control flow, we can't be sure that we'll see the uses of a variable before its def as we walk backwards. Given that NIR is eliminating our long chains of dead code, a simple solution for now seems fine. This slightly changes the order of some optimizations, and so an opt_vpm happens before opt_dce, causing 3 dead MOVs to be turned into dead FMAXes in Minecraft: instructions in affected programs: 52 -> 54 (3.85%)
* vc4: Update copy propagation for control flow. | Eric Anholt | 2016-07-13 | 1 | -62/+137
| | | | | | | | | | | | | | Previously, we could assume that a MOV from a temp was always an available copy, because all temps were SSA in NIR, and their non-SSA state in QIR was just due to the fact that they were from a bcsel or pack_unorm_4x8, so we could use the current value of the temp after that series of QIR instructions to define it. However, this is no longer the case with control flow. Instead, we track a new array of MOVs defined within the block that haven't had their source or dest killed yet, and use that primarily. We fall back to looking through the QIR defs array to handle across-block MOVs, but now require that copies from the SSA defs have an SSA src as well.
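A minimal sketch of the block-local part of this idea: track which MOVs are still valid copies within a block and drop them when their source or destination is overwritten. The `inst` struct, `OP_*` opcodes, and `copy_propagate_block` are hypothetical and far simpler than QIR; the fallback through the defs array for across-block MOVs is not modeled.

```c
#include <stddef.h>

enum op { OP_MOV, OP_ADD };

/* Hypothetical instruction: temps are small integers in [0, MAX_TEMPS). */
struct inst {
    enum op op;
    int dst;
    int src[2]; /* OP_MOV only uses src[0] */
};

#define MAX_TEMPS 16

/* copy_of[t] is the temp that t currently mirrors, or -1 if no MOV into t
 * is still valid at this point in the block. */
static void copy_propagate_block(struct inst *insts, size_t n)
{
    int copy_of[MAX_TEMPS];

    for (int t = 0; t < MAX_TEMPS; t++)
        copy_of[t] = -1;

    for (size_t i = 0; i < n; i++) {
        struct inst *in = &insts[i];
        int nsrc = (in->op == OP_MOV) ? 1 : 2;

        /* Rewrite sources through any copy that is still valid. */
        for (int s = 0; s < nsrc; s++) {
            if (copy_of[in->src[s]] >= 0)
                in->src[s] = copy_of[in->src[s]];
        }

        /* Writing dst kills every copy whose source or dest is dst. */
        for (int t = 0; t < MAX_TEMPS; t++) {
            if (t == in->dst || copy_of[t] == in->dst)
                copy_of[t] = -1;
        }

        /* Record the new copy, ignoring self-moves. */
        if (in->op == OP_MOV && in->src[0] != in->dst)
            copy_of[in->dst] = in->src[0];
    }
}
```

Resetting `copy_of` at the start of each call reflects the constraint in the message: nothing learned in one block is assumed to hold on entry to another.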
* i965/fs: emit DIM instruction to load 64-bit immediates in HSW | Samuel Iglesias Gonsálvez | 2016-07-14 | 1 | -0/+10
| | | | | | | | v2 (Matt): - Use brw_imm_df() as source argument of DIM instruction. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/eu: set DF imm value to the source of DIM | Samuel Iglesias Gonsálvez | 2016-07-14 | 1 | -1/+2
| | | | | | | | | | | | | | | According to HSW's PRM, vol02b, the DIM instruction has the following restriction: "Restriction : src0 must be immediate. src0 must specify the :f (F, Float) type encoding but is an immediate 64-bit DF (Double Float) value. dst must have type DF." This commit allows uploading the immediate 64-bit DF value to the source of a DIM instruction even when the source uses the float type encoding. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: enable the emission of the DIM instruction | Samuel Iglesias Gonsálvez | 2016-07-14 | 10 | -2/+23
| | | | | | | | | | v2 (Matt): - Take a DF source argument for the DIM instruction emission in the visitors. - Indentation. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* anv: Add a stub for CmdCopyQueryPoolResults on Ivy Bridge | Jason Ekstrand | 2016-07-13 | 1 | -0/+13
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Cc: "12.0" <[email protected]>
* i965: fix compiler warnings for 32bit build | Timothy Arceri | 2016-07-14 | 2 | -26/+26
| | | | Reviewed-by: Matt Turner <[email protected]>
* Revert "gallium: Force blend color to 16-byte alignment" | Tim Rowley | 2016-07-13 | 1 | -11/+1
| | | | | | | | | | | | | | This reverts commit d8d6091a846ac2a40a011d512d6d57f6c8442e6a. Heap allocations may be only 8-byte aligned on 32-bit system, and so having members with 16-byte alignment (such as in the case where pipe_blend_color is embedded in radeonsi's si_context) is undefined behavior which indeed causes crashes when compiled with gcc -O3. Cc: <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96835 Signed-off-by: Tim Rowley <[email protected]> Acked-by: Chuck Atkins <[email protected]>
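To make the alignment hazard concrete, here is a small C11 sketch. The `blend_color16` and `ctx` structs are hypothetical stand-ins for `pipe_blend_color` and `si_context`, and allocating with `aligned_alloc` is shown only as one way to satisfy an over-aligned member; the commit instead removes the 16-byte alignment requirement.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical blend color with a forced 16-byte alignment. */
struct blend_color16 {
    alignas(16) float color[4];
};

/* Embedding the member raises the alignment of the whole context struct. */
struct ctx {
    int flags;
    struct blend_color16 blend;
};

static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* A plain malloc/calloc only guarantees alignment suitable for the
 * fundamental types, which may be 8 bytes on 32-bit systems, so an
 * over-aligned struct needs an explicitly aligned allocation.
 * sizeof(struct ctx) is a multiple of its alignment, as aligned_alloc
 * requires. */
static struct ctx *ctx_create(void)
{
    return aligned_alloc(alignof(struct ctx), sizeof(struct ctx));
}
```

With an ordinary heap allocation the compiler may still emit aligned vector stores to `blend`, which is the kind of undefined behavior the message reports under gcc -O3.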
* isl/state: Add support for handling auxiliary surfaces | Jason Ekstrand | 2016-07-13 | 2 | -2/+48
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* isl: Add an auxiliary surface usage enum | Jason Ekstrand | 2016-07-13 | 1 | -0/+26
| | | | Reviewed-by: Chad Versace <[email protected]>
* isl: Add support for color control surfaces | Jason Ekstrand | 2016-07-13 | 6 | -0/+102
| | | | Reviewed-by: Chad Versace <[email protected]>
* isl: Add support for multisample compression surfaces | Jason Ekstrand | 2016-07-13 | 3 | -0/+15
| | | | Reviewed-by: Chad Versace <[email protected]>
* isl: Add support for HiZ surfaces | Jason Ekstrand | 2016-07-13 | 7 | -3/+63
| | | | Reviewed-by: Chad Versace <[email protected]>
* isl: Kill off isl_format_layout::bs | Jason Ekstrand | 2016-07-13 | 7 | -22/+21
| | | | Reviewed-by: Chad Versace <[email protected]>
* isl: Take bpb rather than bs in tiling_get_info | Jason Ekstrand | 2016-07-13 | 2 | -6/+6
| | | | Reviewed-by: Chad Versace <[email protected]>
* isl: Use bpb in a few places where it's more natural than bs | Jason Ekstrand | 2016-07-13 | 5 | -8/+8
| | | | Reviewed-by: Chad Versace <[email protected]>
* isl: Use bpb for determining YUV image padding | Jason Ekstrand | 2016-07-13 | 1 | -1/+1
| | | | | | | When we initially dropped bpb in favor of bs, we accidentally didn't change this one line properly. This brings it back to what it should be. Reviewed-by: Chad Versace <[email protected]>
* isl: Bring back isl_format_layout::bpb | Jason Ekstrand | 2016-07-13 | 2 | -2/+4
| | | | | | | | A while ago we got rid of the bits-per-block because we thought we didn't need it. We're about to introduce some very useful 1 and 2-bit formats so we really should be able to handle them again. Reviewed-by: Chad Versace <[email protected]>
* isl: Change the physical size of a W-tile to 128x32 | Jason Ekstrand | 2016-07-13 | 4 | -19/+15
| | | | Reviewed-by: Chad Versace <[email protected]>
* isl: Rework the way we define tile sizes. | Jason Ekstrand | 2016-07-13 | 2 | -81/+137
| | | | | | | | | | | This is based on a very long set of discussions between Chad and myself about how we should properly represent HiZ and CCS buffers. The end result of that discussion was that a tiling actually has two different sizes, a logical size in elements, and a physical size in bytes and rows. This commit reworks ISL's pitch and size calculations to work in terms of these two sizes. Reviewed-by: Chad Versace <[email protected]>
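The two-size view of a tiling can be sketched like this. The `tile_info` struct and both helpers are hypothetical, not ISL's actual definitions, and the example values in the usage note (a 128-byte by 32-row tile holding 32x32 elements of 32 bits each) are an assumption chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical tile description with both sizes from the message:
 * a logical size in elements and a physical size in bytes and rows. */
struct tile_info {
    uint32_t logical_w_el; /* logical width in surface elements */
    uint32_t logical_h_el; /* logical height in surface elements */
    uint32_t phys_w_B;     /* physical width in bytes */
    uint32_t phys_h_rows;  /* physical height in rows */
};

static uint32_t tile_div_round_up(uint32_t n, uint32_t d)
{
    return (n + d - 1) / d;
}

/* Row pitch: count tiles across in logical units, convert to bytes using
 * the physical tile width. */
static uint32_t surf_row_pitch_B(const struct tile_info *t, uint32_t width_el)
{
    return tile_div_round_up(width_el, t->logical_w_el) * t->phys_w_B;
}

/* Total size: tile rows counted logically, then converted to physical rows
 * and multiplied by the byte pitch. */
static uint64_t surf_size_B(const struct tile_info *t,
                            uint32_t width_el, uint32_t height_el)
{
    uint32_t rows = tile_div_round_up(height_el, t->logical_h_el) *
                    t->phys_h_rows;

    return (uint64_t)surf_row_pitch_B(t, width_el) * rows;
}
```

For a 100x50-element surface with the assumed tile `{32, 32, 128, 32}`, this gives a 512-byte pitch over 64 rows. Keeping the logical and physical sizes separate lets the two differ, as they do for the HiZ and CCS buffers that motivated the rework.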
* isl: Rework the way we handle surface padding | Jason Ekstrand | 2016-07-13 | 1 | -27/+25
| | | | Reviewed-by: Chad Versace <[email protected]>
* isl: Use ARRAY_PITCH_SPAN_FULL for depth/stencil surfaces on gen7 | Jason Ekstrand | 2016-07-13 | 1 | -1/+1
| | | | | | | We helpfully inserted a PRM quotation about how we need to use ARRAY_PITCH_SPAN_FULL and then set it to COMPACT. Oops... Reviewed-by: Chad Versace <[email protected]>
* isl: Stop multiplying height by block size | Jason Ekstrand | 2016-07-13 | 1 | -2/+2
| | | | | | | | The row pitch already specifies the size of a row of elements. Multiplying by the block height simply causes us to allocate as much as 12 times more memory than needed for compressed textures. Reviewed-by: Chad Versace <[email protected]>
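The arithmetic behind this fix can be sketched as below. `block_fmt` and the helper names are hypothetical and the code is not ISL's; it only shows that once the pitch is per row of blocks, the size is the pitch times the number of block rows, with no further factor of the block height.

```c
#include <stdint.h>

/* Hypothetical block-compressed format description. */
struct block_fmt {
    uint32_t bw;              /* block width in pixels */
    uint32_t bh;              /* block height in pixels */
    uint32_t bytes_per_block; /* size of one compressed block */
};

static uint32_t ceil_div(uint32_t n, uint32_t d)
{
    return (n + d - 1) / d;
}

/* Bytes in one row of blocks (one row of elements). */
static uint32_t row_pitch_B(const struct block_fmt *f, uint32_t width_px)
{
    return ceil_div(width_px, f->bw) * f->bytes_per_block;
}

/* Correct size: pitch times the number of block rows. Multiplying by
 * f->bh on top of this would over-allocate by a factor of the block
 * height. */
static uint64_t surface_size_B(const struct block_fmt *f,
                               uint32_t width_px, uint32_t height_px)
{
    uint32_t block_rows = ceil_div(height_px, f->bh);

    return (uint64_t)row_pitch_B(f, width_px) * block_rows;
}
```

For a 64x64 texture with 4x4 blocks of 8 bytes, the pitch is 128 bytes over 16 block rows, or 2048 bytes in total; the extra multiplication would have requested 8192. A format with 12-pixel-high blocks gives the factor of 12 mentioned in the message.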
* isl: Get rid of tiling_get_extent | Jason Ekstrand | 2016-07-13 | 2 | -17/+0
| | | | | | It was unused Reviewed-by: Chad Versace <[email protected]>
* nir/spirv: Don't multiply the push constant block size by 4 | Jason Ekstrand | 2016-07-13 | 1 | -1/+1
| | | | | | | | | I have no idea why we were multiplying by 4 before. The offsets we get from SPIR-V are in bytes and so is nir->num_uniforms so there's no need to do any adjustment whatsoever. Signed-off-by: Jason Ekstrand <[email protected]> Cc: "12.0" <[email protected]>
* anv/pipeline: Assert that the number of uniforms from NIR fits | Jason Ekstrand | 2016-07-13 | 1 | -0/+1
|
* radeonsi: report accurate SGPR and VGPR spills | Marek Olšák | 2016-07-13 | 2 | -5/+15
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a workaround for a compute VGPR-usage LLVM bug | Marek Olšák | 2016-07-13 | 1 | -0/+35
| | | | | | | v2: use abort(), describe which LLVM version is affected Cc: 12.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use LLVMGetTypeKind to tell if an input is an array of descriptors | Marek Olšák | 2016-07-13 | 1 | -19/+11
| | | | | | just a cleanup Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: replace !tbaa with !invariant.load | Marek Olšák | 2016-07-13 | 1 | -12/+5
| | | | | | no change in generated code thanks to dereferenceable(n) Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set dereferenceable attribute on descriptor arrays | Marek Olšák | 2016-07-13 | 1 | -4/+11
| This allows moving the loads arbitrarily in the Sinking pass.
|
| 26002 shaders in 14643 tests
| Totals:
| SGPRS: 2080160 -> 2080160 (0.00 %)
| VGPRS: 798875 -> 797826 (-0.13 %)
| Spilled SGPRs: 108485 -> 79165 (-27.03 %)
| Spilled VGPRs: 327 -> 327 (0.00 %)
| Scratch VGPRs: 1656 -> 1652 (-0.24 %) dwords per thread
| Code Size: 36127192 -> 35559780 (-1.57 %) bytes
| LDS: 767 -> 767 (0.00 %) blocks
| Max Waves: 212464 -> 212672 (0.10 %)
| Wait states: 0 -> 0 (0.00 %)
|
| PERCENTAGES / App        Shaders  SGPRs     VGPRs  SpillSGPR  SpillVGPR   Scratch  CodeSize  MaxWaves  Waits
| (unknown)                      4      .         .          .          .         .         .         .      .
| 0ad                            6      .         .          .          .         .         .         .      .
| alien_isolation             2938      .    0.04 %    -8.53 %          .         .   -0.71 %   -0.06 %      .
| anholt                        10      .         .          .          .         .         .         .      .
| batman_arkham_origins        589      .   -0.58 %   -79.54 %          .         .   -6.72 %    0.57 %      .
| bioshock-infinite           1769      .   -0.65 %   -89.32 %          .         .   -4.73 %    0.48 %      .
| borderlands2                3968      .   -0.31 %   -51.21 %          .         .   -4.09 %    0.22 %      .
| brutal-legend                338      .   -0.03 %    -2.95 %          .         .   -0.06 %         .      .
| civilization_beyond..        116      .         .   -14.17 %          .         .   -0.88 %         .      .
| counter_strike_glob..       1142      .         .          .          .         .         .         .      .
| dirt-showdown                541      .   -0.56 %   -40.14 %          .   -3.45 %   -1.82 %    0.35 %      .
| dolphin                       22      .         .          .          .         .    0.16 %         .      .
| dota2                       1747      .         .          .          .         .    0.01 %         .      .
| europa_universalis_4          76      .   -0.23 %   -42.11 %          .         .   -0.96 %         .      .
| f1-2015                      774      .   -0.09 %   -28.89 %          .         .   -2.60 %    0.09 %      .
| furmark-0.7.0                  4      .         .          .          .         .         .         .      .
| gimark-0.7.0                  10      .         .          .          .         .         .         .      .
| glamor                        16      .         .          .          .         .         .         .      .
| humus-celshading               4      .         .          .          .         .         .         .      .
| humus-domino                   6      .         .          .          .         .         .         .      .
| humus-dynamicbranching        24      .    0.71 %          .          .         .    0.29 %   -0.45 %      .
| humus-hdr                     10      .         .          .          .         .         .         .      .
| humus-portals                  2      .         .          .          .         .         .         .      .
| humus-volumetricfog..          6      .         .          .          .         .         .         .      .
| left_4_dead_2               1762      .         .          .          .         .         .         .      .
| metro_2033_redux            2670      .   -0.10 %    -7.15 %          .         .   -0.03 %         .      .
| nexuiz                        80      .         .          .          .         .         .         .      .
| pixmark-julia-fp32             2      .         .          .          .         .         .         .      .
| pixmark-julia-fp64             2      .         .          .          .         .         .         .      .
| pixmark-piano-0.7.0            2      .         .          .          .         .         .         .      .
| pixmark-volplosion-..          2      .         .          .          .         .         .         .      .
| plot3d-0.7.0                   8      .         .          .          .         .         .         .      .
| portal                       474      .         .          .          .         .         .         .      .
| sauerbraten                    7      .         .          .          .         .         .         .      .
| serious_sam_3_bfe            392      .         .   -13.20 %          .         .   -1.81 %         .      .
| supertuxkart                   4      .         .          .          .         .         .         .      .
| talos_principle              324      .   -0.21 %   -18.39 %          .         .   -2.73 %    0.14 %      .
| team_fortress_2              808      .         .          .          .         .         .         .      .
| tesseract                    430      .    0.08 %   -68.57 %          .         .   -0.45 %         .      .
| tessmark-0.7.0                 6      .         .          .          .         .         .         .      .
| thea                         172      .         .          .          .         .    0.03 %         .      .
| ue4_effects_cave             299      .   -0.04 %   -10.15 %          .         .   -0.25 %    0.04 %      .
| ue4_elemental                586      .   -0.02 %   -13.93 %          .         .   -0.13 %    0.02 %      .
| ue4_lightroom_inter..         74      .   -0.17 %   -70.00 %          .         .   -1.27 %         .      .
| ue4_realistic_rende..         92      .         .   -32.58 %          .         .   -0.35 %         .      .
| unigine_heaven               322      .    0.12 %   -54.17 %          .         .   -1.42 %   -0.12 %      .
| unigine_sanctuary            264      .         .          .          .         .         .         .      .
| unigine_tropics              210      .         .          .          .         .         .         .      .
| unigine_valley               278      .   -0.15 %   -40.74 %          .         .   -2.00 %    0.09 %      .
| unity                         72      .         .          .          .         .    0.03 %         .      .
| warsow                       176      .         .          .          .         .         .         .      .
| warzone2100                    4      .         .          .          .         .    0.13 %         .      .
| witcher2                    1040      .   -0.03 %   -86.28 %          .         .   -0.28 %    0.01 %      .
| xcom_enemy_within           1236      .   -0.24 %   -63.54 %          .         .   -0.93 %    0.18 %      .
| yofrankie                     82      .   -0.61 %  -100.00 %          .         .   -0.83 %    0.41 %      .
| ------------------------------------------------------------------------------------------------------------
| Total                      26002      .   -0.13 %   -27.03 %          .   -0.24 %   -1.57 %    0.10 %      .
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* gallivm: add helper lp_add_attr_dereferenceable | Marek Olšák | 2016-07-13 | 2 | -0/+14
| | | | | | | | | Not sure if this is the right way to do it, but it seems to work. v2: make it a no-op on LLVM <= 3.5 Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: clean up shader value metadata code | Marek Olšák | 2016-07-13 | 1 | -15/+19
| | | | | | | No change in behavior. BTW, tbaa_md_kind == 1, which was the magic number in the code. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove LLVMNoUnwindAttribute uses | Marek Olšák | 2016-07-13 | 1 | -36/+31
| | | | | | always set by gallivm Reviewed-by: Nicolai Hähnle <[email protected]>