Commit message | Author | Age | Files | Lines
* i965/fs: add a helper function to create double immediates | Iago Toral Quiroga | 2016-07-13 | 2 | -0/+40
| Gen7 hardware does not support double immediates, so these need to be moved in 32-bit chunks to a regular vgrf instead. Instead of doing this every time we need to create a DF immediate, create a helper function that does the right thing depending on the hardware generation.
|
| v2:
| - Define setup_imm_df() as an independent function (Curro)
| - Create a specific builder to get rid of some instruction field assignments (Curro).
|
| v3:
| - Get devinfo from builder (Kenneth)
|
| Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
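The "32-bit chunks" idea in the message above can be illustrated outside the driver. The following is a minimal sketch in plain C, not the i965 helper itself: `split_double()` and `join_double()` are hypothetical names, and the code only shows how a 64-bit double decomposes into the two 32-bit words that would be moved separately.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: split the bit pattern of a double into its
 * low and high 32-bit words. */
void split_double(double value, uint32_t *lo, uint32_t *hi)
{
    uint64_t bits;

    /* Assumption: double is a 64-bit IEEE 754 value on this platform. */
    assert(sizeof(bits) == sizeof(value));
    memcpy(&bits, &value, sizeof(bits));

    *lo = (uint32_t)(bits & 0xffffffffu);
    *hi = (uint32_t)(bits >> 32);
}

/* Hypothetical helper: rebuild the double from the two 32-bit words. */
double join_double(uint32_t lo, uint32_t hi)
{
    uint64_t bits = ((uint64_t)hi << 32) | lo;
    double value;

    memcpy(&value, &bits, sizeof(value));
    return value;
}
```

Using `memcpy` rather than a pointer cast keeps the bit reinterpretation well defined in C.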
* vc4: Validate QPU uniform pointer updates. | Eric Anholt | 2016-07-12 | 1 | -0/+22
|
* vc4: Add support for NIR loops and break/continue. | Eric Anholt | 2016-07-12 | 2 | -3/+79
|
* vc4: Add support for emitting NIR IF nodes. | Eric Anholt | 2016-07-12 | 1 | -1/+91
|
* vc4: Add support for storing to NIR registers in a non-SSA fashion. | Eric Anholt | 2016-07-12 | 2 | -85/+144
| Previously, there were occasionally NIR registers in our programs, but they were always actually used SSA-only. Now that we're trying to support control flow, we need to actually conditionally move to registers based on whether channels are active or not.
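The conditional move described above can be pictured as a per-channel masked store. This is a simplified sketch in plain C and not the vc4 backend code: `masked_store()`, `NUM_CHANNELS`, and the bitmask representation of active channels are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: 16 SIMD channels, one mask bit each. */
#define NUM_CHANNELS 16

/* Hypothetical helper: write src into reg only for channels whose bit
 * is set in active_mask; inactive channels keep their old value. */
void masked_store(uint32_t *reg, const uint32_t *src, uint16_t active_mask)
{
    assert(reg != NULL && src != NULL);

    for (int ch = 0; ch < NUM_CHANNELS; ch++) {
        if (active_mask & (1u << ch))
            reg[ch] = src[ch];
    }
}
```

With an SSA-only value every channel would be overwritten unconditionally; the mask is what lets a register carry different values across the two sides of a branch.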
* vc4: Add a flag in the screen to track control flow support. | Eric Anholt | 2016-07-12 | 3 | -1/+14
| For now it's still always false, but I need it in place for kernel backwards compat support as I extend the backend for control flow.
* vc4: Define a QIR branch instruction | Eric Anholt | 2016-07-12 | 4 | -9/+61
| This uses the branch condition code in inst->cond to jump to either successor[0] (condition matches) or successor[1] (condition doesn't match).
* vc4: Add kernel support for branching in shader validation. | Eric Anholt | 2016-07-12 | 3 | -17/+280
| We're already checking that branch instructions are within the contents of the shader and that the proper PROG_END sequence is present. The other thing we need in the presence of branching is to verify that the shader doesn't overflow past the end of the uniforms stream.
|
| To do that, we require that any basic block reading uniforms start with the following instructions:
|
|     load_imm temp, <offset within uniform stream>
|     add unif_addr, temp, unif
|
| The instructions are generated by userspace, and the kernel verifies that the load_imm is of the expected offset, and that the add adds it to a uniform. We track which uniform in the stream that is, and at draw call time fix up the uniform stream to have the address of the start of the shader's uniforms for that draw call.
|
| Signed-off-by: Eric Anholt <[email protected]>
* vc4: Add a bitmap of branch targets in kernel validation. | Eric Anholt | 2016-07-12 | 3 | -2/+133
| This isn't used yet, it's just a first step toward loop validation. During the main parsing of instructions, we need to know when we hit a new basic block so that we can reset validated state.
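A branch-target bitmap of the kind mentioned above can be sketched in a few lines of standalone C. This is not the kernel validator's code: `struct branch_targets`, `mark_branch_target()`, `is_branch_target()`, and the `MAX_INSTRUCTIONS` limit are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption for this sketch: a fixed upper bound on shader length. */
#define MAX_INSTRUCTIONS 1024
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* One bit per instruction; a set bit means "some branch jumps here". */
struct branch_targets {
    unsigned long words[(MAX_INSTRUCTIONS + BITS_PER_WORD - 1) / BITS_PER_WORD];
};

/* Record that instruction index ip is the destination of a branch. */
void mark_branch_target(struct branch_targets *bt, size_t ip)
{
    assert(ip < MAX_INSTRUCTIONS);
    bt->words[ip / BITS_PER_WORD] |= 1ul << (ip % BITS_PER_WORD);
}

/* Return true if instruction index ip starts a new basic block
 * because a branch targets it. */
bool is_branch_target(const struct branch_targets *bt, size_t ip)
{
    assert(ip < MAX_INSTRUCTIONS);
    return (bt->words[ip / BITS_PER_WORD] >> (ip % BITS_PER_WORD)) & 1ul;
}
```

A validator built along these lines would fill the bitmap in a first pass over the branches, then query it during the main instruction walk to decide when to reset per-block state.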
* vc4: Track the current instruction into the validation_state. | Eric Anholt | 2016-07-12 | 1 | -24/+30
| This reduces how much we need to pass around as arguments, which was becoming more of a problem with looping validation.
* vc4: Add QPU support for generating BRANCH instructions. | Eric Anholt | 2016-07-12 | 5 | -1/+85
|
* vc4: Print live variable start/ends during QIR dumping. | Eric Anholt | 2016-07-12 | 1 | -0/+45
| This only happens when live variables are set up, which is not in the normal dump, but is set up when we've failed to register allocate.
* vc4: Implement live intervals using a CFG. | Eric Anholt | 2016-07-12 | 6 | -39/+393
| Right now our CFG is always a trivial single basic block, but that will change when we enable loops.
* vc4: Make vc4_qir_schedule handle each block in the program. | Eric Anholt | 2016-07-12 | 1 | -14/+23
| Basically we just treat each block independently. The only inter-block scheduling I can think of that would be interesting would be to move texture result collection to after a short loop/if block that doesn't do texturing. However, the kernel disallows that as part of its security validation.
* vc4: Convert uniforms lowering to work with multiple blocks. | Eric Anholt | 2016-07-12 | 1 | -29/+44
| We still decide which uniform to lower based on how many instructions-that-need-lowering use that uniform, but now we emit a new temporary uniform load in each of the basic blocks containing an instruction being lowered.
|
| This commit is best reviewed with diff -b.
* vc4: Convert vc4_opt_peephole_sf to work with control flow. | Eric Anholt | 2016-07-12 | 1 | -4/+18
| We need to apply the peephole pass to each of the blocks in the program. We don't do dataflow analysis for SF across blocks, but we also don't generate code that would need us to do so.
* vc4: Create a basic block structure and move the instructions into it. | Eric Anholt | 2016-07-12 | 6 | -20/+122
| The optimization passes and scheduling aren't actually ready for multiple blocks with control flow yet (as seen by the "cur_block" references in them instead of iterating over blocks), but this creates the structures necessary for converting them.
* vc4: Add a "qir_for_each_inst_inorder" macro and use it in many places. | Eric Anholt | 2016-07-12 | 12 | -14/+17
| We have the prior list_foreach() all over the code, but I need to move where instructions live as part of adding support for control flow. Start by just converting to a helper iterator macro. (The simpler "qir_for_each_inst()" will be used for the for-each-inst-in-a-block iterator macro later)
* vc4: Also enable phi elimination. | Eric Anholt | 2016-07-12 | 1 | -0/+1
| This avoids a bunch of code gen regressions when enabling loops in vc4. Prior to that, the GLSL that would have generated these optimizable phi nodes was being lowered to csels between either (undef, a) or (a, a), and those were being dealt with by nir_opt_undef and nir_opt_algebraic.
* vc4: fix memory leak | Eric Engestrom | 2016-07-12 | 1 | -1/+1
| The allocation has succeeded by that point, so it needs to be freed.
|
| CovID: 1358929
| Signed-off-by: Eric Engestrom <[email protected]>
| Reviewed-by: Eric Anholt <[email protected]>
* vc4: Close our screen's fd on screen close. | Eric Anholt | 2016-07-12 | 1 | -0/+3
| We're passed in a freshly dup()ed fd on screen create, so we should close it on exit. Debugged by Hugh Cole-Baker.
* nir: Add optimization for (a || True == True) | Eric Anholt | 2016-07-12 | 1 | -0/+1
| This was appearing in vc4 VS/CS in mupen64, due to vertex attrib lowering producing some constants that were getting compared.
|
| total instructions in shared programs: 112276 -> 112198 (-0.07%)
| instructions in affected programs: 2239 -> 2161 (-3.48%)
| total estimated cycles in shared programs: 283102 -> 283038 (-0.02%)
| estimated cycles in affected programs: 2365 -> 2301 (-2.71%)
|
| Reviewed-by: Jason Ekstrand <[email protected]>
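The identity behind this optimization, that a logical OR with a constant true operand is always true, can be shown on a toy expression tree. This is a sketch in plain C and not how NIR expresses its algebraic rules: `struct expr`, `enum expr_kind`, and `simplify_or()` are hypothetical names used only to demonstrate the fold.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression node, invented for this sketch. */
enum expr_kind { EXPR_CONST_BOOL, EXPR_VAR, EXPR_OR };

struct expr {
    enum expr_kind kind;
    bool value;              /* meaningful only for EXPR_CONST_BOOL */
    const struct expr *lhs;  /* meaningful only for EXPR_OR */
    const struct expr *rhs;  /* meaningful only for EXPR_OR */
};

static bool is_const_true(const struct expr *e)
{
    return e != NULL && e->kind == EXPR_CONST_BOOL && e->value;
}

/* Fold (a || true) and (true || a) to the constant-true operand.
 * Any other expression is returned unchanged. */
const struct expr *simplify_or(const struct expr *e)
{
    if (e == NULL || e->kind != EXPR_OR)
        return e;
    if (is_const_true(e->lhs))
        return e->lhs;
    if (is_const_true(e->rhs))
        return e->rhs;
    return e;
}
```

Once the OR collapses to a constant, later passes can drop whatever computed `a` if nothing else uses it, which is consistent with the instruction-count reduction reported in the message.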
* swr: [rasterizer core] correct MSAA behavior for conservative rasterization | Tim Rowley | 2016-07-12 | 3 | -11/+31
| Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] conservative rast backend changes | Tim Rowley | 2016-07-12 | 8 | -221/+538
| Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer] buckets cleanup | Tim Rowley | 2016-07-12 | 4 | -12/+43
| Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] make all api functions call GetContext | Tim Rowley | 2016-07-12 | 1 | -14/+14
| Small api cleanup. Make all api functions call GetContext instead of locally casting handle. Makes debugging easier by providing a single point to track context changes.
|
| Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer] add support for llvm-3.9 | Tim Rowley | 2016-07-12 | 2 | -15/+28
| v2: use signed compare, remove unneeded vmask
|
| Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer jitter] fix llvm-3.7 compile | Tim Rowley | 2016-07-12 | 1 | -0/+5
| d3d97f8 broke llvm-3.7, which has a mismatched API for setDataLayout/getDataLayout.
|
| Signed-off-by: Tim Rowley <[email protected]>
* docs: remove duplicated line in 12.0.1 release notes file | Brian Paul | 2016-07-12 | 1 | -2/+0
| Signed-off-by: Brian Paul <[email protected]>
* st/omx/dec: convert decoder video buffer to progressive | Leo Liu | 2016-07-12 | 2 | -3/+68
| with encode tunneling
|
| The idea of encode tunneling is to use the video buffer directly for the encoder, but currently the encoder doesn't support interlaced surfaces, so the OMX decoder previously set a progressive surface for that purpose.
|
| Since we now poll the driver for interlacing information for the decoder, we get interlaced as the preferred format, as with other APIs (VDPAU, VA-API), which breaks transcoding with tunneling.
|
| The solution is, when a tunnel is detected, to re-allocate progressive target buffers and then convert the interlaced decoder results into them.
|
| This has been tested, with transcode results matching bit for bit those obtained before with progressive-to-progressive surfaces.
|
| Signed-off-by: Leo Liu <[email protected]>
| Acked-by: Christian König <[email protected]>
| Tested-by: Julien Isorce <[email protected]>
* vl/compositor: set layer of y or uv to render | Leo Liu | 2016-07-12 | 2 | -0/+42
| Signed-off-by: Leo Liu <[email protected]>
| Acked-by: Christian König <[email protected]>
| Tested-by: Julien Isorce <[email protected]>
* vl/compositor: add weave to yuv shader | Leo Liu | 2016-07-12 | 2 | -0/+43
| This shader converts interlaced YUV to progressive YUV.
|
| Signed-off-by: Leo Liu <[email protected]>
| Acked-by: Christian König <[email protected]>
| Tested-by: Julien Isorce <[email protected]>
* vl/compositor: move weave shader out from rgb weaving | Leo Liu | 2016-07-12 | 2 | -76/+83
| We'll use the weave shader in a later patch.
|
| Signed-off-by: Leo Liu <[email protected]>
| Acked-by: Christian König <[email protected]>
| Tested-by: Julien Isorce <[email protected]>
* glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast | Marek Olšák | 2016-07-12 | 1 | -5/+7
| This bug is uncovered by glsl/lower_if_to_cond_assign. I don't know if it can be reproduced in any other way.
|
| Cc: <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
* clover/api: Implement clLinkProgram per-device binary presence validation rule. | Francisco Jerez | 2016-07-11 | 1 | -2/+31
| Reviewed-by: Serge Martin <[email protected]>
| Tested-by: Jan Vesely <[email protected]>
* clover: Add clLinkProgram (CL 1.2). | Serge Martin | 2016-07-11 | 1 | -4/+27
| [ Francisco Jerez: Use validate_build_common for error checking, simplify control flow slightly and handle additional exception types. ]
|
| Reviewed-by: Francisco Jerez <[email protected]>
| Tested-by: Jan Vesely <[email protected]>
* clover: Trivial cleanups for api/program.cpp. | Francisco Jerez | 2016-07-11 | 1 | -9/+8
| Reviewed-by: Serge Martin <[email protected]>
| Tested-by: Jan Vesely <[email protected]>
* clover/core: Remove compiler.hpp. | Francisco Jerez | 2016-07-11 | 4 | -37/+3
| header_map was the only definition left in compiler.hpp, move it into program.hpp which is its only user in clover/core.
|
| Reviewed-by: Serge Martin <[email protected]>
| Tested-by: Jan Vesely <[email protected]>
* clover/llvm: Get rid of compile_program_llvm(). | Francisco Jerez | 2016-07-11 | 2 | -18/+0
| Superseded by compile_program() and link_program().
|
| Reviewed-by: Serge Martin <[email protected]>
| Tested-by: Jan Vesely <[email protected]>
* clover: Provide separate program methods for compilation and linking. | Francisco Jerez | 2016-07-11 | 3 | -12/+42
| [ Serge Martin: Fix inverted opts and log build ctor args. Keep the log related to the build. Fix indentation ]
|
| Reviewed-by: Serge Martin <[email protected]>
| Tested-by: Jan Vesely <[email protected]>
* clover: Unify program::build_* into a single method returning a struct. | Francisco Jerez | 2016-07-11 | 4 | -50/+39
| This gets rid of the program::build_* query methods and replaces them with the program::build() method that returns a single data structure containing all parameters for the last build done on the given target device (including build logs, options and the binary itself).
|
| [ Serge Martin: Fix inverted opts and log build ctor args ]
|
| Reviewed-by: Serge Martin <[email protected]>
| Tested-by: Jan Vesely <[email protected]>
* clover: Change program::build opts argument to std::string. | Serge Martin | 2016-07-11 | 2 | -3/+3
| Reviewed-by: Francisco Jerez <[email protected]>
| Tested-by: Jan Vesely <[email protected]>
* clover: Define error subclass to signal build option parse failure. | Francisco Jerez | 2016-07-11 | 3 | -3/+11
| Reviewed-by: Serge Martin <[email protected]>
| Tested-by: Jan Vesely <[email protected]>
* clover: Move back to using build_error to signal compilation failure. | Francisco Jerez | 2016-07-11 | 5 | -17/+17
| This partially reverts 7e0180d57d330bd8d3047e841086712376b2a1cc. Having two different exception subclasses for compilation and linking makes it more difficult to share or move code between the two codepaths, because the exact same function under the same error condition would need to throw one exception or the other depending on what top-level API is being implemented with it. There is little benefit anyway because clCompileProgram() and clLinkProgram() can tell whether they are linking or compiling a program.
|
| Reviewed-by: Serge Martin <[email protected]>
| Tested-by: Jan Vesely <[email protected]>
* clover: Override ret_object. | Serge Martin | 2016-07-11 | 1 | -0/+11
| Return an API object from an intrusive reference to a Clover object, incrementing the reference count of the object.
|
| Reviewed-by: Francisco Jerez <[email protected]>
| Tested-by: Jan Vesely <[email protected]>
* clover/tgsi: Add stub link_program() function. | Francisco Jerez | 2016-07-11 | 2 | -0/+9
| Reviewed-by: Serge Martin <[email protected]>
| Tested-by: Jan Vesely <[email protected]>
* clover/tgsi: Move compiler entry point declaration into tgsi directory and namespace. | Francisco Jerez | 2016-07-11 | 5 | -7/+42
| Reviewed-by: Serge Martin <[email protected]>
| Tested-by: Jan Vesely <[email protected]>
* clover/llvm: Implement the -create-library linker option. | Francisco Jerez | 2016-07-11 | 2 | -24/+44
| [ Serge Martin: disable internalize pass when building a library. Otherwise some functions may be inlined and removed ]
|
| Reviewed-by: Serge Martin <[email protected]>
| Tested-by: Jan Vesely <[email protected]>
* clover/llvm: Implement linkage of multiple clover modules. | Francisco Jerez | 2016-07-11 | 2 | -3/+35
| Reviewed-by: Serge Martin <[email protected]>
| Tested-by: Jan Vesely <[email protected]>
* clover/llvm: Split compilation and linking. | Francisco Jerez | 2016-07-11 | 3 | -15/+91
| Split the work previously done by compile_program_llvm() into compile_program() (which simply runs the front-end and serializes the resulting LLVM IR) and link_program() (which takes care of everything else down to binary codegen).
|
| [ Serge Martin: allow LLVM IR dump after compilation ]
|
| Reviewed-by: Serge Martin <[email protected]>
| Tested-by: Jan Vesely <[email protected]>