Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | nv50/ir: set position before i instead of i->next in NV50LoweringPreSSA::visit | Bryan Cain | 2012-07-20 | 1 | -7/+2 |
| | | | | | Fixes rendering glitches in Psychonauts such as Raz's eyes flickering white. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=51962. | ||||
* | nv50/ir: make colorful ir dump output optional | Marcin Slusarz | 2012-06-28 | 1 | -5/+17 |
| | |||||
* | nv50/ir: handle NEG,ABS modifiers for short RCP encoding | Christoph Bumiller | 2012-06-14 | 1 | -0/+2 |
| | |||||
* | nvc0/ir: allow 64-bit constant loads on nve4 | Christoph Bumiller | 2012-05-29 | 1 | -0/+2 |
| | | | | Looks like only 128-bit access doesn't work. | ||||
* | nvc0/ir: fix texture barrier insertion to prevent WAW hazards | Christoph Bumiller | 2012-05-29 | 4 | -8/+10 |
| | | | | Fixes, for instance, object highlighting in Diablo 3 (wine). | ||||
* | nvc0/ir: TEX doesn't support JOIN modifier either | Christoph Bumiller | 2012-05-29 | 1 | -0/+1 |
| | |||||
* | nv50/ir: fix reversed order of lane ops in quadops | Christoph Bumiller | 2012-05-17 | 1 | -2/+3 |
| | |||||
* | gallium/tgsi: s/TGSI_BUFFER/TGSI_TEXTURE_BUFFER/ | José Fonseca | 2012-05-11 | 1 | -2/+1 |
| | | | | | | For consistency. Reviewed-by: Brian Paul | ||||
* | gallium/tgsi: Redefine the TGSI_TEXTURE_UNKNOWN texture target. | José Fonseca | 2012-05-11 | 1 | -0/+2 |
| | | | | | | | | | Some code relies on the existence of an invalid texture target. It seems safer to bring it back than to deal with unintended consequences. This partially reverts commit a4ebb04214bab1cd9bd41967232ec89441e31744. Reviewed-by: Brian Paul | ||||
* | gallium/tgsi: Define the TGSI_BUFFER texture target. | Francisco Jerez | 2012-05-11 | 1 | -2/+2 |
| | | | | | | This texture type was already referred to by the documentation but it was never defined. Define it as 0 to match the pipe_texture_target enumeration values. | ||||
* | gallium/tgsi: Move interpolation info from tgsi_declaration to a separate token. | Francisco Jerez | 2012-05-11 | 1 | -2/+2 |
| | | | | | | Move Interpolate, Centroid and CylindricalWrap from tgsi_declaration to a separate token -- they only make sense for FS inputs and we need room for other flags in the top-level declaration token. | ||||
* | gallium/tgsi: Split sampler views from shader resources. | Francisco Jerez | 2012-05-11 | 1 | -16/+18 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit splits the current concept of resource into "sampler views" and "shader resources": "Sampler views" are textures or buffers that are bound to a given shader stage and can be read from in conjunction with a sampler object. They are analogous to OpenGL texture objects or Direct3D SRVs. "Shader resources" are textures or buffers that can be read and written from a shader. There's no support for floating point coordinates, address wrap modes or filtering, and, unlike sampler views, shader resources are global for the whole graphics pipeline. They are analogous to OpenGL image objects (as in ARB_shader_image_load_store) or Direct3D UAVs. Most hardware is likely to implement shader resources and sampler views as separate objects, so, having the distinction at the API level simplifies things slightly for the driver. This patch introduces the SVIEW register file with a declaration token and syntax analogous to the already existing RES register file. After this change, the SAMPLE_* opcodes no longer accept a resource as input, but rather a SVIEW object. To preserve the functionality of reading from a sampler view with integer coordinates, the SAMPLE_I(_MS) opcodes are introduced which are similar to LOAD(_MS) but take a SVIEW register instead of a RES register as argument. | ||||
* | nv50/ir/opt: don't lose saturation in tryCollapseChainedMULs | Christoph Bumiller | 2012-05-06 | 1 | -2/+3 |
| | |||||
* | nvc0/ir: fix lowering of textureGrad | Christoph Bumiller | 2012-05-06 | 1 | -4/+4 |
| | |||||
* | nv50/ir: move expansion of IMUL to later stage and handle memory operands | Christoph Bumiller | 2012-05-04 | 4 | -17/+51 |
| | |||||
* | nv50: enable array textures | Christoph Bumiller | 2012-05-04 | 1 | -1/+2 |
| | |||||
* | nvc0/ir/opt: INTERP does not support JOIN | Christoph Bumiller | 2012-04-29 | 1 | -0/+2 |
| | |||||
* | nv50/ir/opt: try to convert ABS(SUB) to SAD | Christoph Bumiller | 2012-04-29 | 5 | -15/+162 |
| | |||||
* | nvc0/ir: initial implementation of nve4 scheduling hints | Christoph Bumiller | 2012-04-29 | 5 | -4/+141 |
| | |||||
* | nvc0/ir: implement better placement of texture barriers | Christoph Bumiller | 2012-04-29 | 7 | -6/+58 |
| | | | | | Put them before first uses instead of right after the texturing instruction and cull unnecessary barriers. | ||||
* | nv50/ir/tgsi: fix handling of early RET | Christoph Bumiller | 2012-04-29 | 1 | -4/+5 |
| | | | | We have to actually emit RET, too, of course, not just the PRERET. | ||||
* | nv50/ir/opt: swap VP inputs to first source where possible | Christoph Bumiller | 2012-04-19 | 1 | -0/+17 |
| | |||||
* | nvc0: add initial support for nve4+ (Kepler) chipsets | Christoph Bumiller | 2012-04-15 | 6 | -4/+11 |
| | | | | | | | | | Most things that work on Fermi should work on Kepler too. There are a few performance optimizations left to do, like better placement of texture barriers and adding scheduling data to the shader instructions (without them, a thread group will be masked for 32 cycles after each single instruction issue). | ||||
* | nv50/ir/opt: extend handleCVT for nv50's SET u32 to f32 chain | Christoph Bumiller | 2012-04-14 | 1 | -1/+17 |
| | |||||
* | nv50/ir: print interpolation mode | Christoph Bumiller | 2012-04-14 | 1 | -0/+22 |
| | |||||
* | nv50: hook up to new shader code generator | Christoph Bumiller | 2012-04-14 | 3 | -0/+5 |
| | |||||
* | nv50/ir: import nv50 target | Christoph Bumiller | 2012-04-14 | 11 | -219/+2472 |
| | |||||
* | nv50/ir: fix off-by-ones in CSE and nvc0 insnCanLoad | Christoph Bumiller | 2012-04-14 | 1 | -1/+1 |
| | |||||
* | nv50/ir/tgsi: generate UCPs with actual outputs instead of SVs | Christoph Bumiller | 2012-04-14 | 1 | -4/+20 |
| | | | | gl_ClipDistance is treated the same way, this is just nicer and easier to assign slots for them on nv50. | ||||
* | nv50/ir: Fix type of the instruction created by mkCmp() for dst in FILE_FLAGS. | Francisco Jerez | 2012-04-14 | 1 | -1/+2 |
| | |||||
* | nv50/ir: fix Instruction::isCommutationLegal for WAW | Christoph Bumiller | 2012-04-14 | 1 | -4/+14 |
| | |||||
* | nv50/ir/opt: Add isOptSupported() check in logical arith optimization. | Francisco Jerez | 2012-04-14 | 1 | -8/+5 |
| | |||||
* | nv50/ir/ra: Fix live set propagation in the secondary passes of buildLiveSets(). | Francisco Jerez | 2012-04-14 | 1 | -3/+3 |
| | |||||
* | nv50/ir/opt: don't regard OP_WRSV as dead code | Christoph Bumiller | 2012-04-14 | 1 | -1/+2 |
| | |||||
* | nv50/ir: add isUniform query to Values | Christoph Bumiller | 2012-04-14 | 2 | -0/+24 |
| | |||||
* | nv50/ir: rewrite the register allocator as GCRA, with spilling | Christoph Bumiller | 2012-04-14 | 10 | -414/+1473 |
| | | | | | This is more flexible than the linear scan, and we don't need the separate allocation pass for constrained values anymore. | ||||
* | nv50/ir/tgsi: only export x-component of PSIZE | Christoph Bumiller | 2012-04-14 | 1 | -1/+5 |
| | |||||
* | nv50/ir: Fix BuildUtil::mkSelect and mkClobber | Francisco Jerez | 2012-04-14 | 1 | -6/+2 |
| | |||||
* | nv50/ir: fix reg file conflicts with undefined-value placeholders | Christoph Bumiller | 2012-04-14 | 1 | -10/+19 |
| | |||||
* | nv50/ir/opt: silence warning (int < Elements() signedness) | Christoph Bumiller | 2012-04-14 | 1 | -1/+1 |
| | |||||
* | nv50/ir/opt: fix combineSt access to wrong instruction | Christoph Bumiller | 2012-04-14 | 1 | -1/+1 |
| | |||||
* | nv50/ir/opt: another insn NULL check in phi elimination | Christoph Bumiller | 2012-04-14 | 1 | -0/+2 |
| | |||||
* | nv50/ir/ssa: Take into account function inputs and outputs. | Francisco Jerez | 2012-04-14 | 1 | -2/+30 |
| | |||||
* | nv50/ir: Clean up before calculating instruction ordering for a new function. | Francisco Jerez | 2012-04-14 | 2 | -0/+16 |
| | |||||
* | nv50/ir/ra: Allocate registers for function arguments. | Francisco Jerez | 2012-04-14 | 1 | -0/+6 |
| | |||||
* | nv50/ir: Take into account function args in the live range calculation code. | Francisco Jerez | 2012-04-14 | 2 | -3/+28 |
| | |||||
* | nv50/ir/ra: Use matching physical regs for function args in caller and callee. | Francisco Jerez | 2012-04-14 | 1 | -6/+83 |
| | |||||
* | nv50/ir/tgsi: Infer function inputs/outputs. | Francisco Jerez | 2012-04-14 | 2 | -0/+87 |
| | | | | | | | Edit: Don't do it for the main function of (graphics) shaders, its inputs and outputs always go through TGSI_FILE_INPUT/OUTPUT. This prevents all TEMPs from counting as live out and reduces register pressure. | ||||
* | nv50/ir/tgsi: Replace the inlining logic with proper function calls. | Francisco Jerez | 2012-04-14 | 5 | -68/+82 |
| | |||||
* | nv50/ir: Decouple DataArray from the dictionary that maps locations to values. | Francisco Jerez | 2012-04-14 | 4 | -223/+236 |
| | | | | | | | | | | | The point is to keep an independent dictionary for each function. The array that was being used as dictionary has been converted into a "bimap" for two different reasons: first, because having an almost empty instance of an array with as many entries as registers there are in the program, once for every function, would be wasteful, and second, because we want to be able to map Value pointers back to locations at some point. |
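
The last entry above describes replacing a per-register array with a "bimap" so that each function can keep its own small dictionary and so that values can be mapped back to locations. The following is a minimal sketch of that idea, not the code from the commit: `BiMap`, `Value`, `FunctionState`, and the use of a plain `int` as the location are all hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-in for an IR value; not the nv50 ir Value class.
struct Value { int id; };

// Two-way dictionary: location -> value and value -> location.
// Storage grows with the number of entries actually used, not with the
// total number of registers in the program.
template <typename L, typename V>
class BiMap {
public:
    // Record a pairing in both directions, dropping any stale pairing
    // that involved either the location or the value.
    void insert(const L &loc, const V &val) {
        auto f = fwd_.find(loc);
        if (f != fwd_.end())
            rev_.erase(f->second);
        auto r = rev_.find(val);
        if (r != rev_.end())
            fwd_.erase(r->second);
        fwd_[loc] = val;
        rev_[val] = loc;
    }

    // Forward lookup; returns nullptr if the location is unmapped.
    const V *lookup(const L &loc) const {
        auto it = fwd_.find(loc);
        return it == fwd_.end() ? nullptr : &it->second;
    }

    // Reverse lookup; returns nullptr if the value is unmapped.
    const L *locate(const V &val) const {
        auto it = rev_.find(val);
        return it == rev_.end() ? nullptr : &it->second;
    }

    std::size_t size() const { return fwd_.size(); }

private:
    std::unordered_map<L, V> fwd_;
    std::unordered_map<V, L> rev_;
};

// Each function owns an independent dictionary (assumed layout).
struct FunctionState {
    BiMap<int, const Value *> regs;
};
```

With this layout, two `FunctionState` objects can map the same location to different values without interfering, and a function that touches only a few registers stores only those few entries.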