path: root/src/gallium/drivers/nv50/codegen
Commit message | Author | Age | Files | Lines
* nv50/ir: set position before i instead of i->next in NV50LoweringPreSSA::visit | Bryan Cain | 2012-07-20 | 1 | -7/+2
| Fixes rendering glitches in Psychonauts such as Raz's eyes flickering white.
| Fixes https://bugs.freedesktop.org/show_bug.cgi?id=51962.
* nv50/ir: make colorful ir dump output optional | Marcin Slusarz | 2012-06-28 | 1 | -5/+17
* nv50/ir: handle NEG,ABS modifiers for short RCP encoding | Christoph Bumiller | 2012-06-14 | 1 | -0/+2
* nvc0/ir: allow 64-bit constant loads on nve4 | Christoph Bumiller | 2012-05-29 | 1 | -0/+2
| Looks like only 128-bit access doesn't work.
* nvc0/ir: fix texture barrier insertion to prevent WAW hazards | Christoph Bumiller | 2012-05-29 | 4 | -8/+10
| Fixes, for instance, object highlighting in Diablo 3 (wine).
* nvc0/ir: TEX doesn't support JOIN modifier either | Christoph Bumiller | 2012-05-29 | 1 | -0/+1
* nv50/ir: fix reversed order of lane ops in quadops | Christoph Bumiller | 2012-05-17 | 1 | -2/+3
* gallium/tgsi: s/TGSI_BUFFER/TGSI_TEXTURE_BUFFER/ | José Fonseca | 2012-05-11 | 1 | -2/+1
| For consistency.
| Reviewed-by: Brian Paul <[email protected]>
* gallium/tgsi: Redefine the TGSI_TEXTURE_UNKNOWN texture target. | José Fonseca | 2012-05-11 | 1 | -0/+2
| Some code relies on the existence of an invalid texture target. It seems safer to bring it back than to deal with unintended consequences.
| This partially reverts commit a4ebb04214bab1cd9bd41967232ec89441e31744.
| Reviewed-by: Brian Paul <[email protected]>
* gallium/tgsi: Define the TGSI_BUFFER texture target. | Francisco Jerez | 2012-05-11 | 1 | -2/+2
| This texture type was already referred to by the documentation but it was never defined. Define it as 0 to match the pipe_texture_target enumeration values.
* gallium/tgsi: Move interpolation info from tgsi_declaration to a separate token. | Francisco Jerez | 2012-05-11 | 1 | -2/+2
| Move Interpolate, Centroid and CylindricalWrap from tgsi_declaration to a separate token -- they only make sense for FS inputs and we need room for other flags in the top-level declaration token.
* gallium/tgsi: Split sampler views from shader resources. | Francisco Jerez | 2012-05-11 | 1 | -16/+18
| This commit splits the current concept of resource into "sampler views" and "shader resources":
|
| "Sampler views" are textures or buffers that are bound to a given shader stage and can be read from in conjunction with a sampler object. They are analogous to OpenGL texture objects or Direct3D SRVs.
|
| "Shader resources" are textures or buffers that can be read and written from a shader. There's no support for floating point coordinates, address wrap modes or filtering, and, unlike sampler views, shader resources are global for the whole graphics pipeline. They are analogous to OpenGL image objects (as in ARB_shader_image_load_store) or Direct3D UAVs.
|
| Most hardware is likely to implement shader resources and sampler views as separate objects, so, having the distinction at the API level simplifies things slightly for the driver.
|
| This patch introduces the SVIEW register file with a declaration token and syntax analogous to the already existing RES register file. After this change, the SAMPLE_* opcodes no longer accept a resource as input, but rather a SVIEW object. To preserve the functionality of reading from a sampler view with integer coordinates, the SAMPLE_I(_MS) opcodes are introduced which are similar to LOAD(_MS) but take a SVIEW register instead of a RES register as argument.
* nv50/ir/opt: don't lose saturation in tryCollapseChainedMULs | Christoph Bumiller | 2012-05-06 | 1 | -2/+3
* nvc0/ir: fix lowering of textureGrad | Christoph Bumiller | 2012-05-06 | 1 | -4/+4
* nv50/ir: move expansion of IMUL to later stage and handle memory operands | Christoph Bumiller | 2012-05-04 | 4 | -17/+51
* nv50: enable array textures | Christoph Bumiller | 2012-05-04 | 1 | -1/+2
* nvc0/ir/opt: INTERP does not support JOIN | Christoph Bumiller | 2012-04-29 | 1 | -0/+2
* nv50/ir/opt: try to convert ABS(SUB) to SAD | Christoph Bumiller | 2012-04-29 | 5 | -15/+162
* nvc0/ir: initial implementation of nve4 scheduling hints | Christoph Bumiller | 2012-04-29 | 5 | -4/+141
* nvc0/ir: implement better placement of texture barriers | Christoph Bumiller | 2012-04-29 | 7 | -6/+58
| Put them before first uses instead of right after the texturing instruction and cull unnecessary barriers.
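The placement idea described above can be illustrated with a toy pass. This is only a sketch under stated assumptions, not the nvc0 code: `Insn` and `placeTexBarriers` are hypothetical names, instructions are modelled as a flat list, and a single barrier is assumed to wait for every outstanding texture fetch, which is a simplification of what the hardware barrier does.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy instruction record (hypothetical; not the nv50_ir Instruction class).
struct Insn {
    std::string op;         // "tex", "texbar", or any other opcode name
    int def;                // id of the value written, -1 if none
    std::vector<int> uses;  // ids of the values read
};

// Emit a "texbar" right before the first instruction that reads a texture
// result that is still in flight. Because one barrier is assumed to cover all
// pending fetches, later reads of those results get no barrier of their own.
std::vector<Insn> placeTexBarriers(const std::vector<Insn> &in)
{
    std::vector<Insn> out;
    std::set<int> pending;  // texture results not yet covered by a barrier

    for (const Insn &i : in) {
        assert(i.op != "texbar");  // input is assumed to be barrier-free

        bool needsBarrier = false;
        for (int u : i.uses)
            if (pending.count(u))
                needsBarrier = true;

        if (needsBarrier) {
            out.push_back(Insn{"texbar", -1, {}});
            pending.clear();
        }
        out.push_back(i);

        if (i.op == "tex")
            pending.insert(i.def);
    }
    return out;
}
```

In this model, two back-to-back fetches followed by unrelated arithmetic get a single barrier just before the first instruction that consumes either result, so the arithmetic can overlap with the fetches.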
* nv50/ir/tgsi: fix handling of early RET | Christoph Bumiller | 2012-04-29 | 1 | -4/+5
| We have to actually emit RET, too, of course, not just the PRERET.
* nv50/ir/opt: swap VP inputs to first source where possible | Christoph Bumiller | 2012-04-19 | 1 | -0/+17
* nvc0: add initial support for nve4+ (Kepler) chipsets | Christoph Bumiller | 2012-04-15 | 6 | -4/+11
| Most things that work on Fermi should work on Kepler too.
| There are a few performance optimizations left to do, like better placement of texture barriers and adding scheduling data to the shader instructions (without them, a thread group will be masked for 32 cycles after each single instruction issue).
* nv50/ir/opt: extend handleCVT for nv50's SET u32 to f32 chain | Christoph Bumiller | 2012-04-14 | 1 | -1/+17
* nv50/ir: print interpolation mode | Christoph Bumiller | 2012-04-14 | 1 | -0/+22
* nv50: hook up to new shader code generator | Christoph Bumiller | 2012-04-14 | 3 | -0/+5
* nv50/ir: import nv50 target | Christoph Bumiller | 2012-04-14 | 11 | -219/+2472
* nv50/ir: fix off-by-ones in CSE and nvc0 insnCanLoad | Christoph Bumiller | 2012-04-14 | 1 | -1/+1
* nv50/ir/tgsi: generate UCPs with actual outputs instead of SVs | Christoph Bumiller | 2012-04-14 | 1 | -4/+20
| gl_ClipDistance is treated the same way; this is just nicer and makes it easier to assign slots for them on nv50.
* nv50/ir: Fix type of the instruction created by mkCmp() for dst in FILE_FLAGS. | Francisco Jerez | 2012-04-14 | 1 | -1/+2
* nv50/ir: fix Instruction::isCommutationLegal for WAW | Christoph Bumiller | 2012-04-14 | 1 | -4/+14
* nv50/ir/opt: Add isOptSupported() check in logical arith optimization. | Francisco Jerez | 2012-04-14 | 1 | -8/+5
* nv50/ir/ra: Fix live set propagation in the secondary passes of buildLiveSets(). | Francisco Jerez | 2012-04-14 | 1 | -3/+3
* nv50/ir/opt: don't regard OP_WRSV as dead code | Christoph Bumiller | 2012-04-14 | 1 | -1/+2
* nv50/ir: add isUniform query to Values | Christoph Bumiller | 2012-04-14 | 2 | -0/+24
* nv50/ir: rewrite the register allocator as GCRA, with spilling | Christoph Bumiller | 2012-04-14 | 10 | -414/+1473
| This is more flexible than the linear scan, and we don't need the separate allocation pass for constrained values anymore.
* nv50/ir/tgsi: only export x-component of PSIZE | Christoph Bumiller | 2012-04-14 | 1 | -1/+5
* nv50/ir: Fix BuildUtil::mkSelect and mkClobber | Francisco Jerez | 2012-04-14 | 1 | -6/+2
* nv50/ir: fix reg file conflicts with undefined-value placeholders | Christoph Bumiller | 2012-04-14 | 1 | -10/+19
* nv50/ir/opt: silence warning (int < Elements() signedness) | Christoph Bumiller | 2012-04-14 | 1 | -1/+1
* nv50/ir/opt: fix combineSt access to wrong instruction | Christoph Bumiller | 2012-04-14 | 1 | -1/+1
* nv50/ir/opt: another insn NULL check in phi elimination | Christoph Bumiller | 2012-04-14 | 1 | -0/+2
* nv50/ir/ssa: Take into account function inputs and outputs. | Francisco Jerez | 2012-04-14 | 1 | -2/+30
* nv50/ir: Clean up before calculating instruction ordering for a new function. | Francisco Jerez | 2012-04-14 | 2 | -0/+16
* nv50/ir/ra: Allocate registers for function arguments. | Francisco Jerez | 2012-04-14 | 1 | -0/+6
* nv50/ir: Take into account function args in the live range calculation code. | Francisco Jerez | 2012-04-14 | 2 | -3/+28
* nv50/ir/ra: Use matching physical regs for function args in caller and callee. | Francisco Jerez | 2012-04-14 | 1 | -6/+83
* nv50/ir/tgsi: Infer function inputs/outputs. | Francisco Jerez | 2012-04-14 | 2 | -0/+87
| Edit: Don't do it for the main function of (graphics) shaders; its inputs and outputs always go through TGSI_FILE_INPUT/OUTPUT. This prevents all TEMPs from counting as live out and reduces register pressure.
* nv50/ir/tgsi: Replace the inlining logic with proper function calls. | Francisco Jerez | 2012-04-14 | 5 | -68/+82
* nv50/ir: Decouple DataArray from the dictionary that maps locations to values. | Francisco Jerez | 2012-04-14 | 4 | -223/+236
| The point is to keep an independent dictionary for each function.
|
| The array that was being used as dictionary has been converted into a "bimap" for two different reasons: first, because having an almost empty instance of an array with as many entries as there are registers in the program, once for every function, would be wasteful, and second, because we want to be able to map Value pointers back to locations at some point.
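To make the "bimap" idea concrete, here is a minimal sketch of a two-way dictionary between locations and values. It is an illustration only and does not reproduce the structure added by the commit: `LocationBimap`, `Location`, and the stand-in `Value` struct are hypothetical names, and two ordered maps are assumed as the backing store. Only bound locations take up space, and a value pointer can be mapped back to its location.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical stand-ins for illustration; not the nv50_ir types.
struct Value { int id; };
typedef unsigned Location;  // e.g. a register index local to one function

class LocationBimap {
public:
    // Bind loc <-> val, dropping any earlier binding of either side so the
    // two maps always stay exact inverses of each other.
    void insert(Location loc, Value *val)
    {
        assert(val != nullptr);
        erase(loc);
        std::map<const Value *, Location>::iterator old = byValue.find(val);
        if (old != byValue.end()) {
            byLocation.erase(old->second);
            byValue.erase(old);
        }
        byLocation[loc] = val;
        byValue[val] = loc;
    }

    // Remove the binding for loc, if there is one.
    void erase(Location loc)
    {
        std::map<Location, Value *>::iterator it = byLocation.find(loc);
        if (it == byLocation.end())
            return;
        byValue.erase(it->second);
        byLocation.erase(it);
    }

    // Forward lookup: location -> value (nullptr when unbound).
    Value *lookup(Location loc) const
    {
        std::map<Location, Value *>::const_iterator it = byLocation.find(loc);
        return it == byLocation.end() ? nullptr : it->second;
    }

    // Reverse lookup: value -> location; returns false when val is unbound.
    bool locationOf(const Value *val, Location &out) const
    {
        std::map<const Value *, Location>::const_iterator it = byValue.find(val);
        if (it == byValue.end())
            return false;
        out = it->second;
        return true;
    }

    std::size_t size() const { return byLocation.size(); }

private:
    std::map<Location, Value *> byLocation;
    std::map<const Value *, Location> byValue;
};
```

Keeping one such object per function means memory use scales with the number of locations that function actually touches rather than with the register count of the whole program, which matches the first motivation given in the commit message.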