path: root/src/gallium
Commit message | Author | Age | Files | Lines
* vc4: Actually clear the context's dirty flags. | Eric Anholt | 2014-10-10 | 1 | -0/+1
| | | | | I was trying to skip state updates when !dirty, and suspiciously everything was always dirty.
* vc4: Optimize the other case of SEL_X_Y with a 0 -> SEL_X_0(a). | Eric Anholt | 2014-10-10 | 1 | -1/+23
| | | | Cleans up some output to be more obvious in a piglit test I'm looking at.
* vc4: Optimize out adds of 0. | Eric Anholt | 2014-10-09 | 1 | -0/+26
|
* vc4: Optimize fmul(x, 0) and fmul(x, 1). | Eric Anholt | 2014-10-09 | 1 | -0/+45
| | | | | This was being generated frequently by matrix multiplies of 2 and 3-channel vertex attributes (which have the 0 or 1 loaded in the shader).
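A minimal sketch of the kind of algebraic rewrite this commit describes, assuming a tiny made-up IR. The `sketch_*` names and structs are hypothetical and are not vc4's QIR types. A multiply by a constant 1 becomes a move of the other operand, and a multiply by a constant 0 becomes a move of 0. Folding x * 0 to 0 ignores NaN and infinity inputs, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature IR for illustration only; not the vc4 QIR. */
enum sketch_op { SKETCH_MOV, SKETCH_FMUL };
enum sketch_src_kind { SKETCH_SRC_TEMP, SKETCH_SRC_CONST };

struct sketch_src {
   enum sketch_src_kind kind;
   int index;    /* temp number, meaningful for SKETCH_SRC_TEMP */
   float value;  /* constant value, meaningful for SKETCH_SRC_CONST */
};

struct sketch_inst {
   enum sketch_op op;
   struct sketch_src src[2];
};

static bool
sketch_is_const(struct sketch_src s, float v)
{
   return s.kind == SKETCH_SRC_CONST && s.value == v;
}

/* Turn the instruction into a MOV of the given source. */
static void
sketch_make_mov(struct sketch_inst *inst, struct sketch_src s)
{
   inst->op = SKETCH_MOV;
   inst->src[0] = s;
   inst->src[1] = (struct sketch_src){ SKETCH_SRC_CONST, 0, 0.0f };
}

/* Returns true if the instruction was simplified. */
bool
sketch_opt_fmul(struct sketch_inst *inst)
{
   assert(inst != NULL);
   if (inst->op != SKETCH_FMUL)
      return false;

   for (int i = 0; i < 2; i++) {
      /* fmul(x, 0) -> mov 0 (assumes x is finite). */
      if (sketch_is_const(inst->src[i], 0.0f)) {
         sketch_make_mov(inst, inst->src[i]);
         return true;
      }
      /* fmul(x, 1) -> mov x. */
      if (sketch_is_const(inst->src[i], 1.0f)) {
         sketch_make_mov(inst, inst->src[1 - i]);
         return true;
      }
   }
   return false;
}
```

Sharing one "turn it into a MOV" helper between the two cases mirrors the factoring mentioned in the neighbouring opt_algebraic commit.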
* vc4: Factor out the turn-it-into-a-mov in opt_algebraic. | Eric Anholt | 2014-10-09 | 1 | -10/+12
| | | | This will be used more in the next commits.
* vc4: Eliminate unused texture instructions. | Eric Anholt | 2014-10-09 | 1 | -1/+21
|
* vc4: Dead code eliminate unused SF instructions. | Eric Anholt | 2014-10-09 | 1 | -7/+26
|
* vc4: Prevent copy propagating out the MOVs from r4. | Eric Anholt | 2014-10-09 | 1 | -1/+11
| | | | | | | | | Copy propagating these might result in reading the r4 after some other instruction has written r4. Just prevent all copy propagation of this for now. Fixes bad rendering with upcoming indirect register access support, where the copy propagation was consistently happening across another read.
* vc4: Split the coordinate shader to its own vc4_compiled_shader. | Eric Anholt | 2014-10-09 | 3 | -89/+54
| | | | | | | | | | | Merging VS and CS into the same struct wasn't winning us anything except for not allocating a separate BO (but if we want to pack programs into BOs, we should pack not just those 2 programs together). What it was getting us was a bunch of code duplication about hash table lookups and propagating vc4_compile contents into a vc4_compiled_shader. I was about to make the situation worse with indirect uniform buffer access.
* vc4: Add #defines for the texture uniform fields. | Eric Anholt | 2014-10-09 | 2 | -19/+113
| | | | | | I wanted to make another set of texture uploads for handling reladdr constants, and duplicating all the bitshifting looked like a terrible idea. In the process, this fixes a swap of the s/t texture wrap modes.
* vc4: Initialize undefined temporaries to 0. | Eric Anholt | 2014-10-09 | 1 | -1/+6
| | | | | | | | | | Under the simulator, reading registers before writing them triggers an assertion failure. c->undef gets treated as r0, which will usually be written, but not if it's used in the first instruction. We should definitely not be aborting in this case, and return some sort of undefined value instead. Fixes glsl-user-varying-ff.
* r600g,radeonsi: Always use GTT again for PIPE_USAGE_STREAM buffers | Michel Dänzer | 2014-10-09 | 1 | -1/+3
| | | | | | | | | Putting those in VRAM can cause long pauses due to buffers being moved into / out of VRAM. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84662 Cc: [email protected] Reviewed-by: Alex Deucher <[email protected]>
* vc4: Optimize SF(ITOF(x)) -> SF(x). | Eric Anholt | 2014-10-09 | 1 | -0/+16
| | | | | This is a common production of st_glsl_to_tgsi, because CMP takes a float argument.
* vc4: Add some optimization of FADD(FSUB(0, x)). | Eric Anholt | 2014-10-09 | 1 | -0/+31
| | | | | This is a common production of st_glsl_to_tgsi, which uses negate flags on source arguments to handle subtraction.
* vc4: Mostly fix offset calculation for NPOT mipmap levels. | Eric Anholt | 2014-10-09 | 2 | -3/+23
| | | | | | | | | | | | | | The non-base NPOT levels are stored as POT-aligned images. We get that POT alignment by minifying the POT-aligned base level. This means that level strides are also POT aligned, so we have to tell the rendering mode config that our resource is larger than the actual requested area. Fixes the fbo-generatemipmap-formats NPOT cases. Regresses depthstencil-render-miplevels 273 * -- the texture presentation now works (where it was completely broken before), it looks like there's some overflow of image bounds happening at the lower miplevels.
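The sizing rule stated above (non-base levels come from minifying the POT-aligned base level) can be sketched as follows. This is an illustration under assumptions, not the driver's code: the `sketch_*` helpers are hypothetical, and only the width of a level is modelled, not strides or tiling.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two (v must fit below 2^31). */
static uint32_t
sketch_next_pot(uint32_t v)
{
   assert(v <= (UINT32_C(1) << 31));
   uint32_t p = 1;
   while (p < v)
      p <<= 1;
   return p;
}

/* Halve size once per level, never going below 1. */
static uint32_t
sketch_minify(uint32_t size, unsigned level)
{
   if (level >= 32)
      return 1;
   uint32_t m = size >> level;
   return m ? m : 1;
}

/* Stored width of a mip level: level 0 keeps the requested width,
 * later levels are minified from the POT-aligned base width. */
uint32_t
sketch_level_width(uint32_t base_width, unsigned level)
{
   if (level == 0)
      return base_width;
   return sketch_minify(sketch_next_pot(base_width), level);
}
```

For a 100-texel-wide base level this gives 64 for level 1 rather than 50, which is why the stored resource is larger than the requested area.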
* vc4: Move the mirrored kernel code to a kernel/ directory. | Eric Anholt | 2014-10-09 | 11 | -258/+382
| | | | Now this whole setup matches the kernel's file layout much more closely.
* vc4: Enable LIT lowering in TGSI instead of our own code. | Eric Anholt | 2014-10-08 | 1 | -35/+1
| | | | This brings us the -128/128 clamping on the w component.
* vc4: Fix scalar math opcodes to replicate their result from the X channel. | Eric Anholt | 2014-10-08 | 1 | -4/+16
| | | | | Thanks to robclark for pointing out that I was probably failing to do this when I reported a "bug" in his lowering code.
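As a rough illustration of what "replicate their result from the X channel" means for a scalar opcode such as RCP: the operation reads only the X component of its source and writes the same value to every destination channel enabled in the writemask. The function name and the plain float arrays below are hypothetical and stand in for the driver's real instruction emission.

```c
#include <assert.h>

/* Hypothetical stand-in for a scalar opcode (reciprocal):
 * compute once from src.x, then replicate to all enabled channels. */
void
sketch_emit_rcp(float dst[4], const float src[4], unsigned writemask)
{
   /* Only the low four bits (x, y, z, w) are valid. */
   assert((writemask & ~0xfu) == 0);

   const float result = 1.0f / src[0];

   for (int chan = 0; chan < 4; chan++) {
      if (writemask & (1u << chan))
         dst[chan] = result;
   }
}
```

Channels that are not in the writemask keep their previous contents.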
* ilo: fix rectlist on GEN7+ | Chia-I Wu | 2014-10-09 | 1 | -0/+3
| | | | | | It was broken by 343b014b57ecc5431477e090100e6a26edbda540. Signed-off-by: Chia-I Wu <[email protected]>
* vc4: Add support for two-sided color. | Eric Anholt | 2014-10-08 | 2 | -18/+51
| | | | | | | | | | It's fairly easy, thanks to Rob Clark's lowering code. Fixes two-sided-lighting and 4 vertex-program-two-side testcases, while regressing 8 testcases that involve enabling two-sided color while only initializing one of the two colors in the VS. If you're enabling two sided color, it's of course expected that you really do set up both colors, so this is still an improvement (and when we set up a linker for TGSI, we'll hopefully fix those 8 fails).
* vc4: Enable POW lowering in TGSI instead of our own code. | Eric Anholt | 2014-10-08 | 1 | -11/+1
|
* vc4: Enable DP lowering in TGSI instead of our own code. | Eric Anholt | 2014-10-08 | 1 | -41/+3
|
* vc4: Start using tgsi_lowering for opcodes we haven't supported before. | Eric Anholt | 2014-10-08 | 1 | -1/+15
|
* gallium: Rename freedreno parts of tgsi_lowering.[ch]. | Eric Anholt | 2014-10-08 | 3 | -31/+32
| | | | Acked-by: Rob Clark <[email protected]>
* gallium: Reformat tgsi_lowering.c for the normal style. | Eric Anholt | 2014-10-08 | 2 | -1204/+1201
| | | | Acked-by: Rob Clark <[email protected]>
* gallium: Copy fd_lowering.[ch] to tgsi_lowering.[ch] for code sharing. | Eric Anholt | 2014-10-08 | 2 | -0/+1662
| | | | | | | | Lots of drivers need to transform the weird instructions in TGSI into reasonable scalar ops, and this code can make those translations canonical. Acked-by: Rob Clark <[email protected]>
* vc4: Set unused raddr fields to QPU_R_NOP. | Eric Anholt | 2014-10-08 | 1 | -16/+27
| | | | | The simulator assertion fails if you have a write to a reg and then a read (for example, in the NOP side of an instruction), even if the read isn't used for anything. By setting unused raddrs to NOP, we avoid the problem (since only the physical registers are tracked).
* vc4: Abstract out the field-merging logic for instructions. | Eric Anholt | 2014-10-08 | 1 | -11/+17
| | | | I'm going to be doing the same logic for some more fields next.
* r600: Use DMA transfers in r600_copy_global_buffer | Niels Ole Salscheider | 2014-10-07 | 2 | -17/+43
| | | | | | v2: Do not demote items that are already in the pool Signed-off-by: Niels Ole Salscheider <[email protected]>
* radeonsi: Use dummy pixel shader if compilation of the real shader failed | Michel Dänzer | 2014-10-07 | 3 | -7/+22
| | | | | | | Instead of crashing. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79155#c5 Reviewed-by: Marek Olšák <[email protected]>
* ilo: let shaders determine surface counts | Chia-I Wu | 2014-10-06 | 9 | -202/+267
| | | | | | When a shader needs N surfaces, we should upload N surfaces and not depend on how many are bound. This commit is larger than it should be because we did not export how many surfaces a shader uses before. Signed-off-by: Chia-I Wu <[email protected]>
* ilo: let shaders determine sampler counts | Chia-I Wu | 2014-10-04 | 13 | -87/+98
| | | | | | | When a shader needs N samplers, we should upload N samplers and not depend on how many are bound. Signed-off-by: Chia-I Wu <[email protected]>
* tgsi: change tgsi_shader_info::properties to a one-dimensional array | Marek Olšák | 2014-10-04 | 13 | -24/+23
| | | | | | Reviewed-by: Roland Scheidegger <[email protected]> v2: fix svga too
* radeonsi: set number of userdata SGPRs of GS copy shader to 4 | Marek Olšák | 2014-10-04 | 3 | -10/+23
| | | | | | | It only needs the constant buffer with clip planes and read-write resources for the GS->VS ring and streamout. That's 2 pointers. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: pass the GS shader directly to si_generate_gs_copy_shader | Marek Olšák | 2014-10-04 | 1 | -3/+3
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: set LLVMByValAttribute for all descriptor arrays | Marek Olšák | 2014-10-04 | 1 | -10/+7
| | | | | | I hope this is correct. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: make the vertex shader key smaller | Marek Olšák | 2014-10-04 | 1 | -1/+2
| | | | | | We only support 16 vertex attribs, not 32. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: don't flush shader caches when building PM4 shader states | Marek Olšák | 2014-10-04 | 1 | -8/+0
| | | | | | | | | This is a wrong place to flush caches to say the least. I don't think we need to flush the instruction caches if we don't patch shaders with DMA. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove interp_at_sample from the key, use TGSI_INTERPOLATE_LOC_SAMPLE | Marek Olšák | 2014-10-04 | 3 | -5/+2
| | | | | | | | | st/mesa has the same flag in its shader key, we don't need to do it in the driver anymore. Instead, use TGSI_INTERPOLATE_LOC_SAMPLE, which is what st/mesa sets. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: move geometry shader properties from si_shader to si_shader_selector | Marek Olšák | 2014-10-04 | 4 | -29/+38
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: always compile shaders on demand | Marek Olšák | 2014-10-04 | 1 | -13/+3
| | | | | | | The first compiled shader is sometimes useless, because the key doesn't match the key for the draw call where it's used. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove unused variable si_shader::gs_input_prim | Marek Olšák | 2014-10-04 | 2 | -3/+0
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* tgsi: remove some not so useful variables from tgsi_shader_info | Marek Olšák | 2014-10-04 | 6 | -20/+14
|
* radeonsi: get fs_write_all from tgsi_shader_info directly | Marek Olšák | 2014-10-04 | 3 | -16/+3
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* tgsi: simplify shader properties in tgsi_shader_info | Marek Olšák | 2014-10-04 | 9 | -136/+70
| | | | Use an array of properties indexed by TGSI_PROPERTY_* definitions.
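A small sketch of the "array of properties indexed by property name" layout, with made-up names throughout: the `SKETCH_PROPERTY_*` enum and `sketch_shader_info` struct are hypothetical stand-ins, not the real TGSI_PROPERTY_* definitions or tgsi_shader_info. The idea is that one array replaces a separate struct member per property.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical property names; the last entry is the array size. */
enum sketch_property {
   SKETCH_PROPERTY_GS_INPUT_PRIM,
   SKETCH_PROPERTY_GS_MAX_OUTPUT_VERTICES,
   SKETCH_PROPERTY_FS_WRITE_ALL,
   SKETCH_PROPERTY_COUNT
};

struct sketch_shader_info {
   /* One slot per property, indexed directly by the enum value. */
   unsigned properties[SKETCH_PROPERTY_COUNT];
};

void
sketch_info_init(struct sketch_shader_info *info)
{
   memset(info, 0, sizeof(*info));
}

void
sketch_info_set_property(struct sketch_shader_info *info,
                         enum sketch_property name, unsigned value)
{
   assert(name < SKETCH_PROPERTY_COUNT);
   info->properties[name] = value;
}
```

A consumer then reads `info.properties[SKETCH_PROPERTY_FS_WRITE_ALL]` instead of a dedicated field, so adding a property only needs a new enum entry.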
* radeonsi: get tgsi_shader_info only once before compilation | Marek Olšák | 2014-10-04 | 3 | -21/+16
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* gallium/util: add util_bitcount64 | Marek Olšák | 2014-10-04 | 1 | -0/+12
| | | | | | | | I'll need this in radeonsi. v2: use __builtin_popcountll if available Reviewed-by: Michel Dänzer <[email protected]>
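A hedged sketch of a 64-bit population count that uses `__builtin_popcountll` when the compiler provides it, as the v2 note describes. This is not the actual util_bitcount64 source: the `sketch_*` names, the `__GNUC__` check, and the portable fallback loop are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Portable fallback: clear the lowest set bit until none remain. */
static unsigned
sketch_bitcount64_portable(uint64_t n)
{
   unsigned count = 0;
   while (n) {
      n &= n - 1;
      count++;
   }
   return count;
}

unsigned
sketch_bitcount64(uint64_t n)
{
#if defined(__GNUC__)
   /* GCC and Clang builtin; cross-checked against the fallback
    * in builds where assertions are enabled. */
   unsigned result = (unsigned)__builtin_popcountll(n);
   assert(result == sketch_bitcount64_portable(n));
   return result;
#else
   return sketch_bitcount64_portable(n);
#endif
}
```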
* radeonsi: fix CS tracing and remove excessive CS dumping | Marek Olšák | 2014-10-04 | 3 | -35/+25
|
* gk110/ir: add dnz flag emission for fmul/fmad | Ilia Mirkin | 2014-10-03 | 1 | -0/+4
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2 10.3" <[email protected]>
* gm107/ir: add dnz emission for fmul | Ilia Mirkin | 2014-10-03 | 1 | -1/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.3" <[email protected]>