summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* radeon/uvd: fix feedback buffer handling v2Christian König2014-02-041-12/+28
| | | | | | | | | | | Without the correct feedback buffer size UVD runs into an error on each frame, reducing the maximum FPS. v2: fixing Michels comments Signed-off-by: Christian König <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Cc: "10.1" "10.0" "9.2" <[email protected]>
* freedreno: enabling binning and opt by defaultRob Clark2014-02-033-16/+11
| | | | | | | | | Hw binning pass doesn't seem to have broken anything. And optimizing compiler fixes a lot of shaders and doesn't seem to break anything. So re-org slightly FD_MESA_DEBUG params and make both hw binning and optimizer enabled by default. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: new compilerRob Clark2014-02-0317-209/+2777
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new compiler generates a dependency graph of instructions, including a few meta-instructions to handle PHI and preserve some extra information needed for register assignment, etc. The depth pass assigned a weight/depth to each node (based on sum of instruction cycles of a given node and all it's dependent nodes), which is used to schedule instructions. The scheduling takes into account the minimum number of cycles/slots between dependent instructions, etc. Which was something that could not be handled properly with the original compiler (which was more of a naive TGSI translator than an actual compiler). The register assignment is currently split out as a standalone pass. I expect that it will be replaced at some point, once I figure out what to do about relative addressing (which is currently the only thing that should cause fallback to old compiler). There are a couple new debug options for FD_MESA_DEBUG env var: optmsgs - enable debug prints in optimizer optdump - dump instruction graph in .dot format, for example: http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot.png http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot At this point, thanks to proper handling of instruction scheduling, the new compiler fixes a lot of things that were broken before, and does not appear to break anything that was working before[1]. So even though it is not finished, it seems useful to merge it in it's current state. [1] Not merged in this commit, because I'm not sure if it really belongs in mesa tree, but the following commit implements a simple shader emulator, which I've used to compare the output of the new compiler to the original compiler (ie. run it on all the TGSI shaders dumped out via ST_DEBUG=tgsi with various games/apps): https://github.com/freedreno/mesa/commit/163b6306b1660e05ece2f00d264a8393d99b6f12 Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: split out old compilerRob Clark2014-02-036-1/+1531
| | | | | | | | For the time being, keep old compiler as fallback for things that the new compiler does not support yet. Split out as it's own commit to make the later new-compiler commits easier to follow. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: prepare for new compilerRob Clark2014-02-039-146/+308
| | | | | | Shuffle things around to prepare for new compiler. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: remove useless reg tracking in disasm-a3xxRob Clark2014-02-031-174/+0
| | | | | | | Not really used for anything anymore. So strip it out and avoid conflicting symbols with upcoming new-compiler. Signed-off-by: Rob Clark <[email protected]>
* draw: fix incorrect color of flat-shaded clipped linesBrian Paul2014-02-031-1/+12
| | | | | | | | | | | | When we clipped a line weren't copying the provoking vertex color to the second vertex. We also weren't checking for first vs. last provoking vertex. Fixes failures found with the new piglit line-flat-clip-color test. Cc: "10.0, 10.1" <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallium/auxiliary/indices: replace free() with FREE()Brian Paul2014-02-031-1/+1
| | | | | | | | To match the CALLOC_STRUCT() call. Cc: "10.0, 10.1" <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* svga: check shader size against max command buffer sizeBrian Paul2014-02-032-12/+49
| | | | | | | If the shader is too large, plug in a dummy shader. This patch also reworks the existing dummy shader code. Reviewed-by: Jose Fonseca <[email protected]>
* svga: refactor some shader codeBrian Paul2014-02-0311-76/+171
| | | | | | | Put common code in new svga_shader.c file. Considate separate vertex/ fragment shader ID generation. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: fix opcode and function nestingZack Rusin2014-02-032-157/+317
| | | | | | | | | | | | | | gallivm soa code supported only a single level of nesting for control flow opcodes (if, switch, loops...) but the d3d10 spec clearly states that those are nested within functions. To support nesting of conditionals inside functions we need to store the nesting data inside function contexts and keep a stack of those. Furthermore we make sure that if nesting for subroutines is deeper than 32 then we simply ignore all subsequent 'call' invocations. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: add a few const qualifiersBrian Paul2014-02-022-4/+4
| | | | Trivial.
* translate: reindent translate_sse.cBrian Paul2014-02-021-472/+474
| | | | Trivial.
* freedreno: better manage our WFI'sRob Clark2014-02-017-24/+36
| | | | | | | | Updates to non-banked registers, CP_LOAD_STATE, etc, need a WFI if there is potentially pending rendering. Track this better, and add fd_wfi() calls everywhere that might potentially need CP_WAIT_FOR_IDLE. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: add logicopRob Clark2014-02-013-6/+27
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: handle frag z writeRob Clark2014-02-017-25/+53
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: resync generated headersRob Clark2014-02-014-9/+39
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: fix const confusionRob Clark2014-02-012-9/+9
| | | | | | | | | | | | Gallium can leave const buffers bound above what is used by the current shader. Which can have a couple bad effects: 1) write beyond const space assigned, which can trigger HLSQ lockup 2) double emit of immed consts, first with bound const buffer vals followed by with actual immed vals. This seems to be a sort of undefined condition. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: compiler cleanupsRob Clark2014-02-017-145/+198
| | | | | | Drop color/pos/psize_regid, plus a few compiler and IR cleanups. Signed-off-by: Rob Clark <[email protected]>
* freedreno/compiler/a3xx: remove lowered instructionsRob Clark2014-02-011-354/+0
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: add tgsi lowering passRob Clark2014-02-014-2/+1229
| | | | | | | | | | | | | | | | Currently lowers the following instructions: DST, XPD, SCS, LRP, FRC, POW, LIT, EXP, LOG, DP4, DP3, DPH, DP2 translating these into equivalent simpler TGSI instructions. This probably should be moved to util so other drivers can use it, but just adding under freedreno for now so that I can clear out a lot of the lowering code in a3xx compiler before beginning to add new compiler. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: add CLAMPRob Clark2014-02-011-7/+24
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: various fixesRob Clark2014-02-011-14/+34
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: ctx should hold ref to devRob Clark2014-02-017-3/+9
| | | | | | | | The ctx should hold ref to dev to avoid problems if screen is destroyed before ctx. Doesn't really fix the egl/glx issues, but at least it prevents things from getting much worse. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add prims-emitted driver queryRob Clark2014-02-011-0/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* llvmpipe: fix denorm handling for r11g11b10_float format when blendingRoland Scheidegger2014-01-311-2/+15
| | | | | | | | The code re-enabling denorms for small float formats did not recognize this format due to format handling hacks (mainly, the lp_type doesn't have the floating bit set). Reviewed-by: Jose Fonseca <[email protected]>
* st/dri: Fix tests for no draw/read buffers in dri_make_current()Michel Dänzer2014-01-311-2/+2
| | | | | | Fixes piglit glx/GLX_ARB_create_context/current with no framebuffer. Reviewed-by: Marek Olšák <[email protected]>
* r600g: Removed unnecessary positivity check for unsigned int variable.Siavash Eliasi2014-01-311-1/+1
| | | | Signed-off-by: Marek Olšák <[email protected]>
* st/dri: Allow creating OpenGL 3.3 core contextsMichel Dänzer2014-01-301-1/+1
| | | | | | Enables OpenGL 3.3 piglit tests. Reviewed-by: Marek Olšák <[email protected]>
* freedreno: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64Ian Romanick2014-01-291-0/+3
| | | | | | | | Allocations actually have page alignment, but 64 is still a reasonable value. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* ilo: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64Siavash Eliasi2014-01-291-1/+1
| | | | | | | | | | Ian manually ran the map_buffer_range* tests and the arb_map_buffer_alignment-* tests, but he did not do a full piglit run. v2 (idr): Use 64 instead of 4096 Tested-by: Ian Romanick <[email protected]> Reviewed-by: Chia-I Wu <[email protected]>
* svga: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64Siavash Eliasi2014-01-291-1/+2
| | | | | | | v2: Fixed setting switch cases prior to PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT incorrectly. Reviewed-by: Ian Romanick <[email protected]>
* i915g: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64Siavash Eliasi2014-01-291-1/+3
| | | | | v2: Fixed setting switch cases prior to PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT incorrectly.
* i915g: Use alignment of 64 instead of 16 for buffer allocationSiavash Eliasi2014-01-291-1/+1
| | | | Reviewed-by: Ian Romanick <[email protected]>
* llvmpipe: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64Siavash Eliasi2014-01-291-1/+2
| | | | | | | v2: Fixed setting switch cases prior to PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT incorrectly. Reviewed-by: Ian Romanick <[email protected]>
* llvmpipe: Use alignment of 64 instead of 16 for buffer allocationSiavash Eliasi2014-01-291-3/+3
| | | | | | v2: Changed allocation alignment of llvmpipe_displaytarget_layout. Reviewed-by: Ian Romanick <[email protected]>
* softpipe: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64Siavash Eliasi2014-01-291-1/+2
| | | | | | | v2: Fixed setting switch cases prior to PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT incorrectly. Reviewed-by: Ian Romanick <[email protected]>
* softpipe: Use alignment of 64 instead of 16 for buffer allocationSiavash Eliasi2014-01-291-2/+2
| | | | | | v2: Changed allocation alignment in softpipe_displaytarget_layout. Reviewed-by: Ian Romanick <[email protected]>
* i915g: support more PIPE_CAPsStéphane Marchesin2014-01-281-3/+6
|
* radeonsi: Put GS ring buffer descriptors with streamout buffer descriptorsMichel Dänzer2014-01-295-84/+115
| | | | | | And mark the constant buffers as read only for the GPU again. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Enable OpenGL 3.3Michel Dänzer2014-01-291-3/+5
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Geometry shader micro-optimizationsMichel Dänzer2014-01-291-12/+10
| | | | | | | | Move parameter loads out of loops, and use the instruction offset instead of a VGPR for the vertex attribute offset when writing to the ESGS ring buffer. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: We don't support indirect addressing of geometry shader inputsMichel Dänzer2014-01-291-0/+4
| | | | | | Fixes piglit spec/glsl-1.50/execution/geometry/dynamic_input_array_index Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Pass VS resource descriptors to the HW ES shader stage as wellMichel Dänzer2014-01-296-34/+58
| | | | | | | This makes sure constants and samplers work in the vertex shader even when a geometry shader is active. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Fix streamout from geometry shaderMichel Dänzer2014-01-291-10/+27
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Simplify shader PM4 state handlingMichel Dänzer2014-01-293-61/+21
| | | | | | | | | | Just always bind the current states before drawing. Besides the simplification, as a bonus this makes sure the VS hardware shader stage always uses the GS copy shader when a geometry shader is active, fixing a number of GS related piglit tests. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Properly match ES outputs to GS inputsMichel Dänzer2014-01-291-5/+16
| | | | | | Fixes piglit vs-gs-arrays-within-blocks-pass. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Really dump TGSI code before any TGSI->LLVM conversion attemptMichel Dänzer2014-01-291-8/+8
| | | | | | While we're at it, use the local variable 'sel'. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Also export clip distances with geometry shaderMichel Dänzer2014-01-292-5/+9
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Take GS into account for VS state in more placesMichel Dänzer2014-01-293-2/+14
| | | | Reviewed-by: Marek Olšák <[email protected]>