Commit message  [Author, Age, Files, Lines]
* loop_unroll: unroll loops with (lowered) breaks  [Luca Barbieri, 2010-09-13, 1 file, -4/+89]
|   If the loop ends with an if with one break or in a single break unroll
|   it. Loops that end with a continue will have that continue removed by
|   the redundant jump optimizer. Likewise loops that end with an
|   if-statement with a break at the end of both branches will have the
|   break pulled out after the if-statement.
|
|   Loops of the form
|
|      for (...) {
|         do_something1();
|         if (cond) {
|            do_something2();
|            break;
|         } else {
|            do_something3();
|         }
|      }
|
|   will be unrolled as
|
|      do_something1();
|      if (cond) {
|         do_something2();
|      } else {
|         do_something3();
|         do_something1();
|         if (cond) {
|            do_something2();
|         } else {
|            do_something3();
|            /* Repeat inserting iterations here.*/
|         }
|      }
|
|   ir_lower_jumps can guarantee that all loops are put in this form
|   and thus all loops are now potentially unrollable if an upper bound on
|   the number of iterations can be found.
|
|   Signed-off-by: Ian Romanick <[email protected]>
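The equivalence claimed in the commit message above can be checked with a small stand-alone sketch. This is not Mesa code: the `Trace` helper, the two-iteration bound, and the `run_loop` / `run_unrolled` names are assumptions made only for illustration. Each `do_something*()` call appends a digit to a log so the two forms can be compared.

```cpp
#include <string>

// Hypothetical stand-in for the do_something*() calls and cond in the
// commit message: every call is recorded in a log string.
struct Trace {
    std::string log;
    int calls = 0;   // number of times cond() has been evaluated
    int break_at;    // cond() returns true on this evaluation (1-based)
    explicit Trace(int n) : break_at(n) {}
    void do_something1() { log += "1"; }
    void do_something2() { log += "2"; }
    void do_something3() { log += "3"; }
    bool cond() { return ++calls == break_at; }
};

// Original form: a loop with a known bound (two iterations, assumed here)
// whose body ends in an if-statement with a single break.
std::string run_loop(int break_at) {
    Trace t(break_at);
    for (int i = 0; i < 2; ++i) {
        t.do_something1();
        if (t.cond()) {
            t.do_something2();
            break;
        } else {
            t.do_something3();
        }
    }
    return t.log;
}

// Unrolled form: the second iteration is nested inside the else branch
// of the first, so no loop and no break remain.
std::string run_unrolled(int break_at) {
    Trace t(break_at);
    t.do_something1();
    if (t.cond()) {
        t.do_something2();
    } else {
        t.do_something3();
        t.do_something1();
        if (t.cond()) {
            t.do_something2();
        } else {
            t.do_something3();
        }
    }
    return t.log;
}
```

Both functions produce the same log whether the condition fires on the first iteration, on the second, or not at all within the bound.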
* glsl2: Add pass to remove redundant jumps  [Ian Romanick, 2010-09-13, 6 files, -1/+118]
|
* glsl: Explain file naming convention  [Ian Romanick, 2010-09-13, 1 file, -0/+12]
|
* loop_controls: fix analysis of already analyzed loops  [Luca Barbieri, 2010-09-13, 1 file, -1/+8]
|   The loop_controls pass didn't look at the counter values it put in
|   ir_loop on previous iterations, so while the first iteration worked,
|   subsequent ones couldn't determine max_iterations.
* i965: Request that returns be lowered in shader main  [Ian Romanick, 2010-09-13, 1 file, -0/+1]
|   Fixes piglit tests glsl-vs-main-return and glsl-fs-main-return.
* glsl: call ir_lower_jumps according to compiler options  [Luca Barbieri, 2010-09-13, 1 file, -0/+2]
|
* glsl: add continue/break/return unification/elimination pass (v2)  [Luca Barbieri, 2010-09-13, 6 files, -250/+548]
|   Changes in v2:
|   - Base class renamed to ir_control_flow_visitor
|   - Tried to comply with coding style
|
|   This is a new pass that supersedes ir_if_return and "lowers" jumps
|   to if/else structures.
|
|   Currently it causes no regressions on softpipe and nv40, but I'm not
|   sure whether the piglit glsl tests are thorough enough, so consider
|   this experimental.
|
|   It can be asked to:
|   1. Pull jumps out of ifs where possible
|   2. Remove all "continue"s, replacing them with an "execute flag"
|   3. Replace all "break"s with a single conditional one at the end of
|      the loop
|   4. Replace all "return"s with a single return at the end of the
|      function, for the main function and/or other functions
|
|   This gives several great benefits:
|   1. All functions can be inlined after this pass
|   2. nv40 and other pre-DX10 chips without "continue" can be supported
|   3. nv30 and other pre-DX10 chips with no control flow at all are
|      better supported
|
|   Note that for full effect we should also teach the unroller to unroll
|   loops with a fixed maximum number of iterations but with the canonical
|   conditional "break" that this pass will insert if asked to.
|
|   Continues are lowered by adding a per-loop "execute flag", initialized
|   to TRUE, that when cleared inhibits all execution until the end of the
|   loop.
|
|   Breaks are lowered to continues, plus setting a "break flag" that is
|   checked at the end of the loop and triggers the unique "break".
|
|   Returns are lowered to breaks/continues, plus adding a "return flag"
|   that causes loops to break again out of their enclosing loops until
|   all the loops are exited: then the "execute flag" logic will ignore
|   everything until the end of the function.
|
|   Note that "continue" and "return" can also be implemented by adding a
|   dummy loop and using break. However, this is bad for hardware with
|   limited nesting depth, and prevents further optimization, and thus is
|   not currently performed.
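The "execute flag" and "break flag" lowering described above can be illustrated with an ordinary C++ loop. This is only a hand-written sketch of the idea, not output of the pass: the function names and the choice to reset the flags at the top of each iteration are assumptions.

```cpp
#include <cstddef>
#include <vector>

// Source form: a loop containing both a "continue" and a "break".
int sum_original(const std::vector<int>& v) {
    int sum = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (v[i] < 0) continue;  // skip negative entries
        if (v[i] == 0) break;    // stop at the first zero
        sum += v[i];
    }
    return sum;
}

// Hand-lowered form: no "continue", and a single conditional "break"
// at the very end of the loop body.
int sum_lowered(const std::vector<int>& v) {
    int sum = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        bool execute_flag = true;  // cleared to inhibit the rest of the body
        bool break_flag = false;   // checked once, at the end of the body

        if (v[i] < 0) {
            execute_flag = false;      // lowered "continue"
        }
        if (execute_flag) {
            if (v[i] == 0) {
                break_flag = true;     // lowered "break": set the flag...
                execute_flag = false;  // ...then behave like a "continue"
            }
        }
        if (execute_flag) {
            sum += v[i];
        }
        if (break_flag) break;  // the unique conditional break
    }
    return sum;
}
```

Because the only remaining jump is an if-guarded break at the end of the body, the lowered loop has the shape that the loop_unroll commit above says it can unroll.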
* glsl: add ir_control_flow_visitor  [Luca Barbieri, 2010-09-13, 1 file, -0/+17]
|   This is just a subclass of ir_visitor with empty implementations of
|   all the visit methods for non-control flow nodes.
|
|   Used to avoid duplicating that in ir_visitor subclasses.
|
|   ir_hierarchical_visitor is another way to solve this, but is less
|   natural for some applications.
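The pattern described above (a visitor base class whose subclass stubs out every non-control-flow node) looks roughly like the following. This is a miniature, made-up IR for illustration only: `Visitor`, `ControlFlowVisitor`, `Assignment`, `Call`, `Loop`, `If`, `LoopCounter`, and `count_loops` are hypothetical names, not Mesa's classes.

```cpp
#include <vector>

struct Assignment;
struct Call;
struct Loop;
struct If;

// Plays the role of the full visitor: one pure virtual method per node type.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(Assignment&) = 0;
    virtual void visit(Call&) = 0;
    virtual void visit(Loop&) = 0;
    virtual void visit(If&) = 0;
};

// Plays the role of the control-flow visitor: empty implementations for
// the non-control-flow nodes, so concrete passes need not repeat them.
struct ControlFlowVisitor : Visitor {
    using Visitor::visit;
    void visit(Assignment&) override {}
    void visit(Call&) override {}
};

struct Node {
    virtual ~Node() = default;
    virtual void accept(Visitor& v) = 0;
};
struct Assignment : Node { void accept(Visitor& v) override { v.visit(*this); } };
struct Call : Node { void accept(Visitor& v) override { v.visit(*this); } };
struct Loop : Node {
    std::vector<Node*> body;
    void accept(Visitor& v) override { v.visit(*this); }
};
struct If : Node {
    std::vector<Node*> then_body, else_body;
    void accept(Visitor& v) override { v.visit(*this); }
};

// A concrete pass only has to handle the control-flow nodes.
struct LoopCounter : ControlFlowVisitor {
    using ControlFlowVisitor::visit;
    int loops = 0;
    void visit(Loop& l) override {
        ++loops;
        for (Node* n : l.body) n->accept(*this);
    }
    void visit(If& i) override {
        for (Node* n : i.then_body) n->accept(*this);
        for (Node* n : i.else_body) n->accept(*this);
    }
};

// Counts every loop reachable from root, including nested ones.
int count_loops(Node& root) {
    LoopCounter counter;
    root.accept(counter);
    return counter.loops;
}
```

Without the intermediate class, `LoopCounter` would have to spell out the empty `Assignment` and `Call` overloads itself, which is the duplication the commit says it avoids.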
* llvmpipe: Fix non SSE2 builds.  [José Fonseca, 2010-09-13, 1 file, -2/+2]
|   Should fix fdo 30168.
* r300g/swtcl: unlock VBO after draw_flush  [Marek Olšák, 2010-09-13, 1 file, -4/+1]
|   https://bugs.freedesktop.org/show_bug.cgi?id=29901
|   https://bugs.freedesktop.org/show_bug.cgi?id=30132
* llvmpipe: Change asm to __asm__.  [Witold Baryluk, 2010-09-13, 1 file, -3/+3]
|   According to the gcc documentation both are equivalent; the second is
|   preferred because the first can conflict with existing symbols.
|
|   Signed-off-by: José Fonseca <[email protected]>
* EGL DRI2: 0xa011 is Pineview not Ironlake  [Jesse Barnes, 2010-09-13, 1 file, -1/+1]
|   Point about needing a better way to do this validated.
* r600c: const buffer sizes must be a multiple of 16 consts  [Alex Deucher, 2010-09-13, 3 files, -29/+21]
|   This applies to r6xx/r7xx/evergreen
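One way to satisfy a "multiple of 16 consts" constraint is to round the constant count up before sizing the buffer. The sketch below only illustrates that arithmetic: the commit does not say how the driver meets the constraint, and `kConstGranularity`, `kBytesPerConst`, `align_const_count`, and `const_buffer_bytes` are hypothetical names. It assumes each constant is a float4 of four 4-byte floats.

```cpp
#include <cstdint>

// Assumption: the constraint is met by rounding the count up.
constexpr std::uint32_t kConstGranularity = 16;

// Assumption: one constant is a float4, i.e. 4 components * 4 bytes.
constexpr std::uint32_t kBytesPerConst = 4 * 4;

// Round a constant count up to the next multiple of 16.
constexpr std::uint32_t align_const_count(std::uint32_t count) {
    return (count + kConstGranularity - 1) / kConstGranularity * kConstGranularity;
}

// Size in bytes of a buffer holding the rounded-up number of constants.
constexpr std::uint32_t const_buffer_bytes(std::uint32_t count) {
    return align_const_count(count) * kBytesPerConst;
}
```

With these assumptions, 17 constants round up to 32, and 256 float4 constants occupy 4096 bytes.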
* EGL DRI2: add PCI ID for Ironlake mobile  [Jesse Barnes, 2010-09-13, 1 file, -0/+1]
|   Allows KMS EGL driver to load. We need a better way of doing this.
* r600c/eg: remove obsolete comment  [Alex Deucher, 2010-09-13, 1 file, -2/+0]
|
* r600c/eg: remove unused emit timestamp function  [Alex Deucher, 2010-09-13, 1 file, -8/+0]
|
* r600c/eg: emit CB_BLEND_ALPHA with the other blend values  [Alex Deucher, 2010-09-13, 1 file, -5/+5]
|   saves a few dwords
* r600c: remove redundant state emit on evergreen  [Alex Deucher, 2010-09-13, 1 file, -17/+0]
|   r700start3d already emits the context control packets
* mesa: Revert accidentally committed vertex code chunk  [Kristian Høgsberg, 2010-09-13, 1 file, -2/+0]
|
* r600c: eg: fix typo  [Andre Maasikas, 2010-09-13, 1 file, -1/+1]
|   probably copy/paste error
* r600c: eg: 256 float4 constants may need more than 256 bytes  [Andre Maasikas, 2010-09-13, 2 files, -2/+2]
|
* r600c: eg - fix uninitialized variable  [Andre Maasikas, 2010-09-13, 1 file, -0/+2]
|
* glx: Don't destroy DRI2 drawables for legacy glx drawables  [Kristian Høgsberg, 2010-09-13, 2 files, -1/+13]
|   For GLX 1.3 drawables, we can destroy the DRI2 drawable when the GLX
|   drawable is destroyed. However, for legacy drawables, there is no good
|   way of knowing when the application is done with it, so we just let
|   the DRI2 drawable linger on the server. The server will destroy the
|   DRI2 drawable when it destroys the X drawable or the client exits
|   anyway.
|
|   https://bugs.freedesktop.org/show_bug.cgi?id=30109
* r300g: fix SWTCL  [Marek Olšák, 2010-09-13, 4 files, -41/+99]
|   https://bugs.freedesktop.org/show_bug.cgi?id=29901
* llvmpipe: Unbreak rasterization on 64bit.  [José Fonseca, 2010-09-13, 1 file, -24/+22]
|
* gallium: Change the resource_copy_region semantics to allow copies between different yet compatible formats  [José Fonseca, 2010-09-13, 1 file, -3/+5]
* r600g: evergreen fixup dsa state for running query.  [Dave Airlie, 2010-09-13, 2 files, -3/+2]
|   evergreen is always the same as r700 here.
* r600c: remove stray unmap call  [Andre Maasikas, 2010-09-13, 1 file, -1/+0]
|   no idea how/why it got there
* llvmpipe: use gcc asm only with gcc  [José Fonseca, 2010-09-13, 1 file, -1/+1]
|
* r300g: print unassigned FS inputs for DBG_RS  [Marek Olšák, 2010-09-13, 1 file, -0/+9]
|
* r300g: fix map_buffer  [Marek Olšák, 2010-09-13, 1 file, -4/+17]
|   https://bugs.freedesktop.org/show_bug.cgi?id=30145
* r300/compiler: fix warnings  [Marek Olšák, 2010-09-13, 2 files, -2/+3]
|
* r300g: add new debug options for dumping scissor regs and disabling CBZB clear  [Marek Olšák, 2010-09-13, 5 files, -3/+16]
|
* r300g: skip rendering if CS space validation fails  [Marek Olšák, 2010-09-13, 3 files, -52/+73]
|   radeon_cs_space_check flushes the pipe context on failure, retries the
|   validation, and returns -1 if it fails again. At that point, there is
|   nothing we can do, so let's skip draw operations instead of getting
|   stuck in an infinite loop.
|
|   This code path ideally should never be hit.
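The control flow described above (validate, flush and retry once, then give up and skip the draw) can be modelled in a few lines. This is a stand-alone model, not driver code: `space_check`, `draw_or_skip`, and the callback parameters are hypothetical stand-ins, and the real radeon_cs_space_check has a different signature that is not reproduced here.

```cpp
#include <functional>

// Model of the check as the commit describes it: try once; on failure
// flush and retry; report -1 if the second attempt also fails.
int space_check(const std::function<bool()>& validate,
                const std::function<void()>& flush) {
    if (validate()) return 0;
    flush();
    return validate() ? 0 : -1;
}

// Caller side: on -1 the draw is skipped instead of retrying forever.
// Returns true if the draw was emitted, false if it was skipped.
bool draw_or_skip(const std::function<bool()>& validate,
                  const std::function<void()>& flush,
                  const std::function<void()>& emit_draw) {
    if (space_check(validate, flush) != 0) {
        return false;  // nothing more can be done; drop this draw
    }
    emit_draw();
    return true;
}
```

The point of the sketch is that the number of validation attempts is bounded at two, so a state that can never fit costs one dropped draw instead of a hang.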
* r300g: remove u_upload_flush from r300_draw_arrays  [Marek Olšák, 2010-09-13, 1 file, -1/+0]
|   This is probably a leftover and is unnecessary, since we flush
|   u_upload_mgr in r300_flush.
* nvfx: Remove unused variables.  [Vinson Lee, 2010-09-12, 3 files, -3/+1]
|
* nvfx: Move declaration before code.  [Vinson Lee, 2010-09-12, 1 file, -6/+12]
|   Fixes SCons build.
* llvmpipe: introduce tri_3_4 for tiny triangles  [Keith Whitwell, 2010-09-12, 6 files, -46/+127]
|
* llvmpipe: allow tri_3_16 at any 4-aligned location within a tile  [Keith Whitwell, 2010-09-12, 1 file, -27/+50]
|   Doesn't require 16-alignment, so it catches more cases.
* llvmpipe: refactor tri_3_16  [Keith Whitwell, 2010-09-12, 1 file, -17/+47]
|   Keep step array as a set of four m128i's and reuse throughout the
|   rasterization.
* llvmpipe: pass linear masks to fragment shader  [Keith Whitwell, 2010-09-12, 3 files, -73/+23]
|   Fragment shader can extract the correct bits for each quad.
* llvmpipe: fix warnings on both 32 and 64 bit builds  [Keith Whitwell, 2010-09-12, 1 file, -3/+3]
|
* llvmpipe: fix weird performance regression in isosurf  [Keith Whitwell, 2010-09-12, 1 file, -6/+8]
|   I really don't understand the mechanism behind this, but it seems like
|   the way data blocks for a scene are malloced, and in particular
|   whether we treat them as a stack or a queue, and whether we retain the
|   most recently allocated or least recently allocated, has a real effect
|   (~5%) on isosurf framerates...
|
|   This is probably specific to my distro or even just my machine, but
|   nonetheless, it's nicer not to see the framerates go in the wrong
|   direction.
* pb: Fix the build, and add notes.  [José Fonseca, 2010-09-12, 5 files, -5/+14]
|
* llvmpipe: Only generate the whole shader specialization for opaque shaders.  [José Fonseca, 2010-09-12, 1 file, -1/+7]
|   If not opaque, then the color buffer will have to be read anyway,
|   therefore the specialization is pointless.
* pb: add void * for flush ctx to mapping functions  [Dave Airlie, 2010-09-12, 11 files, -28/+24]
|   If the buffer we are attempting to map is referenced by the
|   unsubmitted command stream for this context, we need to flush the
|   command stream. However, to do that we need to be able to access the
|   context at the lowest level map function. Currently we set the buffer
|   in the toplevel map, but this is racy between contexts (we probably
|   have a lot more issues than that).
|
|   I'll look into a proper solution as suggested by jrfonseca when I get
|   some time.
* nv30: fix breakage due to 10 texcoord support on nv40  [Luca Barbieri, 2010-09-11, 1 file, -2/+2]
|
* Add missing files to the tarball file lists.  [Chia-I Wu, 2010-09-12, 1 file, -1/+6]
|
* mesa: Fix depend.es[12] generation when LLVM is enabled.  [Chia-I Wu, 2010-09-12, 2 files, -29/+27]
|   "llvm-config --cflags" outputs -f options, which conflict with
|   makedepend. Clean up compiler flags and append LLVM_CFLAGS to the new
|   xxx_CFLAGS instead of xxx_CPPFLAGS, where xxx may be MESA, ES1, or ES2.
* r600g: Undo bo placement change.  [Tilman Sauerbeck, 2010-09-11, 1 file, -1/+1]
|   This reverts a part of e795ca8f3175fa6fd97b6b2ef2775e3f8803012a
|   that causes artefacts and a performance drop.
|
|   Signed-off-by: Tilman Sauerbeck <[email protected]>