| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
The existing code used RNDD, which rounds down, rather than toward zero.
|
|
|
|
| |
Fixes piglit test glsl-fs-ceil.
|
|
|
|
|
|
|
|
|
|
| |
This cuts usually 2 out of 3 instructions for flag reg generation (if
statements, conditional assignment) by producing the conditional mod
in the expression representing the boolean value.
Fixes glsl-fs-vec4-indexing-temp-dst-in-nested-loop-combined (register
allocation no longer fails for the conditional generation
proliferation)
|
|
|
|
|
|
| |
This will be a place to peephole comparisions directly to the flag
regs, and for now avoids using MOV with conditional mod on gen6, which
is now illegal.
|
|
|
|
| |
Improves nexuiz performance 0.91% (+/- 0.54%, n=8)
|
| |
|
|
|
|
| |
So far, I've only seen this be a valgrind warning and not a real failure.
|
|
|
|
| |
This reverts commit 73dab75b4165f7d2214a68d4ba8e3cb7aab9b4ac.
|
|
|
|
|
|
| |
Don't use r0 for FF_SYNC dest reg on Sandybridge, which would
smash FFID field in GS payload, that cause later URB write fail.
Also not use r0 in any URB write requiring allocate.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
__GLcontextModes is always only used as an implementation internal struct
at this point and we shouldn't install glcore.h anymore. Anything that
needs __GLcontextModes should just include the struct in its headers files
directly.
|
|
|
|
|
|
| |
Fixes this GCC warning.
tdfx_texman.c: In function 'tdfxTMMoveOutTM_NoLock':
tdfx_texman.c:897: warning: unused variable 'shared'
|
|
|
|
|
|
|
| |
Fixes this GCC warning.
r300_state.c: In function 'r300InvalidateState':
r300_state.c:2247: warning: 'hw_format' may be used uninitialized in this function
r300_state.c:2247: note: 'hw_format' was declared here
|
|
|
|
|
|
|
|
|
| |
There was a check to only do the rebase if we didn't have everything
in VBOs, but nexuiz apparently hands us a mix of VBOs and arrays,
resulting in blocking on the GPU to do a rebase.
Improves nexuiz 800x600, high-settings performance on my Ironlake 41%
(+/- 1.3%), from 14.0fps to 19.7fps.
|
|
|
|
|
|
| |
The format selection of the CopyTexSubImage is pretty bogus still, but
this at least avoids software fallbacks in nexuiz, bringing
performance from 7.5fps to 12.8fps on my machine.
|
|
|
|
| |
Fixes glsl-fs-i2b.
|
|
|
|
|
| |
Useful to amortize the command submission/reloc overhead (e.g. etracer
goes from 72 to 109 FPS on nv4b).
|
| |
|
|
|
|
|
|
|
|
|
|
| |
It's now much more correct for gen6 than the old backend, with just 2
regressions I've found (one of which is common with pre-gen6 and will
be fixed by an array splitting IR pass).
This does leave the old Mesa IR backend getting used still when we
don't have GLSL IR, but the plan is to get GLSL IR input to the driver
for the ARB programs and fixed function by the next release.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pre-gen6, you could mix int and float just fine. Now, you get goofy
results.
Fixes:
glsl-arb-fragment-coord-conventions
glsl-fs-fragcoord
glsl-fs-if-greater
glsl-fs-if-greater-equal
glsl-fs-if-less
glsl-fs-if-less-equal
|
|
|
|
|
| |
There was code to do this for pre-gen6 already, this just enables it
for gen6 as well.
|
|
|
|
|
|
|
|
|
|
| |
This is a hw requirement in math args. This also is inefficient, as
we're calculating the same result 8 times, but then we've been doing
that on pre-gen6 as well. If we're doing math on uniforms, though,
we'd probably be better served by having some sort of mechanism for
precalculating those results into another uniform value to use.
Fixes 7 piglit math tests.
|
| |
|
| |
|
|
|
|
|
| |
This was leftover from the pre-gen6 cleanups. One tests regresses
where compute-to-MRF now occurs.
|
|
|
|
|
|
|
|
|
| |
Add ability to set the GLSL version used by the GLcontext by setting the
environment variable INTEL_GLSL_VERSION. For example,
env INTEL_GLSL_VERSION=130 prog args
If the environment variable is missing, the GLSL versions defaults to 120.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By calling radeon_draw_buffers (which sets the necessary flags
in radeon->NewGLState) and revalidating if NewGLState is non-zero
in r200TclPrimitive. This fixes an assert in libdrm (the color-/
depthbuffer was changed but not yet validated) and and stops the
kernel cs checker from complaining about them (when they're too
small).
Thanks to Mario Kleiner for the hint to call radeon_draw_buffer
(instead of my half-broken hack).
v2: Also fix the swtcl r200 path.
Cc: Mario Kleiner <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
|
|
|
|
|
|
| |
This didn't produce a statistically significant performance difference
in my demo (n=4) or nexuiz (n=3), but it still seems like a good idea
and is recommended by the HW team.
|
| |
|
|
|
|
| |
This is progress towards enabling a compute-to-MRF pass.
|
|
|
|
| |
It's time to start splitting some of this up.
|
|
|
|
|
|
| |
While I don't know of any performance changes from this (once extra
reg available out of 128), it makes the generated asm a lot cleaner
looking.
|
|
|
|
|
|
| |
Having the single opcode write then read the reg meant that single
instruction opcodes had to consider their source regs to interfere
with their dest regs.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
Fixes these GCC warnings.
brw_wm_fp.c: In function 'search_or_add_const4f':
brw_wm_fp.c:92: warning: 'reg.Index2' is used uninitialized in this function
brw_wm_fp.c:84: note: 'reg.Index2' was declared here
brw_wm_fp.c:92: warning: 'reg.RelAddr2' is used uninitialized in this function
brw_wm_fp.c:84: note: 'reg.RelAddr2' was declared here
|
|
|
|
|
|
| |
Fixes this GCC warning.
brw_vs.c: In function 'do_vs_prog':
brw_vs.c:46: warning: unused variable 'ctx'
|
|
|
|
|
|
| |
Fixes this GCC warning.
brw_eu_emit.c: In function 'brw_math2':
brw_eu_emit.c:1189: warning: unused variable 'intel'
|
|
|
|
|
|
| |
Fixes this GCC warning.
i915_vtbl.c: In function 'i915_assert_not_dirty':
i915_vtbl.c:670: warning: unused variable 'dirty'
|
|
|
|
|
|
| |
Fixes this GCC warning.
i830_vtbl.c: In function 'i830_assert_not_dirty':
i830_vtbl.c:704: warning: unused variable 'i830'
|
| |
|
|
|
|
|
|
|
| |
Improves performance of my GLSL demo 14.3% (+/- 4%, n=4) by
eliminating the moves used in ir_assignment and ir_swizzle handling.
Still 16.5% to go to catch up to the Mesa IR backend, presumably
because instructions are almost perfectly mis-scheduled now.
|
|
|
|
|
|
|
|
|
| |
We were trying to remap a fully-filled array down to only handing the
WM the components it uses. This is called attribute swizzling, and if
you don't enable it you just get 1:1 mappings of inputs to outputs.
This almost fixes glsl-routing, except for the highest gl_TexCoord[]
indices.
|
|
|
|
| |
We'd overwrite the same element twice.
|
|
|
|
|
|
| |
We would compute a new buffer, but never point the hardware at the new
buffer. This partially fixes glsl-routing, as now it get the updated
uniform for which attribute to draw.
|