| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
This will allow the generic parts to be re-used for geometry shaders.
Reviewed-by: Jordan Justen <[email protected]>
v2: Put urb_read_length and urb_entry_size in the generic struct.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I want to introduce some more debug output for performance surprises that
includes fallbacks, but aren't necessarily software rasterization. Leave
INTEL_DEBUG=fall in place for those that have used that flag before.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
This is used by the unit state, which is at emit() time.
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I initially produced the patch using this bash command:
for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i
's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i
's/GL_FALSE/false/g' $file; done
Then I manually added #include <stdbool.h> to fix compilation errors,
and converted a few functions back to GLboolean that were used in core
Mesa's function pointer table to avoid "incompatible pointer" warnings.
Finally, I cleaned up some whitespace issues introduced by the change.
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Chad Versace <[email protected]>
Acked-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
There is a silicon bug which causes unpredictable behaviour if the
URB_FENCE command should cross a cache-line boundary. Pad before the
command to avoid such occurrences. As this command only applies to
gen4/5, do the fixup unconditionally as the specs do not actually state
for which chip it was fixed (and the cost is negligible)...
Signed-off-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
| |
This provides the optimizer with hints about code hotness, which we're
quite certain about for debug printouts (or, rather, while we
developers often hit the checks for debug printouts, we don't care
about performance while doing so).
|
|
|
|
|
|
|
|
|
| |
Rename old IGDNG to Ironlake, and set 'gen' number for
Ironlake as 5, so tracking the features with generation num
instead of special is_ironlake flag.
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Zhenyu Wang <[email protected]>
|
| |
|
|
|
|
| |
Saves ~2KB of code.
|
|
|
|
| |
Saves ~480 bytes of code.
|
|
|
|
|
|
|
|
|
|
| |
1. new PCI ids
2. fix some 3D commands on new chipset
3. fix send instruction on new chipset
4. new VUE vertex header
5. ff_sync message (added by Zou Nan Hai <[email protected]>)
6. the offset in JMPI is in unit of 64bits on new chipset
7. new cube map layout
|
|
|
|
|
| |
This improves the performance of my GLSL demo by 30%. It also fixes the
VS deadlock that ut2004 had, for reasons I can't explain. Bug #21330.
|
|
|
|
|
|
|
|
|
| |
The clip thread could potentially deadlock when processing tristrips since
being moved back to dual-thread mode, as the two threads could each have 4 VUEs
referenced and not be able to allocate another one since SF processing
wasn't able to continue (needing 5 entries before it freed 2).
In constrained URB mode, similar deadlock could even have occurred with
polygons (so we cut back max_threads if we can't handle it any primitive type).
|
|
|
|
|
| |
It shouldn't offer anything new over what's in the docs (except for G4X notes),
but here it's all in one place.
|
|
|
|
| |
Also, add a comment explaining what brw->urb.constrained tries to do.
|
|
|
|
| |
This reverts commit 7c81124d7c4a4d1da9f48cbf7e82ab1a3a970a7a.
|
|
|
|
|
|
|
|
| |
This reverts commit 53675e5c05c0598b7ea206d5c27dbcae786a2c03.
Conflicts:
src/mesa/drivers/dri/i965/brw_wm_surface_state.c
|
|
|
|
|
|
| |
To do this, I had to clean up some of 965 state upload stuff. We may end
up over-emitting state in the aperture overflow case, but that should be rare,
and I'd rather have the simplification of state management.
|
| |
|
|
|
|
|
|
|
| |
e.g. bridge of fate.
If vs output is big, driver may fall back to use 8 urb entries for vs,
unfortunally, for some unknown reason, if vs is working at 4x2 mode,
8 entries is not enough, may lead to gpu hang.
|
|
|
|
|
|
|
|
|
| |
Makes state emission into a 2 phase, prepare sets things up and accounts
the size of all referenced buffer objects. The emit stage then actually
does the batchbuffer touching for emitting the objects.
There is an assert in dri_emit_reloc if a reloc occurs for a buffer
that hasn't been accounted yet.
|
|
|
|
| |
These don't appear to have ever been used.
|
| |
|
|
|
|
| |
values for nr of entries) should meet the requirement.
|
|
|
|
| |
specifically {8,16,32}.
|
|
|
|
|
|
|
|
|
|
| |
between the vertex cache, the vertex shader and the clipping stages,
all of which are competitors for URB entries assigned to the VS unit.
This change reduces the maximum number of clip and VS threads by
enough to ensure that they cannot consume all the available URB
entries, and then reduces the number somewhat more up to an arbitary
amount I discovered by trial and error. Unfortunately trial and error
solutions don't inspire total confidence...
|
|
This driver comes from Tungsten Graphics, with a few further modifications by
Intel.
|