| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Since each kind of device has its own brw_device_info structure, we can
simply store the URB and thread limits there. This eliminates all the
large if-ladders, and simplifies the context initialization code quite a
bit.
Signed-off-by: Kenneth Graunke <[email protected]>
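
As a rough illustration of the idea (not the actual Mesa code), the sketch
below keeps per-device limits in a static structure, so context setup copies
them instead of branching on generation or chipset. The names
example_device_info and example_context, their fields, and every numeric
value are hypothetical placeholders.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device description. Field names and values are
 * illustrative placeholders, not real hardware limits. */
struct example_device_info {
   int gen;
   unsigned max_vs_threads;
   unsigned max_wm_threads;
   unsigned urb_size;
};

static const struct example_device_info example_info_a = {
   .gen = 6, .max_vs_threads = 24, .max_wm_threads = 40, .urb_size = 32,
};

static const struct example_device_info example_info_b = {
   .gen = 7, .max_vs_threads = 36, .max_wm_threads = 48, .urb_size = 128,
};

/* Hypothetical context that caches the limits it needs. */
struct example_context {
   unsigned max_vs_threads;
   unsigned max_wm_threads;
   unsigned urb_size;
};

/* Context setup copies the limits straight from the device structure;
 * no if-ladder over generations or chipset variants is needed. */
static void
example_init_limits(struct example_context *ctx,
                    const struct example_device_info *devinfo)
{
   assert(devinfo != NULL);
   ctx->max_vs_threads = devinfo->max_vs_threads;
   ctx->max_wm_threads = devinfo->max_wm_threads;
   ctx->urb_size = devinfo->urb_size;
}
```

Supporting a new device in this scheme means adding one more static
structure; the initialization function stays unchanged.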
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This option was useful during initial development, but it's been ages
since I've heard of anyone using it. Plus, Gen7+ mandates separate
stencil, so it was really only useful on Sandybridge anyway.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The idea is that struct brw_device_info should store statically-known
information about hardware features. Using the new family name in the
PCI ID table, we can easily grab the right structure.
This is basically the equivalent of intel_device_info in the kernel.
This patch also makes the new structure available from intel_screen, but
nothing uses it. Right now, it looks very redundant with existing
fields, but that will change.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
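
A minimal sketch of how a family name in a PCI ID table can select the right
structure, assuming an X-macro style table. The table contents, the PCI IDs,
the family names (alpha, beta) and all example_* identifiers are made up for
illustration and are not the driver's real data.

```c
#include <stddef.h>

/* Hypothetical statically-known hardware description. */
struct example_device_info {
   int gen;
   int has_llc;
};

static const struct example_device_info example_device_info_alpha = {
   .gen = 6, .has_llc = 1,
};

static const struct example_device_info example_device_info_beta = {
   .gen = 7, .has_llc = 1,
};

/* Hypothetical PCI ID table: (device ID, family, marketing name). */
#define EXAMPLE_PCI_IDS(CHIPSET)                \
   CHIPSET(0x1001, alpha, "Example Alpha GT1")  \
   CHIPSET(0x1002, alpha, "Example Alpha GT2")  \
   CHIPSET(0x2001, beta,  "Example Beta GT1")

/* Each table entry becomes one case label; the family name is pasted
 * onto a common prefix to name the matching structure. */
static const struct example_device_info *
example_get_device_info(int pci_id)
{
   switch (pci_id) {
#define EXAMPLE_CASE(id, family, name) \
   case id: return &example_device_info_##family;
   EXAMPLE_PCI_IDS(EXAMPLE_CASE)
#undef EXAMPLE_CASE
   default:
      return NULL; /* unknown device */
   }
}
```

Several PCI IDs can share one family, so the table stays the single place
where IDs are listed.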
|
|
|
|
|
|
|
|
| |
I removed this a while ago, since we never used it, but I'm finally
resurrecting the idea in the next commits.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Nothing uses the #define name, and it's not terribly useful - the
numerical ID serves the same purpose. The only thing we could really do
with it is generate slightly prettier preprocessed code. But who looks
at that?
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using a helper function clarifies the context initialization code.
I would've liked to completely centralize it, but moving the optionCache
code from intelInitExtensions into here would've required setting flags
in the context, which seems like a waste.
v2: Rebase for the introduction of disable_derivative_optimization.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that intelInitContext isn't shared between i915 and i965, the split
is fairly arbitrary. This patch moves a bunch of the basic context
creation and generation checking code up to the top-level function
(and slightly earlier).
More will follow.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
It wasn't clear that this was necessary for EGL, or why.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that there isn't an intel_context structure, the split between
brw_context.[ch] and intel_context.[ch] is rather awkward and arbitrary.
Removing intel_context.[ch] seems desirable, but not everything really
belongs in brw_context.[ch], either.
Moving INTEL_DEBUG handling into separate intel_debug.[ch] files should
make them relatively easy to find.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
"error" is a very generic name. dri_ctx_error is the name used in
intelInitContext(), which is more specific.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Nobody else yet can do a forward context anyway, but others should be able
to do debug contexts, and those would have just had no effect currently.
|
|
|
|
|
|
| |
Now that we support start, assert on start + num < max samplers
Reported by xexaxo
|
|
|
|
|
|
|
|
|
|
|
| |
Mesa now supports OpenGL 3.2 and GLSL 1.50, so bump the Mesa major
version from 9 to 10 to reflect this.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Geometry shader support is now working well, and adequately piglit
tested. There are just a few piglit failures left to fix. So there's
no need for an "experimental" warning anymore.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Geometry shaders were the last thing we needed to finish before
turning on GLSL 1.50 and GL 3.2 support. They are now working well,
with just a few piglit failures left to fix.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
| |
With code dump enabled LLVM may generate disassembly during compilation.
Show this disassembly when available and prefer it to SI bytecode dump.
Reviewed-by: Tom Stellard <[email protected]>
Signed-off-by: Jay Cornwall <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix coord wrapping (and face selection too) in case of edges.
Unfortunately, the coord wrapping is way more complicated than what
the code did, as it depends on the face and the direction where the
texel falls off the face (the logic needed to get this right in fact
seems utterly ridiculous).
Also fix a bug in face selection (y direction under/overflow).
Also get rid of the complicated cube corner handling. As in the edge case,
the coord wrapping was wrong, and it seems very difficult to fix.
I'm near certain it can't always work anyway (though ordinary seamless
filtering on edges actually has a similar, less severe problem)
because we don't have a per-pixel face, hence could have multiple corner
texels, which would make it very difficult to average the remaining texels
correctly. Hence simply pick a texel which would only have fallen off one
edge but not both instead. This is not quite accurate, but I think it
should be enough to meet OpenGL (but not d3d10) requirements.
v2: small fixes suggested by Brian, add some comments.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous limit of 128*1024 was reported on IRC to cause frequent
recompiles in some apps due to shader variant thrashing, leading to
noticeable lags.
Note that the LP_MAX_SHADER_VARIANTS limit (1024) was more or less impossible
to reach, since even simple fragment shaders without texturing (glxgears) used
more than twice 128 instructions, hence the instruction limit would have
always been reached first (excluding things like trivial shaders not writing
color). Even with the new limit it is VERY likely the instruction limit is hit
first.
Should help with such lags due to recompiles (though other shader types have
their own limits, LP_MAX_SETUP_VARIANTS and DRAW_MAX_SHADER_VARIANTS, in
particular the latter seems a bit small (128)).
Reviewed-by: Brian Paul <[email protected]>
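
The arithmetic behind that note can be sketched as follows. This is only an
illustration of the reasoning, not llvmpipe code; the function names are
hypothetical, and the per-variant instruction counts used with it are example
inputs rather than measurements.

```c
#include <assert.h>

/* How many variants of a given size fit under a total instruction cap.
 * instrs_per_variant must be nonzero. */
static unsigned
example_variants_until_instr_limit(unsigned total_instr_limit,
                                   unsigned instrs_per_variant)
{
   assert(instrs_per_variant != 0);
   return total_instr_limit / instrs_per_variant;
}

/* Returns nonzero if the instruction cap is reached before the cap on
 * the number of variants. */
static int
example_instr_limit_hit_first(unsigned total_instr_limit,
                              unsigned max_variants,
                              unsigned instrs_per_variant)
{
   return example_variants_until_instr_limit(total_instr_limit,
                                             instrs_per_variant)
          < max_variants;
}
```

With the old cap of 128*1024 instructions and 1024 variants, the two limits
coincide only when every variant averages 128 instructions. At 257
instructions per variant (just over twice that), only 510 variants fit, so
the instruction cap is reached first.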
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes this build error.
CC imports.lo
../../src/mesa/main/imports.c: In function '_mesa_strtof':
../../src/mesa/main/imports.c:570:20: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'loc'
../../src/mesa/main/imports.c:570:20: error: 'loc' undeclared (first use in this function)
../../src/mesa/main/imports.c:570:20: note: each undeclared identifier is reported only once for each function it appears in
../../src/mesa/main/imports.c:572:7: error: implicit declaration of function 'newlocale'
../../src/mesa/main/imports.c:572:23: error: 'LC_CTYPE_MASK' undeclared (first use in this function)
../../src/mesa/main/imports.c:574:4: error: implicit declaration of function 'strtof_l'
../../src/mesa/main/imports.c:580:1: warning: control reaches end of non-void function
Signed-off-by: Vinson Lee <[email protected]>
|
| |
|
| |
|
| |
|
|
|
|
|
| |
Use GL_TRUE/FALSE instead of 1/0. Remove extraneous parentheses.
Remove trailing whitespace.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that libEGL has been fixed to not leak all kinds of symbols, gbm
links to its own copy of the libwayland-drm.a helper library. That means
we can't rely on comparing the addresses of a static vtable symbol in that
library to determine if a wl_buffer is a wl_drm_buffer. Instead, we
move the vtable into the wl_drm struct and use that for comparing.
https://bugs.freedesktop.org/show_bug.cgi?id=69437
Cc: 9.2 <[email protected]>
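
A minimal sketch of the comparison described above, using hypothetical
stand-in types rather than the real wl_drm and wl_buffer definitions. Because
the vtable is a member of the per-instance struct, its address identifies the
instance, no matter how many copies of the helper library are linked.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a buffer vtable. */
struct example_buffer_interface {
   void (*destroy)(void *buffer);
};

/* Hypothetical stand-in for the wl_drm object. The vtable is stored in
 * the struct itself instead of being a static symbol in the library. */
struct example_drm {
   struct example_buffer_interface buffer_interface;
};

/* Hypothetical stand-in for a buffer that records its implementation. */
struct example_buffer {
   const struct example_buffer_interface *implementation;
};

static void
example_destroy(void *buffer)
{
   (void) buffer;
}

static void
example_drm_init(struct example_drm *drm)
{
   drm->buffer_interface.destroy = example_destroy;
}

static void
example_drm_create_buffer(struct example_drm *drm,
                          struct example_buffer *buf)
{
   buf->implementation = &drm->buffer_interface;
}

/* A buffer belongs to this drm object if it points at the vtable
 * embedded in that object. */
static bool
example_buffer_is_drm(const struct example_drm *drm,
                      const struct example_buffer *buf)
{
   return buf->implementation == &drm->buffer_interface;
}
```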
|
|
|
|
|
|
|
|
|
|
|
| |
execinfo.h is not available on NetBSD.
Fixes this build error.
CC glapi_gentable.lo
glapi_gentable.c:44:22: fatal error: execinfo.h: No such file or directory
Signed-off-by: Vinson Lee <[email protected]>
|
|
|
|
|
|
|
|
| |
This was overriding the top-level .dir-locals.el causing some settings
(like forcing spaces instead of tabs!) to be lost.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
We should be able to safely set the framebuffer state without a
fragment shader bound. bind_ps_state will take care of updating the
necessary state bits later.
v2: check in update_db_shader_control
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes GL2ExtensionTests/egl_image_external/TestSimpleUnassociated.test
which is part of gles2/3 conformance suite. Here image external
textures are switched to be treated the same as 2D textures. These
can be associated with the fallback texture providing fixed sample
values of (0, 0, 0, 1).
The OES_EGL_image_external spec says:
"Sampling an external texture which is not associated with any EGLImage
sibling will return a sample value of (0,0,0,1)."
"External textures cannot be used with TexImage2D, TexSubImage2D,
CompressedTexImage2D, CompressedTexSubImage2D, CopyTexImage2D, or
CopyTexSubImage2D, and an INVALID_ENUM error will be generated if
this is attempted."
And quoting Chad:
"That's enforced in _mesa_TexImage*() by calling
legal_teximage_target(), and enforced in _mesa_TexSubImage*() by
calling legal_texsubimage_target(). Each of the
legal_tex*image_target() functions reject external textures.
Therefore, allowing GL_TEXTURE_EXTERNAL_OES in store_texsubimage()
won't violate the above spec quote.
I think it's safe to allow GL_TEXTURE_EXTERNAL_OES in
store_texsubimage(), as long as the texture has only a single
plane. Luckily, that's the only type of external textures that
Mesa currently supports."
CC: Chad Versace <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Extend the fast texture upload from BGRA X-tiled to include RGBA,
Alpha/Luminance, and Y-tiled. Speed improvements, measured with
mesa demos teximage program, on 256 x 256 texture, in MB/s, on a
Sandy Bridge (Ivy is comparable):
                before   after   increase
BGRA/X-tiled     3266     4524    1.39x
BGRA/Y-tiled     1739     3971    2.28x
RGBA/X-tiled      474     4694    9.90x
RGBA/Y-tiled      477     3368    7.06x
L/X-tiled        1268     1516    1.20x
L/Y-tiled        1439     1581    1.10x
v2: Cosmetic changes only: reformat and reword comments, make doxygen-friendly,
rename variables, use existing macros, add an assert.
Signed-off-by: Frank Henigman <[email protected]>
Reviewed-and-tested-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
| |
* Fix LLVM library and defines
* Only enable tracing when scons build=debug
Acked-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
* /boot/common no longer exists in Haiku as of
a few days ago (and this is undefined)
Acked-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This is part of the prep for megadrivers, which won't allow using a single
global symbol due to the fact that there will be multiple drivers built
into the same dri.so file. For that, we'll need screen init to take a
reference to the driver to set up this vtable.
v2: Fix two missed references to driDriverAPI.
Reviewed-by: Kenneth Graunke <[email protected]> (v1)
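
To illustrate why a single global symbol stops working once several drivers
share one binary, here is a small sketch in which each screen carries a
pointer to its own driver vtable. All example_* names are hypothetical and do
not correspond to the real DRI structures.

```c
/* Hypothetical driver vtable. */
struct example_driver_api {
   const char *name;
   int (*init_screen)(void);
};

/* Hypothetical screen that remembers which driver created it. */
struct example_screen {
   const struct example_driver_api *driver;
};

static int example_init_a(void) { return 1; }
static int example_init_b(void) { return 2; }

/* Two drivers living in the same binary, each with its own vtable. */
static const struct example_driver_api example_driver_a = {
   "a", example_init_a,
};
static const struct example_driver_api example_driver_b = {
   "b", example_init_b,
};

/* Screen init takes the driver as a parameter instead of reading a
 * global symbol, so both drivers can coexist. */
static int
example_screen_create(struct example_screen *screen,
                      const struct example_driver_api *driver)
{
   screen->driver = driver;
   return screen->driver->init_screen();
}
```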
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The intel_screen.c used to be a dispatch to one of 3 driver functions, but
was down to 1, so it was kind of a waste. In addition, it was trying to
free all of the data that might have been partially freed in the kernel
3.6 check (which comes after intelInitContext, and thus might have had
driverPrivate set and result in intelDestroyContext() doing work on the
freed data). By moving the driverPrivate setup earlier, we can use
intelDestroyContext() consistently and avoid such problems in the future.
v2: Adjust the prototype of brwCreateContext to use the proper enum
(fixing a compiler warning in some builds)
Reviewed-by: Kenneth Graunke <[email protected]> (v1)
|
|
|
|
|
|
|
|
| |
If bufmgr didn't get created, then screen creation failed, and we never
should have got here in the first place. This was added by Chris Wilson
in 2010 with no explanation for why it would be needed.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
i965, i915, radeon, r200, swrast, and nouveau were mostly trying to do the
same logic, except where they failed to. Notably, swrast had code that
appeared to try to enable GLES1/2 but forgot to set api_mask (thus
preventing any gles context from being created), and the non-intel drivers
didn't support MESA_GL_VERSION_OVERRIDE.
nouveau still relies on _mesa_compute_version(), because I don't know what
its limits actually are, and gallium drivers don't declare limits up front
at all. I think I've heard talk about doing so, though.
v2: Compat max version should be 30 (noted by Ken)
Drop r100's custom max version check, too (noted by Emil Velikov)
Reviewed-by: Kenneth Graunke <[email protected]>
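
One way to share this logic is to have each driver declare only its maximum
versions and derive the API mask in common code. The sketch below shows that
idea under assumed names; the enum values, struct fields and function are
hypothetical and are not the actual DRI interface.

```c
/* Hypothetical API identifiers used as bit positions. */
enum example_api {
   EXAMPLE_API_GL = 0,
   EXAMPLE_API_GLES1 = 1,
   EXAMPLE_API_GLES2 = 2,
   EXAMPLE_API_GL_CORE = 3,
};

/* Hypothetical per-screen limits a driver would fill in up front.
 * A value of 0 means the API is not supported. */
struct example_screen_limits {
   int max_gl_compat_version;
   int max_gl_core_version;
   int max_gl_es1_version;
   int max_gl_es2_version;
};

/* Shared code derives the mask, so a driver cannot advertise a GLES
 * version and forget to set the matching bit. */
static unsigned
example_compute_api_mask(const struct example_screen_limits *limits)
{
   unsigned mask = 0;

   if (limits->max_gl_compat_version > 0)
      mask |= 1u << EXAMPLE_API_GL;
   if (limits->max_gl_core_version > 0)
      mask |= 1u << EXAMPLE_API_GL_CORE;
   if (limits->max_gl_es1_version > 0)
      mask |= 1u << EXAMPLE_API_GLES1;
   if (limits->max_gl_es2_version > 0)
      mask |= 1u << EXAMPLE_API_GLES2;
   return mask;
}
```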
|
|
|
|
|
|
|
|
|
|
|
|
| |
The only important difference was not calling drmGetVersion, and making
the swrast extension vtable. That doesn't justify duplicating the other
330 lines of code.
v2: fix the scons build (code by Emil Velikov)
v3: fix scons build with swrast-only (code by Emil Velikov)
v4: Drop the new define I added, when we already have __NOT_HAVE_DRM_H.
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
The code it was referencing was removed in 2010.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Looking at Lightsmark's shaders, the way we used MRFs (or in gen7's
case, GRFs) was bad in a couple of ways. One was that it prevented
compute-to-MRF for the common case of a texcoord that gets used
exactly once, but where the texcoord setup all gets emitted before the
texture calls (such as when it's a bare fragment shader input, which
gets interpolated before processing main()). Another was that it
introduced a bunch of dependencies that constrained scheduling, and
forced waits for texture operations to be done before they are
required. For example, we can now move the compute-to-MRF
interpolation for the second texture send down after the first send.
The downside is that this generally prevents
remove_duplicate_mrf_writes() from doing anything, whereas previously
it avoided work for the case of sampling from the same texcoord twice.
However, I suspect that most of the win that originally justified that
code was in avoiding the WAR stall on the first send, which this patch
also avoids, rather than the small cost of the extra instruction. We
see instruction count regressions in shaders in unigine, yofrankie,
savage2, hon, and gstreamer.
Improves GLB2.7 performance by 0.633628% +/- 0.491809% (n=121/125, avg of
~66fps, outliers below 61 dropped).
Improves openarena performance by 1.01092% +/- 0.66897% (n=425).
No significant difference on Lightsmark (n=44).
v2: Squash in the fix for register unspilling for send-from-GRF, fixing a
segfault in lightsmark.
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For texturing from GRFs, we now have payloads of arbitrary sizes up to the
message length limit.
v2 (Kenneth Graunke): Rebase on intel_context -> brw_context change.
v3: Add some comment text.
v4: Change some magic 16s to BRW_MAX_MRF (noted by Ken). Leave the 11,
which is the magic "max sampler message length". BRW_MAX_MRF sizing
on the little int arrays is retained because I could see us needing to
extend in the future if we move to GRFs for FB writes (those go to at
least 12 long in a quick scan of the specs)
Reviewed-by: Kenneth Graunke <[email protected]> (v2)
Acked-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This will let us coalesce into texture-from-GRF arguments, which would
otherwise be prevented due to the live interval for the whole vgrf
extending across all the MOVs setting up the channels of the message.
v2 (Kenneth Graunke): Rebase for renames.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
v2 (Kenneth Graunke): Rebase on s/live_variables/live_intervals/g.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Now optimization passes will be able to look at the per-channel ranges.
v2: Rebase on various optimization pass changes.
v3 (Kenneth Graunke): Rename live_variables to live_intervals; split
introduction of invalidate_live_intervals() into a separate patch.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When compacting the list of VGRFs, we patch up the live interval ranges
(which are indexed by VGRF number). Unfortunately, once we make
per-component data available, this will become too complicated to
maintain. Instead, simply invalidate them.
This was pulled out of a patch by Eric Anholt.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
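
The invalidate-instead-of-patch approach can be sketched as below, assuming a
validity flag and lazy recomputation. The example_visitor struct and its
functions are hypothetical simplifications; the real analysis is replaced by
a counter.

```c
#include <assert.h>
#include <stdbool.h>

#define EXAMPLE_MAX_VGRFS 8

/* Hypothetical, heavily simplified compiler state. */
struct example_visitor {
   int num_vgrfs;
   bool used[EXAMPLE_MAX_VGRFS];
   bool live_intervals_valid;
   int recompute_count; /* stands in for the real analysis work */
};

static void
example_invalidate_live_intervals(struct example_visitor *v)
{
   v->live_intervals_valid = false;
}

/* Recomputes only when a previous pass invalidated the data. */
static void
example_calculate_live_intervals(struct example_visitor *v)
{
   if (v->live_intervals_valid)
      return;
   v->recompute_count++;
   v->live_intervals_valid = true;
}

/* Renumbers the used registers densely. Rather than remapping the
 * interval data to the new numbering, it is simply marked stale. */
static void
example_compact_vgrfs(struct example_visitor *v)
{
   int new_count = 0;

   assert(v->num_vgrfs <= EXAMPLE_MAX_VGRFS);
   for (int i = 0; i < v->num_vgrfs; i++) {
      if (v->used[i])
         new_count++;
   }
   for (int i = 0; i < v->num_vgrfs; i++)
      v->used[i] = i < new_count;
   v->num_vgrfs = new_count;
   example_invalidate_live_intervals(v);
}
```

The cost is one extra recomputation the next time a pass asks for live
intervals, in exchange for not having to keep per-component data consistent
during compaction.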
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In compute_live_intervals(), start and end are shorter names for
the virtual_grf_start and virtual_grf_end class members.
Now that the fs_live_intervals class has arrays named start and end
which are indexed by var, rather than VGRF, reusing the name is
confusing. Plus, most of the code has been factored out, so using the
long names isn't as inconvenient.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This is the information we'll actually use to replace the
virtual_grf_start[]/end[] arrays.
No change in shader-db.
v2 (Kenneth Graunke): Rebase; minor comment updates.
Reviewed-by: Kenneth Graunke <[email protected]>
|