| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Fix SIMD mode comment (caught by Eric Anholt).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: Drop unused num_components variable; fix SIMD Mode comment
(caught by Eric Anholt).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The brw_eu_emit.c code manually forces the header present bit when
used in align1 (scalar) mode. So, this has no effect currently.
However, it is nice to have fs_inst::header_present reflect reality.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
This is similar to what Eric did for Gen7 a little while ago; it also
has support for untyped surface reads.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Otherwise we crash when setting up atomic buffer objects.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Francisco made brw_mark_surface_used a freestanding function in
commit a32817f3c248125fb537c3a915566445e5600d45. We should use it.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
This must not have existed when I wrote the original code. The atomic
operation header setup code uses this.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For platforms using hardware contexts (currently Gen6+), we failed to
emit PIPELINE_SELECT and 3DSTATE_VF_STATISTICS, instead emitting MI_NOOP
for both.
During one of the context initialization reordering patches, we
accidentally moved brw_init_state before we set brw->CMD_PIPELINE_SELECT
and brw->CMD_VF_STATISTICS. So, when brw_init_state uploaded initial
GPU state (brw_init_state -> brw_upload_initial_gpu_state ->
brw_upload_invariant_state), these would be 0 (MI_NOOP).
Storing the commands in the context is not worthwhile. We have many
generation checks in our state upload code, and for platforms with
hardware contexts, this only gets called once per GL context anyway.
The cost is negligable, and it's easy to botch context creation
ordering.
This may fix hangs on Gen6+ when using the media pipeline.
Cc: "10.0 10.1" <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
arekm reported that using Chrome with GPU acceleration enabled on GM45
triggered the hw_ctx != NULL assertion in brw_get_graphics_reset_status.
We definitely do not want to advertise reset notification support on
Gen4-5 systems, since it needs hardware contexts, and we never even
request a hardware context on those systems.
Cc: "10.1" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75723
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
This was introduced with the comment and code below it, though the code
only touches prog_data (CACHE_NEW_WM_PROG).
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This keeps us from having to emit the nonpipelined state packet on every
FBO binding.
-4.42003% +/- 1.09961% effect on cairo-perf-trace runtime on glamor (n=110).
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
No more walking 96*6 pointers looking to see if they're the current
texture, when we only use the first 2 out of 96 units. -6.26002% +/-
1.87817% effect on cairo runtime on no-fbo-cache glamor (n=36).
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of walking 6 shader stages for each of the 96 combined texture
image units, now we just walk the samplers used in each shader stage.
With cairo-perf-trace on Xephyr with glamor, I'm seeing a -6.50518% +/-
2.55601% effect on runtime (n=22) since the "drop _EnabledUnits" change.
No significant performance difference on an apitrace of minecraft (n=442).
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
I want to avoid walking the entire long array texture image units, but the
obvious way to do so means walking program samplers, and thus hitting the
units in a random order.
This change replaces the previous behavior of only setting up the fallback
texture for a fragment shader with setting up the fallback texture for any
shader that's missing a complete texture of the right target in its unit.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This is kind of ugly, but I think it's worth it to finish off the last
consumers of _ReallyEnabled.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
The _MaxEnabledTexImageUnit check assures us that Unit[0].Current != NULL.
This is the last consumer of _ReallyEnabled outside of the radeons.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
We can just look at _Current's target.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I'm probably not the only person that has tried to kill _ReallyEnabled.
This does the mechanical part of the work, and cleans _ReallyEnabled from
i965.
I think that using _Current makes texture management clearer: You can't
have multiple targets in use in the same texture image unit at the same
time, because there's just that one pointer.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
I'm going to try to delete _ReallyEnabled, which is this weird bitfield
with either 0 or 1 bits set with just the reference to _Current.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The field wasn't really valid, since we've got more than 32 units now. It
turns out it was mostly just used for checking != 0, or checking for fixed
function coordinates, though.
v2: Fix mis-conversion in xm_line.c (caught by Ken).
Reviewed-by: Matt Turner <[email protected]> (v1)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
_EnabledUnits is all of the first 32 image units that are used by fixed
function or programs, while _EnabledCoordUnits is just which fixed function
fragment shader texcoords need to be generated. This is a theoretical bugfix
in the case of a vertex shader texturing from large texture image unit number
(we'd end up flagging something other than a VARYING_SLOT_TEXn as needing to
be generated), but it's actually just motivated by trying to kill
_EnabledUnits.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
We now know what the max unit is in the context state.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The original GL_EXT_texture_swizzle extensions said GL_INVALID_OPERATION
was to be generated when the an invalid swizzle was passed to
glTexParameter(). But in OpenGL 3.3 and later, the error should be
GL_INVALID_ENUM.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible that there are multiple input buffers but only one is
relevant for translation. Then there will be only a single translation
group, which might need to source data from a buffer index != 0.
Fixes wrong vertex shader inputs as observed while debugging with an
application and driver combination that requires translation of a
vertex attribute in a non-trivial set of attributes and input buffers.
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Igor Gnatenko:
v2: PIPE_COMPUTE_CAP_MAX_CLOCK_FREQUENCY instead of
PIPE_COMPUTE_MAX_CLOCK_FREQUENCY
Bruno Jiménez:
v3: Drivers report clock in Mhz
Signed-off-by: Igor Gnatenko <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Igor Gnatenko:
v2: in define RADEON_INFO_MAX_SCLK use 0x1a instead of 0x19 (upstream changes)
Bruno Jiménez:
v3: Convert the frequency to MHz from kHz after getting it in
'do_winsys_init'
Signed-off-by: Igor Gnatenko <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
|
| |
Bruno Jiménez:
v2: Updated the docs
v3: Remove trailing comma
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We intended to set these 64-bit addresses to 0, and set the enable bit.
But, I accidentally placed the DWord with the high bits first, when it
should have been second.
This generally worked out, by luck - presumably General State Base
Address is initially zero, and ends up remaining that way in our
contexts since we bungled the "modify enable" bit.
v2: Fix MOCS shift on GSBA. It should be 4, and I had 2.
(Caught by Ben Widawsky.)
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The implementation is basically a NOP but it conforms with OpenCL 1.2.
[ Francisco Jerez: Initialize property return buffer for
CL_DEVICE_PARTITION_PROPERTIES, CL_DEVICE_PARTITION_TYPE,
CL_DEVICE_PARTITION_AFFINITY_DOMAIN, and make the latter a scalar
rather than a vector. Some clean-up and code style fixes. ]
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
| |
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
v2: use a new variable for aligned size
add comment
make both vars const
only use the aligned value in argument constructors
fix comment typo
Signed-off-by: Jan Vesely <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
| |
Acked-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
| |
The C++ headers are *not* updated because they rely on CL 1.2 APIs
that we do not implement yet when the core CL 1.2 headers are present.
Acked-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes textureGatherOffset when used with a shadow sampler. Also verified
against blob compiler with textureLodOffset manually (no piglit tests
for texture[Lod]Offset + shadow samplers).
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
This allows us to have non-constant offsets for textureGatherOffset and
textureGatherOffsets.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This adds support for:
IBFE, UBFE, BFI, LSB, IMSB, UMSB, BREV, POPC
Which are all required for ARB_gs5 support.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
Add bitwise reversing and signed MSB helpers for software implementation
of the new TGSI opcodes.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Also pipe through [IU]MUL_HI, MAD, and lower ldexp. This provides
coverage of all new ARB_gpu_shader5 functions except uaddCarry,
usubBorrow and interpolateAt*.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|