| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
| |
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
| |
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
| |
There doesn't seem to be any reason for it to be a method, and it's
surprising that the expression 'reg.retype(t)' doesn't retype its
object but rather it creates a temporary with the new type. Use
'retype(reg, t)' instead.
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add assertion that the register is not in the HW_REG or IMM file,
calculate the conjunction of the old and new mask instead of replacing
the old [consistent with the behavior of brw_writemask(), causes no
functional changes right now], make it static inline to let the
compiler do a slightly better job at optimizing things, and shorten
its name.
v2: Assert that the new writemask is not zero to avoid undefined
hardware behaviour.
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixed regs.
And define non-mutating helper functions to retype fixed and normal
regs with a common interface. At some point we may want to get rid of
::fixed_hw_reg completely and have fixed regs use the normal register
data members (e.g. backend_reg::reg to select a fixed GRF number,
src_reg::swizzle to store the swizzle, etc.), I have the feeling that
this is not the last headache we're going to get because of the
multiple ways to represent the same thing and the different register
interface depending on the file a register is stored in...
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
| |
::negate.
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
| |
Yes, we could avoid having four copies of essentially the same code by
using templates here.
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
| |
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
| |
function.
This way it can be used anywhere. I need it from the visitor.
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
brw_stage_prog_data.
There doesn't seem to be any reason for nr_params, nr_pull_params,
param, and pull_param to be duplicated in the stage-specific
subclasses of brw_stage_prog_data. Moving their definition to the
common base class will allow some code sharing in a future commit, the
removal of brw_vec4_prog_data_compare and brw_*_prog_data_free, and
the simplification of the stage-specific brw_*_prog_data_compare.
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
| |
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
| |
They work fine now, too.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
It appears to work fine.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Broadwell's 3DSTATE_WM_HZ_OP packet makes this much easier.
Instead of programming the whole pipeline, we simply have to emit the
depth/stencil packets, a state override, and a pipe control. Then
arrange for the state to be put back. This is easily done from a single
function.
v2: Use minify(mt->logical_{width,height}0, level) in 3DSTATE_WM_HZ_OP
instead of intel_mipmap_level's width/height fields. Those were
based on the physical width/height, and thus wrong for MSAA buffers.
Eric also deleted those fields.
v3: Use 0xFFFF as the sample mask regardless of what the user set (as
this operation is unrelated); set the drawing rectangle to the
miplevel being operated on, rather than the whole surface; remove
unnecessary MAX2(..., 1) around mt->logical_depth0 (all suggested
by Eric Anholt).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existing code followed the vtable function signature, which is not a
great fit: many of the parameters are unused, and the function still
inspects global state, making it less reusable.
This patch refactors the depth buffer packet emission code into a new
function which takes exactly the parameters it needs, and which uses no
global state. It then makes the existing vtable function call the new
one.
Ideally, we would remove the vtable function, and clean up that
interface. But that can happen once HiZ is working.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
We're going to need these to implement HiZ.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Broadwell's "HiZ Resolve" operation still has the restriction that the
rectangle primitive must be 8x4 aligned. So I believe we still need
this.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
HiZ buffers still don't exist, but when they do, we'll set them up.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
brw_depthbuffer_format is not very reusable at the moment, since it
uses global state (ctx->DrawBuffer) to access a particular depth buffer.
For HiZ on Broadwell, I need a function which simply converts the
formats. However, at least one existing user of brw_depthbuffer_format
really wants the existing interface. So, I've created a new function.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Even with the other limits raised, TestProxyTexImage would still reject
textures > 1GB in size. This is an artificial limit; nothing prevents
us from having a larger texture. I stayed shy of 2GB to avoid the
larger-than-aperture situation.
For 3D textures, this raises the effective limit:
- RGBA8: 645 -> 738
- RGBA16: 512 -> 586
- RGBA32F: 406 -> 465
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gen4+ supports 8192x8192 cube maps. Ivybridge and later can actually
support 16384, but that would place GL_MAX_CUBE_MAP_TEXTURE_SIZE above
GL_MAX_TEXTURE_SIZE, which seems like a bad idea.
(Unfortunately, we can't bump GL_MAX_TEXTURE_SIZE to 16384 without
causing regressions due to awful W-tiled stencil buffer interactions.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's highly unlikely that there will be enough memory in the system to
allocate enough space for this, but we should still expose the hardware
limit. It's what the Intel Windows driver does, and it seems most other
vendors do likewise.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This drops the MOVs for header setup, which are totally mis-scheduled.
total instructions in shared programs: 1590047 -> 1589331 (-0.05%)
instructions in affected programs: 43729 -> 43013 (-1.64%)
GAINED: 0
LOST: 0
glb27-trex:
x before
+ after
+-----------------------------------------------------------------------------+
| + x xx + + + |
| ++ + xxx ++x xx + ** *x+ + + + x * |
|+x xx x* x+++xx*x*xx+++*+*xx++** *x* x+***x*+xx+* + * + + *|
| |__|__________MA___A___________|___| |
+-----------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 49 62.33 65.41 63.49 63.53449 0.62757822
+ 50 62.28 65.4 63.7 63.6982 0.656564
No difference proven at 95.0% confidence
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
The code was removed early last year.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
It often confused people because it was unclear on whether it was the
physical or logical, and people needed the other one as well. We can
recompute it trivially using the minify() macro, clarifying which value is
being used and making getting the other value obvious.
v2: Fix a pasteo in intel_blit.c's dst flip.
Reviewed-by: Chris Forbes <[email protected]> (v1)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Since only window system renderbuffers can have a singlesample_mt, this
lets us drop a bunch of sanity checking to make sure that we're just a
renderbuffer-like thing.
v2: Fix a badly-written comment (thanks Kenneth!), drop the now trivial
helper function for set_needs_downsample.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The only DRI2 vs DRI3 delta was just how to decide about frontbuffer-ness
for doing the upsample.
v2: Fix missing singlesample_mt->region->name update in the merged code,
which would have broken the DRI2 don't-recreate-the-miptree
optimization.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Pretty silly to pass in values dereferenced out of one of the arguments.
v2: Get the destination size from the dst, even though the callers are
always dealing with src size == dst size cases.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
So far it's happened to be that we're only ever calling
intel_miptree_blit() (up/downsampling) from the ReadBuffer, but I stumbled
over a null ReadBuffer case when debugging later parts of the series.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Mostly mt->target == 2D_MS just results in a few checks that we don't try
to allocate multiple LODs and don't try to do slice copies with them. But
with the introduction of binding renderbuffers to textures, we need more
consistency.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This lets us simplify our shaders, and rely on GLES-prohibited
functionality (like ARB_texture_multisample) when writing these
driver-internal functions.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Compare this VS to the one for the post-130 case. Fixes piglit
glsl-lod-bias, and presumably tons of other code (I haven't done a full
piglit run on swrast).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74911
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
On a core context, this would throw an error.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit f4ebcd133b9 ("dri/nouveau: NV17_3D class is not available for
NV1a chipset") fixed this partially by using the correct 3d class.
However there were a lot of checks left over comparing against the
chipset.
Reported-and-tested-by: John F. Godfrey <[email protected]>
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: 9.2 10.0 10.1 <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
| |
Surprisingly, the GLSL shaders already wrote the sampled r value to
FragDepth.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51600
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This avoids a CopyTexImage() on Intel i965 hardware without blorp.
v2: Move the !readAtt check up higher.
v3: Rebase on idr's changes, plus readAtt check is totally gone, and also
fix a typo in a comment.
Reviewed-by: Kenneth Graunke <[email protected]> (v2)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will let us use meta's acceleration from renderbuffers without having
to do a CopyTexImage first.
This is like what we do for TFP, but just taking an existing renderbuffer
and binding it to a texture with whatever its format was. The
implementation won't work for stencil renderbuffers, and it only does
non-texture renderbuffers (but then, if you're using a texture
renderbuffer, you can just pull the texture object/level/slice out of the
renderbuffer, anyway).
v2: Don't forget to propagate NumSamples to the teximage.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function is only handling the color case. We can just unindent as
long as we're willing to do the check for the bit outside of the
function.
v2: Rebase on idr's changes, drop readAtt check that's always non-null
anyway (it's a pointer into to the statically-allocated attachments
array in the renderbuffer).
Reviewed-by: Kenneth Graunke <[email protected]> (v1)
|
|
|
|
|
|
|
| |
v2: Drop a bunch of unnecessary includes (by Kenneth), rebase on idr's
changes.
Reviewed-by: Kenneth Graunke <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
|
|
| |
I want split some meta.c code off to a separate file, so these functions
can't be static any more.
v2: Rebase on idr's changes, also expose setup_blit_shader,
blit_shader_table_cleanup, setup_vertex_objects,
setup_ff_tnl_for_blit.
Reviewed-by: Kenneth Graunke <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
| |
I'd like to split some of our code to separate files, since 4k lines and
growing is pretty unreasonable for all these separate operations.
v2: Rebase on idr's changes.
Reviewed-by: Kenneth Graunke <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
| |
There was this funny argument passed to setup for "did alloc decide we
need to allocate new texture storage?", which goes away if we don't have
the caller do alloc as a separate step.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the GL_ARB_fbo spec:
If the source and destination buffers are identical, and the
source and destination rectangles overlap, the result of the blit
operation is undefined.
As far as I know, that's the only thing that would have been of concern
for this.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
While these structs are generated per GLSL sampler type, they're structs
of data-about-shaders (notably, the ID of a shader program), not
data-about-samplers.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
Everyone was just immediately calling it and doing nothing else with the
shader program id.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
The only thing that wants to track the glsl_sampler structure is the
shader string generator.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most of the VEC4 back-end agrees on src_reg::swizzle being one of the
BRW_SWIZZLE macros defined in brw_reg.h, except in two places where we
use Mesa's SWIZZLE macros. There is even a doxygen comment saying
that Mesa's macros are the right ones. They are incompatible swizzle
representations (3 bits vs. 2 bits per component), and the code using
Mesa's works by pure luck. Fix it.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The same effect can be achieved using ::subreg_offset. Remove the
less flexible alternative and define a convenience function to keep
the fs_reg interface sane.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|