| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
The system values handled by vec4_visitor::visit(ir_variable *) are
VS-specific (vertex ID and instance ID). This patch moves the
handling of those values into a new virtual function,
make_reg_for_system_value(), so that this VS-specific code won't be
inherited by geometry shaders.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
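The virtual-hook pattern described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the driver code: the class names, the SystemValue enum, and the string "registers" are hypothetical stand-ins, and only the name make_reg_for_system_value() comes from the commit message.

```cpp
#include <string>

// Hypothetical stand-ins for the real visitor classes; only the shape matters.
enum class SystemValue { VertexId, InstanceId };

struct Vec4VisitorSketch {
    virtual ~Vec4VisitorSketch() = default;

    // Generic code calls the hook without knowing the shader stage.
    std::string visit_system_value(SystemValue sv) {
        return make_reg_for_system_value(sv);
    }

protected:
    // Each stage-specific subclass decides where a system value lives.
    virtual std::string make_reg_for_system_value(SystemValue sv) = 0;
};

struct VsVisitorSketch : Vec4VisitorSketch {
protected:
    // VS-specific handling stays out of the base class (strings are made up).
    std::string make_reg_for_system_value(SystemValue sv) override {
        return sv == SystemValue::VertexId ? "attr:vertex_id"
                                           : "attr:instance_id";
    }
};
```

A geometry shader visitor would then supply its own override instead of inheriting the VS behaviour.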
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the following vec4_visitor functions virtual, since
they will need to be implemented differently for vertex and geometry
shaders. Some of the functions are renamed to reflect their generic
purpose, rather than their VS-specific behaviour:
- setup_attributes
- emit_attribute_fixups (renamed to emit_prolog)
- emit_vertex_program_code (renamed to emit_program_code)
- emit_urb_writes (renamed to emit_thread_end)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
| |
This patch just creates the derived class; later patches will migrate
VS-specific functions and data structures from the base class into the
derived class.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
| |
This will allow the generic parts to be re-used for geometry shaders.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
v2: Put urb_read_length and urb_entry_size in the generic struct.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
| |
This will allow the generic parts to be re-used for geometry shaders.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
| |
This will allow the generic parts to be re-used for geometry shaders.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In patches that follow, we'll be splitting structs brw_vs_prog_data
and brw_vs_compile into a vec4-generic base struct and a VS-specific
derived struct (this will allow the vec4-generic code to be re-used
for geometry shaders). Having brw_vs_compile point to
brw_vs_prog_data makes it difficult to do this cleanly.
Fortunately most of the functions that use brw_vs_compile (those in
the vec4_visitor class) already have access to brw_vs_prog_data
through a separate pointer (vec4_visitor::prog_data). So all we have
to do is use that pointer consistently, and plumb prog_data through
the few remaining functions that need access to it.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
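A rough sketch of the base/derived struct split this commit prepares for, assuming C-style embedding of the generic part as the first member. The struct names, the vs_only_flag field, and the helper are hypothetical; urb_read_length and urb_entry_size are the fields a later commit message places in the generic struct.

```cpp
// Generic part, shared by every vec4-based stage (hypothetical name).
struct vec4_prog_data_sketch {
    unsigned urb_read_length;
    unsigned urb_entry_size;
};

// VS-specific part embeds the generic part (hypothetical name and field).
struct vs_prog_data_sketch {
    vec4_prog_data_sketch base;
    bool vs_only_flag;
};

// Generic code takes only the base pointer, so it has no VS dependency.
bool reads_urb(const vec4_prog_data_sketch *prog_data) {
    return prog_data->urb_read_length > 0;
}
```

Passing prog_data explicitly, rather than reaching it through the compile struct, is what keeps helpers like reads_urb() usable for other stages.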
|
|
|
|
|
|
|
|
|
|
|
| |
This patch modifies the arguments to brw_compute_vue_map() so that
they no longer bake in the assumption that we are generating a VUE map
for vertex shader outputs. It also makes the function non-static so
that we can re-use it for geometry shader outputs.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The vec4_visitor functions don't use any VS specific data from
vec4_visitor::vp. So rename it to "prog" and change its type from
struct gl_vertex_program * to struct gl_program *. This will allow
the code to be re-used for geometry shaders.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
v2: Use the name "prog" rather than "p".
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The next patch is going to change the type of vec4_visitor::vp from
struct gl_vertex_program * to struct gl_program *, and rename it. The
sensible name to change it to is vec4_visitor::prog. However, prog is
already used in backend_visitor (which vec4_visitor derives from).
Since backend_visitor::prog is of type struct gl_shader_program *, it
makes sense to rename it to shader_prog.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The comment above glsl_type::name claimed that it could sometimes be
NULL. This was wrong--it is never NULL. Many error handling paths
would segfault if it were. (Anonymous structs are assigned names like
"#anon_struct_0001"--see the ast_struct_specifier constructor in
glsl_parser_extras.cpp.)
Fix the comment and add assertions to validate that it really is never
NULL.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
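The invariant can be illustrated with a small sketch. The type and function names here are hypothetical; only the "#anon_struct_0001" naming pattern is taken from the commit message, and the exact format string is an assumption inferred from that example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a type record whose name must never be NULL.
struct type_sketch {
    const char *name;
};

// Validate the invariant at construction time instead of on error paths.
type_sketch make_type_sketch(const char *name) {
    assert(name != nullptr);
    return type_sketch{name};
}

// Anonymous structs still get a name; the format is assumed from the example.
std::string anon_struct_name(unsigned counter) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "#anon_struct_%04u", counter);
    return buf;
}
```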
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Just everything you need for UVD with r600g and radeonsi.
v2: move UVD code to radeon subdir, clean up build system additions,
remove an unused SI function, disable tiling on SI for now.
v3: some minor indentation fix and rebased
v4: dpb size calculation fixed
v5: implement proper fall-back in case the kernel doesn't support UVD,
based on patches from Andreas Boll but cleaned up a bit more.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Separated from UVD patch for clarity.
v2: sync with next tree for 3.10
v3: as pointed out by Andreas Boll, check for drm minor >= 32
http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-3.10-wip
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
|
|
|
|
|
|
|
|
| |
Reported and tested by degasus on #radeon.
Note: This is a candidate for the 9.1 branch
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The EGLConfig attributes EGL_MIN/MAX_SWAP_INTERVAL were incorrectly set to
0 and 0. This prevented clients from setting the swap interval to a
reasonable value, like 1 or 2.
Swap interval worked correctly in Mesa 9.0. The commit below introduced
the bug.
commit 7e9bd2b2ed35a440a96362417100a7e43715d606
Author: Eric Anholt <eric@anholt.net>
Date: Tue Sep 25 14:05:30 2012 -0700
egl: Add support for driconf control of swapinterval.
Note: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63078
[chadv: Wrote commit message]
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
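To see why a 0/0 range blocks every useful setting, recall that EGL clamps the requested swap interval to the config's minimum and maximum. A minimal sketch of that clamping (the function name is hypothetical, not a Mesa or EGL entry point):

```cpp
// Clamp a requested swap interval into the range advertised by the config.
int clamp_swap_interval(int requested, int min_interval, int max_interval) {
    if (requested < min_interval)
        return min_interval;
    if (requested > max_interval)
        return max_interval;
    return requested;
}
```

With min and max both 0, a request for 1 or 2 collapses to 0, which matches the symptom described above.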
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a region is larger than the estimated aperture size, we map/unmap it
by copying with the BLT engine, which means we can't use Y-tiling.
Fixes Piglit max-texture-size and tex3d-maxsize, which regressed in my
recent change to use Y-tiling by default on Gen6+. This was due to a
botched merge conflict resolution.
v2: Return a mask of valid tilings from intel_miptree_select_tiling.
This allows us to avoid the X-tiling fallback if Y-tiling is actually
mandatory.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
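The v2 idea of returning a mask of valid tilings can be sketched as below. Everything here is a simplified assumption: the bit names, both function names, and the size comparison are hypothetical, and the real selection logic considers more inputs.

```cpp
#include <cstddef>

// Hypothetical tiling bits; the driver uses its own definitions.
enum : unsigned {
    TILING_X_BIT = 1u << 0,
    TILING_Y_BIT = 1u << 1,
};

// Return every tiling that would be valid, not just the preferred one.
unsigned select_tiling_mask(bool y_mandatory) {
    return y_mandatory ? TILING_Y_BIT : (TILING_Y_BIT | TILING_X_BIT);
}

// Fall back to X-tiling for oversized regions only if the mask allows it.
unsigned choose_tiling(unsigned valid_mask, std::size_t size,
                       std::size_t aperture_estimate) {
    bool can_x = (valid_mask & TILING_X_BIT) != 0;
    bool can_y = (valid_mask & TILING_Y_BIT) != 0;
    if (can_y && can_x && size >= aperture_estimate)
        return TILING_X_BIT;   // BLT-based map/unmap cannot handle Y-tiling
    return can_y ? TILING_Y_BIT : TILING_X_BIT;
}
```

Because the caller sees the whole mask, it can tell "Y preferred" apart from "Y mandatory" and skip the fallback in the second case.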
|
|
|
|
|
|
|
|
|
| |
This reduces the nesting level slightly, and in my opinion, makes it a
bit easier to follow.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
| |
We need to know this in order to decide what tiling mode to use.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
| |
Reviewed-by: Marek Olšák <maraeo@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
| |
gen7_blorp_emit_depth_stencil_config() is only called when
params->depth.mt is non-null. Therefore, it's not necessary to do an
"if (params->depth.mt)" test inside it. The presence of this if test
was misleading static analysis tools (and briefly, me) into thinking
that gen7_blorp_emit_depth_stencil_config() might sometimes access
uninitialized data and dereference a null pointer.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
| |
Warning: "Conditional jump or move depends on uninitialised value(s)".
|
|
|
|
|
|
|
|
|
|
|
| |
Both mov and ucmp can be used to move variables of any type.
Correctly note that about ucmp in the tgsi_info and make
sure gallivm can handle that by correctly casting the untyped
moves.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were using simple temporaries, without using alloca or phi
nodes, which meant that on every iteration of the loop our
temporaries, which were holding the number of vertices and
primitives which were emitted, were being reset to zero. Now
we're using alloca to allocate those variables to preserve
them across conditionals.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
|
|
|
|
|
|
|
|
|
|
| |
We were missing the implementation of the PIPE_QUERY_SO_STATISTICS
query; this change implements it on top of the existing
facilities.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
|
|
|
|
|
|
|
|
|
|
| |
We want to make sure both that we never divide by zero, so that we don't
generate SIGFPE, and that divide by zero is guaranteed to return 0xffffffff.
Based on José's idea.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
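One branch-free way to get both properties for unsigned division is to fold a "divisor is zero" mask into the divisor and the result. This scalar sketch shows the idea only; the function name is hypothetical, and the commit's actual implementation emits vector IR rather than C++.

```cpp
#include <cstdint>

// Unsigned divide that never traps and yields 0xffffffff for a zero divisor.
uint32_t udiv_no_trap(uint32_t a, uint32_t b) {
    // All ones when b == 0, all zeros otherwise.
    uint32_t zero_mask = (b == 0) ? 0xffffffffu : 0u;
    // A zero divisor becomes 0xffffffff, so the division itself is safe.
    uint32_t safe_b = b | zero_mask;
    // OR-ing the mask back in forces the result to 0xffffffff when b was 0.
    return (a / safe_b) | zero_mask;
}
```

For a nonzero divisor the mask is zero and both OR operations are no-ops, so the ordinary quotient passes through unchanged.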
|
|
|
|
|
|
|
|
|
| |
We break when the mask values are 0, not 1; plus, it's a bit comparison,
not a floating-point comparison. This fixes both.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enable hiz by setting intel_context::has_hiz. However, to work around
a hardware bug, we selectively enable hiz for only nicely aligned miptree
slices.
No Piglit regressions on Haswell 0x0d26 rev07 when based atop
mesa-master-4ad3601.
Improves the performance of GLB27_TRex_C24Z16_FixedTimeStep by 18.52%
(hsw-0x0d26-rev07; kernel-3.9.0-rc1; GLBenchmark 2.7.0 Release a68901;
samples=3).
v2: Replace the check for IS_HASWELL(devid) in intel_miptree_slice_has_hiz()
with a conditional set of has_hiz. [for anholt]
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
|
|
|
|
|
|
|
|
| |
After recent refactorings, the field is written but no longer read.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When appropriate, replace each check `hiz_mt != NULL` with either a call
to intel_miptree_slice_has_hiz() or intel_renderbuffer_has_hiz(). No
behavioral change.
This prepares for selectively enabling hiz on individual miptree slices
for Haswell.
This refactoring had several side effects.
1. To prevent new warnings about discarding the const qualifier,
I removed 'const' from some variable declarations in
intel_validate_framebuffer(). The alternative was to add const
qualifiers to multiple function signatures in the
intel_renderbuffer_has_hiz call graph. Since the dominant convention
in the Intel code is to not qualify function parameters as const,
I chose to remove rather than add const qualifiers.
2. I changed the signature of brw_emit_depth_stencil_hiz() by replacing
`struct intel_mipmap_tree *hiz_mt` with `bool hiz`. The function used
hiz_mt mostly as a boolean indicator of the presence of hiz, so the
signature change is consistent with the patch's goal.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
| |
Add new parameters `depth_level` and `depth_layer`, which specify the depth
miptree's slice of interest. A following patch will pass the new
parameters through to intel_miptree_slice_has_hiz().
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
|
|
|
|
|
|
|
|
|
| |
The new fields define the 2D miptree slice to be used. A following patch
will pass the new fields through to intel_miptree_slice_has_hiz().
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Haswell, HiZ will selectively be enabled on individual miptree slices
to work around a hardware bug. The new field 'has_hiz' indicates if HiZ is
enabled for a given slice.
Also add two new accessor functions for this field.
intel_miptree_slice_has_hiz
intel_renderbuffer_has_hiz
The new field and accessor functions are not yet used. Also, this patch
introduces no behavioral change because, in this patch,
intel_miptree_alloc_hiz() sets has_hiz for all slices.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
| |
The hardware docs and the simulator require that the rectangle primitive
emitted during fast depth clears and hiz resolves must be aligned to 8x4
pixels.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
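The 8x4 alignment requirement amounts to rounding the rectangle's width up to a multiple of 8 and its height up to a multiple of 4. A minimal sketch, with hypothetical names (the driver has its own alignment macro and rectangle parameters):

```cpp
#include <cstdint>

// Round value up to the next multiple of alignment (a power of two).
uint32_t align_up(uint32_t value, uint32_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

struct rect_size_sketch {
    uint32_t width;
    uint32_t height;
};

// Pad a rectangle so it covers whole 8x4 pixel blocks.
rect_size_sketch align_rect_8x4(uint32_t width, uint32_t height) {
    return rect_size_sketch{align_up(width, 8), align_up(height, 4)};
}
```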
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows the computation of the offset to get written directly into the
message source.
shader-db results:
total instructions in shared programs: 3308390 -> 3283025 (-0.77%)
instructions in affected programs: 442998 -> 417633 (-5.73%)
No difference in GLB2.7 low res (n=9).
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have several places in our pull constant handling where we make a
temporary src_reg for an int, and then turn it into a dst. In doing so,
we were writing to the dst.xyzw, so we never register coalesced it with a
later mov from dst.x to real_dst.x.
These extra channels written would be removed if we had channel-wise DCE
in the backend, but we don't. Fix it for now by just not writing these
extra channels that won't get used.
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The software-tracked transform feedback offsets (svbi_0_starting_index)
are incorrect in the presence of primitive restart, so we were actually
updating it with a bogus value if the batch wrapped and we emitted the
packet again during a single transform feedback. By reducing state
emission, we avoid the bug.
Fixes piglit OpenGL 3.1/primitive-restart-xfb flush
Reviewed-by: Paul Berry <stereotype441@gmail.com>
NOTE: This is a candidate for the 9.1 branch.
|
|
|
|
|
|
|
|
|
|
|
|
| |
The software-tracked transform feedback offsets (svbi_0_starting_index)
are incorrect in the presence of primitive restart, so we can't reliably
compute offsets for our buffer pointers after a batch flush. Thanks to HW
contexts, our transform feedback offsets are now saved, so we can just
keep using the ones from before the batch wrap.
Fixes piglit OpenGL 3.1/primitive-restart-xfb flush
Reviewed-by: Paul Berry <stereotype441@gmail.com>
NOTE: This is a candidate for the 9.1 branch.
|
|
|
|
|
|
|
|
| |
v2: fix intrinsic name as well
v3: LLVM revision incremented as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
|
|
|
|
| |
Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
|
|
|
|
| |
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
|
|
| |
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
|
|
| |
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
|
|
|
|
| |
None of these were needed.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
|
|
|
|
|
| |
Rather than creating a new buffer each time. Fixes problems found
with vtk.
Tested-by: Kevin H. Hobbs <hobbsk@ohio.edu>
|
|
|
|
|
|
| |
This makes sure that ctx->DrawBuffer->Visual.samples is up-to-date.
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"ctx->DrawBuffer->Visual" might be invalid if (NewState &_NEW_BUFFERS) != 0.
v2: also fix:
- RGBA_INTEGER_MODE_EXT
- RGBA_FLOAT_MODE_ARB (also check API support)
- FRAMEBUFFER_SRGB_CAPABLE_EXT
NOTE: This is a candidate for stable branches.
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gen7.5 (Haswell) hardware supports primitive restart for all primitive
types. It also handles all possible primitive restart indices.
Rather than specialize both can_cut_index_handle_restart_index() and
the switch statement in can_cut_index_handle_prims() for Haswell, just
return early if the hardware is Haswell because we know it can handle
everything.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
brw_draw.c contains a trim() function which modifies the vertex count
for quads and quad strips in order to discard dangling vertices. In
principle this shouldn't be necessary, since hardware since Gen4 is
capable of discarding dangling vertices by itself. However, it's
necessary because as a hack to speed up rendering on Gen 4-5, we
sometimes convert quads to trifans and quad strips to tristrips. The
trim() function isn't necessary on Gen6 and up.
This patch documents why and when the trim() function is necessary,
and avoids calling it when it's not needed.
This will avoid creating problems when we enable hardware support for
primitive restart of quads and quad strips on Haswell.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
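The trimming behaviour described above can be sketched like this. The enum and function names are hypothetical, and the exact conditions are an assumption based on the description: quads need a multiple of 4 vertices, quad strips need an even count of at least 4, and trimming is skipped on Gen6 and up.

```cpp
// Hypothetical primitive enum; the driver uses GL primitive constants.
enum class Prim { Quads, QuadStrip, Other };

// Drop dangling vertices that cannot form a complete quad.
unsigned trim_sketch(Prim prim, unsigned length) {
    switch (prim) {
    case Prim::QuadStrip:
        return length > 3 ? length - length % 2 : 0;
    case Prim::Quads:
        return length - length % 4;
    default:
        return length;
    }
}

// Only Gen 4-5 convert quads to trifans/tristrips, so only they need trimming.
unsigned vertex_count_for_draw(int gen, Prim prim, unsigned length) {
    return gen < 6 ? trim_sketch(prim, length) : length;
}
```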
|
|
|
|
|
|
|
|
|
|
|
| |
The call to emit_shader_time_end() before the second URB write was
conditioned with "if (eot)", but eot is always false in this code
path, so emit_shader_time_end() was never being called for vertex
shaders that performed 2 URB writes.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
| |
Drawing subtitles didn't increase the dirty area of the surface.
Reported and tested by freeedrich on irc.
v2: don't clear the surface
Signed-off-by: Christian König <christian.koenig@amd.com>
|