| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extends the brw_oa_hsw.xml to expose these additional queries:
- Compute Metrics Basic Gen7.5
- Compute Metrics Extended Gen7.5
- Memory Reads Distribution Gen7.5
- Memory Writes Distribution Gen7.5
- Metric set Sampler Balance
Signed-off-by: Robert Bragg <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for exposing basic Observation Architecture
performance counters on Haswell.
This support is based on the i915 perf kernel interface which is used
to configure the OA unit, allowing Mesa to emit MI_REPORT_PERF_COUNT
commands around queries to collect counter snapshots.
To take into account the small chance that some of the 32bit counters
could wrap around for long queries (~50 milliseconds for a GT3 Haswell @
1.1GHz) the implementation also collects periodic metrics.
Signed-off-by: Robert Bragg <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Avoiding lots of error prone boilerplate and easing our ability to add +
maintain support for multiple OA performance counter queries for each
generation:
This adds a python script to generate code for building up
performance_queries from the metric sets and counters described in
brw_oa_hsw.xml as well as functions to normalize each counter based on
the RPN expressions given.
Although the XML file currently only includes a single metric set, the
code generated assumes there could be many sets.
The metrics as described in XML get translated into C structures
which are registered in a brw->perfquery.oa_metrics_table hash table
keyed by the GUID of the metric set in XML.
v2: numerous python style improvements (Dylan)
v3: Makefile.am fixups (Emil)
v4: Pattern rule for codegen + orthogonal .c and .h rules (Robert)
Signed-off-by: Robert Bragg <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In preparation for generating code from brw_oa_hsw.xml for describing OA
performance counter queries this adds some OA specific members to
brw_perf_query that our generated code will initialize:
- The oa_metric_set_id is the ID we will pass to
DRM_IOCTL_I915_PERF_OPEN, and is an ID got via sysfs under:
/sys/class/drm/<card>/metrics/<guid/id
- The oa_format is the OA report layout we will request from the kernel
- The accumulator offsets determine where the different groups of A, B
and C counters are located within an intermediate 64bit 'accumulator'
buffer.
Additionally brw_perf_query_counter now has 64bit or float _read()
callback members for OA counters.
Signed-off-by: Robert Bragg <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In preparation for generating code from the XML performance counter meta
data, this makes some additions to brw_context.h for this code to be
able to reference.
It adds a brw->perfquery.oa_metrics_table hash table for indexing built
up query descriptions by the GUID that is expected to be advertised by
the kernel (via sysfs) to be able to use that query.
It adds an 'OA_COUNTERS' brw_query_kind to be assigned to queries built
up by generated code.
It adds a brw->perfquery.sys_vars structure to have a consistent place
to represent the different system variables like $EuCoresTotalCount and
$EuSlicesTotalCount that are referenced by OA counter normalization
equations.
Although extending + referencing gen_device_info for these variables
was considered, these are some of the (mostly minor) reasons for
going with a dedicated structure:
- Currently we only need this info for the performance_query backend
and it might be a bit tedious to go back and initialize the state
for pre-Haswell devinfo structures.
- Considering the $SubsliceMask then the requirement for how multiple
per-slice masks are packed only comes from how the variables are
references by availability tests in XML, and might not be a good
general representation for tracking subslice masks if another use
case arises.
- If we used gen_device_info then we'd likely want to avoid making
assumptions about the C types during codegen and adding explicit
casts, while that's not necessary with a dedicated struct with all
members being uint64_t.
- This structure and the code for initializing it is currently shared
(just through copy & paste) with a few other projects dealing with
OA counters, and that's been convenient so far.
Signed-off-by: Robert Bragg <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In preparation for exposing Gen Observation Architecture performance
counters via INTEL_performance_query this adds an XML description for an
initial 'Render Metrics Basic Gen7.5' query and corresponding counters.
The intention is to auto generate code for building a query from these
counters as well as the code for normalizing the individual counters.
Note that the upstream for this XML data is currently GPU Top:
https://github.com/rib/gputop
The files are maintained under gputop-data/ and they are themselves
derived from files in an internal 'MDAPI XML' schema. There are scripts
under gputop-scripts/ and make rules in gputop-data/Makefile.xml for
maintaining these files.
Signed-off-by: Robert Bragg <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
There is no need to check sampler == 0 twice. This removes now
unused _mesa_lookup_samplerobj_locked().
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This was a hook I came up when trying to do the initial performance
counter work years ago. Nothing's used it for a long time, and the
upcoming performance counter support doesn't want it either.
So, goodbye render ring prelude.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With mesa/drm commit cd2f91e18db087edf93fed828e568ee53b887860
Author: Kristian Høgsberg Kristensen <[email protected]>
Date: Fri Jul 31 10:47:50 2015 -0700
intel: Drop aub dumping functionality
the drm_intel_aub routines are mere stubs and do nothing. Likewise
remove our invocations.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
We will have already loaded the uniforms when the parameter list
was restored from cache.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
This returns a pointer, not a boolean. No actual effect, but cleaner.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
| |
screen->devinfo.gen is annoying to type and linewrap.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We never actually used the resource streamer in any shipping build
of Mesa. We have no plans to do so in the future. We looked into
using it in Vulkan, and concluded that it was unusable. We're not
the only ones to arrive at the conclusion that it's not worth using.
So, drop the last vestiges of resource streamer support and move on.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
| |
We've updated our libdrm requirement, and it will already provide these.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Oops.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100088
Fixes: 5ae54c0cf7 ("getteximage: avoid to lookup textures with id 0")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
_mesa_lookup_samplerobj() returns NULL if sampler is 0.
v2: use _mesa_lookup...(...) != NULL
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This fixes the following assertion when the key is 0.
main/hash.c:181: _mesa_HashLookup_unlocked: Assertion `key' failed.
Fixes: 633c959fae ("getteximage: Return correct error value when texure object is not found")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The OpenGL 4.5 specification's description of TexBuffer says:
"The number of texels in the texture image is then clamped to an
implementation-dependent limit, the value of MAX_TEXTURE_BUFFER_SIZE."
We set GL_MAX_TEXTURE_BUFFER_SIZE to 2^27. For buffers with a byte
element size, this is the maximum possible size we can encode in
SURFACE_STATE. If you bind a buffer object larger than this as a
texture buffer object, we'll exceed that limit and hit an isl assert:
assert(num_elements <= (1ull << 27));
To fix this, clamp the size in bytes to MaxTextureSize / texel_size.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Equivalent *TexSubImage* methods generates INVALID_ENUM.
From OpenGL 4.5 spec, section 8.6 Alternate Texture Image
Specification Commands:
"An INVALID_ENUM error is generated by *TexSubImage* if target does
not match the command, as shown in table 8.15."
And:
"An INVALID_OPERATION error is generated by *TextureSubImage* if
the effective target of texture does not match the command, as
shown in table 8.15."
Fixes:
GL45-CTS.direct_state_access.textures_copy_errors
v2: slightly change commit summary (Samuel)
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Fixes: 7ac47b1af767 ("i965: Add a header for brw_vec4_vs_visitor")
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The is_color_attachement variable is later read when handling two
separate error cases, where only one of the cases results in the
variable being initialized.
This can be avoided by giving the variable a safe default value.
Coverity-Id: 1398631
Cc: [email protected]
Signed-off-by: Robert Foss <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In validate_DrawElements_common() we need to check for OES_geometry_shader
extension to determine if we should fail if transform feedback is
unpaused. However current code reads ctx->Extensions.OES_geometry_shader
directly, which does not take context version into account. This means
that if the context is GLES 3.0, which makes the OES_geometry_shader
inapplicable, we would not validate the draw properly. To fix it, let's
replace the check with a call to _mesa_has_OES_geometry_shader().
Fixes following dEQP tests on i965 with a GLES 3.0 context:
dEQP-GLES3.functional.negative_api.vertex_array#draw_elements
dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_incomplete_primitive
dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_instanced
dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_instanced_incomplete_primitive
dEQP-GLES3.functional.negative_api.vertex_array#draw_range_elements
dEQP-GLES3.functional.negative_api.vertex_array#draw_range_elements_incomplete_primitive
Signed-off-by: Tomasz Figa <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
One less set of enums. Dropped the #defines from brw_defines.h and ran:
$ for file in *.cpp *.c *.h; do sed -i \
-e 's/BRW_SURFACEFORMAT_/ISL_FORMAT_/g' \
-e 's/ISL_FORMAT_ASTC_[A-Zxs0-9_]*/\U&/g' $file; \
done
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we don't have pipelined register access (e.g. Haswell before kernel
v4.2), then we can only implement EXT_transform_feedback by reseting the
SO offsets *between* batches. However, if we do have pipelined access to
the SO registers on gen7, we can simply emit an inline reset of the SO
registers without a full batch flush.
v2 [by Ken]: Simplify after recent kernel feature detection changes.
Signed-off-by: Chris Wilson <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This is shared between the Vulkan and GL drivers as it's a requirement
of the back-end compiler. However, it doesn't really belong in the
compiler. We rename the file to match the prefix of the other stuff in
common and because libdrm defines an intel_debug.h and this avoids a
pile of possible name conflicts.
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
This hasn't been used for quite some time now but we never bothered to
get rid of it when we dropped GLSL IR support for vec4.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
brw_vs.h is not a compiler file but brw_vec4_visitor is definitely a
compiler thing.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
It's all GL-specific and brw_program.h is not part of i965_compiler.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
It's the only thing that's using it.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
They're GL-specific.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
One of these days, I'd like to see this function go away all together
but for now, let's at least put it near the struct it updates.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It does sort-of go with MAX_UBO and friends but MAX_DRAW_BUFFERS is an
actual hardware constant based on the number of things we can blend
rather than an arbitrary "number of things allowed in GL" like some of
the other maximums are.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
It's a mesa define that's trivial to inline. This removes a dependence
on main/imports.h.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Its one and only caller is brw_compile_fs which lives there.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
While we're at it, we also change the GEN6 binding macro to be a start
index that gets added to the binding. This makes things a bit more
explicit.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
It's only users are in brw_vec4_gs_visitor and gen6_vec4_gs_visitor.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It's currently in brw_util.c but that's the only bit of brw_util.c
that's shared between the compiler and the rest of the GL driver.
It's just a fairly obvious table so the duplication isn't bad. It's
certainly less pain than trying to figure out how to share the code.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Vulkan doesn't respect MAX_SURFACES so this assert isn't valid in that
case. It should, however, assert that it isn't insanely large.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
This is a relic of when we wired up meta to be able to use RECTLIST
primitives. It's no longer needed.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This isn't used by Vulkan and is specific to the way the GL driver
works. There's no reason to have it in common compiler code. Also, it
relies on BRW_MAX_* defines which are defined in brw_context.h
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
| |
These go in wm_prog_key so they're part of the compiler interface.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|