| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
We don't need to allocate all the state related to GL_ARB_debug_output
until some aspect of that extension is actually needed.
The sizeof(gl_debug_state) is huge (~285KB on 64-bit systems), not even
counting the 54(!) hash tables and lists that it contains. This change
reduces the size of gl_context alone from 431KB bytes to 145KB bytes on
64-bit systems and from 277KB bytes to 78KB bytes on 32-bit systems.
Reviewed-by: Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
This makes it obvious which number is which.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
The WHILE instruction doesn't have UIP. It only has JIP.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The bits which normally contain the source register descriptions
actually contain the JIP/UIP jump targets, which we already printed.
Interpreting JIP/UIP as source registers results in some really creepy
looking output, like IF statements with acc14.4<0,1,0>UD sources.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Broadwell's 3DSTATE_CLEAR_PARAMS packet expects a floating point value
regardless of format. This means we need to stop converting it to
UNORM.
Storing the value as float would make sense, but since we already have a
uint32_t field, this patch continues shoehorning it into that. In a
sense, this makes mt->depth_clear_value the DWord you emit in the
packet, rather than the clear value itself.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
We've had various bug reports over the years where miptrees are missing,
and when I screwed it up while adding DRI2 to the modesetting driver, I
figured I should put the info necessary for debug here.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
The intel_miptree_blit() code checks the format for us now, plus it
handles xrgb vs argb for us.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it work on Broadwell, too.
v2: Drop bogus double write to 3DPRIM_BASE_VERTEX register
(caught by Chris Forbes).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This saves some boilerplate and hides the OUT_RELOC/OUT_RELOC64
distinction.
Placing the function in intel_batchbuffer.c is rather arbitrary; there
wasn't really an obvious place for it.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Since commit 9cee3ff562f3e4b51bfd30338fd1ba7716ac5737, INTEL_DEBUG=vs
has caused a NULL pointer dereference for fixed-function/ARB programs.
In the vec4 generators, "prog" is a gl_program, and "shader_prog" is the
gl_shader_program. This is different than the FS visitor.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Juha-Pekka Heikkila <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This regressed when I converted BRW_REGISTER_TYPE_* to be an abstract
type that doesn't match the hardware description. dump_instruction()
was using reg_encoding[] from brw_disasm.c, which no longer matches
(and was incorrect for Gen8+ anyway).
This patch introduces a new function to convert the abstract enum values
into the letter suffix we expect.
Signed-off-by: Kenneth Graunke <[email protected]>
Reported-by: Matt Turner <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
If multiple color outputs are written, this shader is unlikely to be
useful with a winsys framebuffer.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The driver is supposed to ensure buffers before any drawing operation, but in
do_blit_drawpixels() and do_blit_copypixels() we inspect the buffer format
before calling intel_prepare_render(). That was covered up by the
unconditional call to intel_prepare_render() in intelMakeCurrent(), but we
now only do this on the initial intelMakeCurrent call for a context
(to get the size for the initial viewport values).
https://bugs.freedesktop.org/show_bug.cgi?id=74083
Signed-off-by: Kristian Høgsberg <[email protected]>
Tested-by: Alexander Monakov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This will allow testing of compute shader functionality before it is
completed.
To enable ARB_compute_shader functionality in the i965 driver, set
INTEL_COMPUTE_SHADER=1.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
| |
v2: Fix comment.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is really not needed as blorp blit programs already sample
XRGB normally and get alpha channel set to 1.0 automatically by
the sampler engine. This is simply copied directly to the payload
of the render target write message and hence there is no need for
any additional blending support from the pixel processing pipeline.
The blending formula is anyway broken for color components, it
multiplies the color component with itself (blend factor is the
component itself).
Alpha blending in turn would not fix the alpha to one independent
of the source but simply used the source alpha as is instead
(1.0 * src_alpha + 0.0 * dst_alpha).
Quoting Eric:
"If we want to actually make the no-alpha-bits-present thing work,
we need to override the bits in the surface state or in the
generated code. In the normal draw path, it's done for sampling
by the swizzling code in brw_wm_surface_state.c, and the blending
overrides is just to fix up the alpha blending stage which
doesn't pay attention to that for the destination surface."
If one modifies piglit test gl-3.2-layered-rendering-blit to use
color component values other than zero or one, this change will
kick in on IVB. No regressions on IVB.
This is effectively revert of c0554141a9b831b4e614747104dcbbe0fe489b9d:
i965/blorp: Support overriding destination alpha to 1.0.
Currently, Blorp requires the source and destination formats to be
equal. However, we'd really like to be able to blit between XRGB and
ARGB formats; our BLT engine paths have supported this for a long time.
For ARGB -> XRGB, nothing needs to occur: the missing alpha is already
interpreted as 1.0. For XRGB -> ARGB, we need to smash the alpha
channel to 1.0 when writing the destination colors. This is fairly
straightforward with blending.
For now, this code is never used, as the source and destination formats
still must be equal. The next patch will relax that restriction.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This moves the intel_batchbuffer_flush before the drm_intel_bo_busy
call, which is a change in behavior. However, the old behavior was
broken.
In the future, we may want to only flush in the batchbuffer references
the BO being mapped. That's certainly more typical.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This additionally measures the time stalled, while also simplifying the
code.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mapping a buffer is a common place where we could stall the CPU.
In a few places, we've added special code to check whether a buffer is
busy and log the stall as a performance warning. Most of these give no
indication of the severity of the stall, though, since measuring the
time is a small hassle.
This patch introduces a new brw_bo_map() function which wraps
drm_intel_bo_map, but additionally measures the time stalled and reports
a performance warning. If performance debugging is not enabled, it
simply maps the buffer with negligable overhead.
We also add a similar wrapper for drm_intel_gem_bo_map_gtt().
This should make it easy to add performance warnings in lots of places.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
_mesa_update_vao_client_arrays() is less of a mouthful than
_mesa_update_array_object_client_arrays(), and generally clearer.
Generated by:
$ find . -type f -print0 | xargs -0 sed -i \
's/_mesa_\([^_]*\)_array_object/_mesa_\1_vao/g'
with manual whitespace and indentation fixes applied.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I considered replacing it with "gl_vao", but spelling it out seemed to
fit better with Mesa's traditional style. Mesa doesn't shy away from
long type names - consider gl_transform_feedback_object,
gl_fragment_program_state, gl_uniform_buffer_binding, and so on.
Completely generated by:
$ find . -type f -print0 | xargs -0 sed -i \
's/gl_array_object/gl_vertex_array_object/g'
v2: Rerun command to resolve conflicts with Ian's meta patches.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When reading through the Mesa drawing code, it's not immediately obvious
to me that "ArrayObj" (gl_array_object) is the Vertex Array Object (VAO)
state. The comment above the structure explains this, but readers still
have to remember this and translate accordingly.
Out of context, "array object" is a fairly vague. Even in context,
"array" has a lot of meanings: glDrawArrays, vertex data stored in user
arrays, gl_client_arrays, gl_vertex_attrib_arrays, and so on.
Using the term "VAO" immediately associates these fields with the OpenGL
concept, clarifying the situation and aiding programmer sanity.
Completely generated by:
$ find . -type f -print0 | xargs -0 sed -i \
-e 's/ArrayObj;/VAO;/g' \
-e 's/->ArrayObj/->VAO/g' \
-e 's/Array\.ArrayObj/Array.VAO/g' \
-e 's/Array\.DefaultArrayObj/Array.DefaultVAO/g'
v2: Rerun command to resolve conflicts with Ian's meta patches.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Silences many GCC warnings of the form:
drivers/common/meta.c: In function 'cleanup_temp_texture':
drivers/common/meta.c:1208:41: warning: unused parameter 'ctx' [-Wunused-parameter]
drivers/common/meta.c: In function 'setup_ff_blit_framebuffer':
drivers/common/meta.c:1453:46: warning: unused parameter 'ctx' [-Wunused-parameter]
drivers/common/meta.c: In function 'meta_glsl_blit_cleanup':
drivers/common/meta.c:1998:43: warning: unused parameter 'ctx' [-Wunused-parameter]
drivers/common/meta.c: In function 'meta_glsl_clear_cleanup':
drivers/common/meta.c:2287:44: warning: unused parameter 'ctx' [-Wunused-parameter]
drivers/common/meta.c: In function 'setup_ff_generate_mipmap':
drivers/common/meta.c:3365:45: warning: unused parameter 'ctx' [-Wunused-parameter]
drivers/common/meta.c: In function 'meta_glsl_generate_mipmap_cleanup':
drivers/common/meta.c:3556:54: warning: unused parameter 'ctx' [-Wunused-parameter]
There are a couple other similar warnings, but they are less trivial. I
want to investigate these further before axing them.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Array textures can't be used with fixed-function, so don't. Instead,
just drop the decompress request on the floor. This is no worse than
what was done previously because generating the GL error (in
_mesa_set_enable) broke everything anyway.
A later patch will get GL_TEXTURE_2D_ARRAY targets working.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
There is no need to use pixel coordinates, and using NDC directly will
simplify the GLSL paths.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
For these objects, meta was already using the non-Apple function to
delete the objects. Everywhere else in the file uses
_mesa_GenVertexArrays and _mesa_BindVertexArrays.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Cc: "9.1 9.2 10.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GL_TEXTURE_CUBE_MAP_ARRAY
The hardware decompression path isn't even close to being able to handle
this. This converts the crash (assertion failure) in
"EXT_texture_compression_s3tc/getteximage-targets S3TC CUBE_ARRAY" to a
plain old failure.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Cc: "9.1 9.2 10.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
_mesa_meta_DrawPixels creates a VAO and (potentially) two fragment
programs, but none of them are ever released. Leaking piles of memory
is generally frowned upon.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Cc: "9.1 9.2 10.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
decompress_texture_image creates an FBO, an RBO, a VBO, a VAO, and a
sampler object, but none of them are ever released. Later patches will
add program objects, exacerbating the problem. Leaking piles of memory
is generally frowned upon.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Cc: "9.1 9.2 10.0" <[email protected]>
|
|
|
|
|
|
| |
Not really used anywhere.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Somehow I missed this before pushing the Broadwell PS state upload code.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the final revision of my gen8_generator patch, I updated the MATH
instruction's assertion from (dst.hstride == 1) to check that source and
destination hstride matched. Unfortunately, I didn't test this enough,
and many Piglit tests fail this test.
The documentation indicates that "scalar source is also supported",
which we believe means <0,1,0> access mode (hstride == 0). If hstride
is non-zero, then it must match the destination register.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Eric believes this to be wrong and unnecessary, as the command is
supposed to emit an implicit rectangle primitive. However, empirically
the pixel pipeline is completely unreliable without it. So for now, it
stays until someone comes up with a better solution.
We'll need to do better than this when we implement multisampling, HiZ,
or fast clears...but for now, this will do.
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is quite similar to the Gen7 code. The main changes:
- 48-bit relocations
- Thread count is specified as U/2-1 instead of U-1.
- An extra DWord (DW9) with clip planes, URB entry output length/offsets
- We need to program the "Expected Vertex Count" (VerticesIn)
v2: Set the number of binding table entries so they can be prefetched
(requested by Eric Anholt).
v3: Add a WARN_ONCE for a missing workaround.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On previous platforms, 3DSTATE_MULTISAMPLE contained the number of
samples, pixel location, and the positions of each sample within a pixel
for each multisampling mode (4x and 8x). It was also a non-pipelined
command, presumably since changing the sample positions is fairly
drastic.
Broadwell improves upon this by splitting the sample positions out into
a separate non-pipelined state packet, 3DSTATE_SAMPLE_PATTERN. With
that removed, 3DSTATE_MULTISAMPLE becomes a pipelined state packet.
Broadwell also supports 2x and 16x multisampling, in addition to the 4x
and 8x supported by Gen7. This patch, however, does not implement 2x
and 16x.
Signed-off-by: Kenneth Graunke <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The amount of cut and paste from Gen7 is rather ugly, and should
probably be cleaned up in the future. Even the Gen7 code is in need of
some tidying though; many of the function parameters aren't used on
platforms that use level/layer rather than tile offsets. Tidying both
can be left to a future patch series. This at least gets things going.
v2: Rebase on Paul's rename of NumLayers -> MaxNumLayers.
v3: Shift QPitch by 2 when storing it in the packet. Bits 14:0 store
bits 16:2 of the actual value. Fixes tests.
v4: Add missing stencil buffer QPitch.
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Allow logic ops on all surface types. The UNORM restriction was
lifted with Haswell and I simply hadn't noticed. Also, add missing
BRW_NEW_STATE_BASE_ADDRESS dirty bit. Both caught by Eric Anholt.
v3: Fix swapped per-RT DWord pairs. Eliminates bizarre hacks.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
It has additional fields to support clipping to the viewport even if
guardband clipping is enabled.
v2: Update for viewport array changes.
v3: No, seriously, update for viewport array changes.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]> [v1]
|