| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
v2: Add name strings in nv50_ir_print.cpp (Ilia Mirkin)
Signed-off-by: Boyan Ding <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implementation of readFirstInvocationARB() on nvidia hardware needs a
ballotARB(true) used to decide the first active thread. This expressed
in gm107 asm as (supposing output is $r0):
vote any $r0 0x1 0x1
To model the always true input, which corresponds to the second 0x1
above, we make OP_VOTE accept immediate value 0/1 and emit "0x1" and
"not 0x1" in the src field respectively.
v2: Make sure that asImm() is not NULL (Samuel Pitoiset)
v3: (Ilia Mirkin)
Make the handling more symmetric with predicate version in gm107
Use i->getSrc(s)
Signed-off-by: Boyan Ding <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: Make sure that asImm() is not NULL (Samuel Pitoiset)
v3: Check the range of immediate in OP_SHFL (Ilia Mirkin)
Signed-off-by: Boyan Ding <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: (Samuel Pitoiset)
Add an assertion to check if the target is Kepler
Make sure that asImm() is not NULL
v3: (Ilia Mirkin)
Check the range of immediate value of OP_SHFL
Use the new setPDSTL API
Signed-off-by: Boyan Ding <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GF100's ISA encoding has a weird form of predicate destination where its
3 bits are split across whole the instruction. Use a dedicated setPDSTL
function instead of original defId which is incorrect in this case.
v2: (Ilia Mirkin)
Change API of setPDSTL() to handle cases of no output
Fix setting of the highest bit in setPDSTL()
Cc: [email protected]
Signed-off-by: Boyan Ding <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Emit the original hard-coded 0x1c03 when OP_SHFL is used in gm107's
lowering (Samuel Pitoiset)
Signed-off-by: Boyan Ding <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
clang::LangAS::Offset is gone, the behaviour is as if it was 0.
v2: Introduce and use clover::llvm::compat::lang_as_offset (Francisco
Jerez)
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
| |
A few functions related to FBOs/renderbuffers should only be used with
window-system buffers, not user-created FBOs. Assert for that.
Add additional comments. No piglit regressions.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We only do on-demand renderbuffer allocation for window-system FBOs,
not user-created FBOs. So put the loop inside a conditional.
Plus, add some comments. No piglit regressions.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Might help reduce cpu for some apps that use sso.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
CXX state_tracker/st_glsl_to_nir.lo
state_tracker/st_glsl_to_nir.cpp:250:57: warning: suggest braces around initialization of subobject [-Wmissing-braces]
nir_lower_wpos_ytransform_options wpos_options = {0};
^
{}
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the Vulkan spec, VkPipelineInputAssemblyStateCreateInfo's
primitiveRestartEnable flag should only apply to indexed draws, however
it was being enabled regardless of the type of draw. This could cause
problems for non-indexed draws with >=65535 vertices if the previous
indexed draw used 16-bit indices.
Fixes corruption of the credits text in Mad Max.
v2: Reset primitive restart state after executing a secondary command
buffer.
Signed-off-by: Alex Smith <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
| |
|
|
|
|
|
|
|
|
| |
This reverts commit 61e47d92c5196bf0240e322bb1b9d305836559e3.
It causes a hang on RS780.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100663
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
Since the shader code can include them.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Makes more sense when we hash the layout for the pipeline cache.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
The outer result was referred to, which meant bugs.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Fixes: 8475a14302e ("radv: Implement pipeline statistics queries.")
Reviewed-by: Fredrik Höglund <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Fixes: 8475a14302e ("radv: Implement pipeline statistics queries.")
Reviewed-by: Fredrik Höglund <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Context: _mesa_add_parameter is sometimes[0] called with a
NULL name as a mean of an unnamed parameter.
Allowing NULL pointer as a name means that it must be NULL checked
each access. So far it isn't always[1] true.
Parameter name is only used for debug purpose (printf) and
to lookup the index/location of the program by the application.
Conclusion, there is no valid reason to use a NULL pointer instead of
an empty string. So it was decided to use an empty string which avoid all
issues related to NULL pointer
[0]: texture gather offsets glsl opcode and st_init_atifs_prog
[1]: at least shader cache, st_nir_lookup_parameter_index and some printfs
Issue found by piglit 'texturegatheroffsets' tests on Nouveau
v4: new patch based on Nicolai/Timothy/ilia discussion
Signed-off-by: Gregory Hainaut <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
These one bit values are booleans.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
unsigned long is a terrible type for a bitfield - if you need fewer
than 32 bits, it wastes 4 bytes. If you need more, things break on
32-bit builds. Just use unsigned.
Even that's a bit ridiculous as we only have one flag today.
Still, it's at least somewhat better.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The drm_i915_gem_create ioctl structure uses a __u64 for the size,
so we should probably use uint64_t to match. In theory, we could
probably have a BO larger than 4GB, using a 48-bit PPGTT - it just
wouldn't be mappable in the CPU's 32-bit address space.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Theoretically, with a 48-bit address space, we could have buffers
with an alignment of >= 4GB. It's a bit silly, but the exec_object
structs (drm_i915_gem_exec_object2) use a __u64 for this, so we may
as well use the same type as the kernel API.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
| |
struct drm_i915_gem_set_tiling's stride field is a __u32.
intel_mipmap_tree::stride is a uint32_t. Using unsigned long just
doesn't make sense. Switching also lets us drop many pointless
locals that only existed to deal with the type mismatch.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
| |
The ioctl structs contain __u64 offset and size fields, so make them
uint64_t rather than unsigned long.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For some reason we passed tiling by pointer, through several layers,
even though the functions only read the initial value, and never
actually change it. We even had a do-while loop that executed until
the tiling mode matched - except it always did, so it only ran once.
We then had bogus error handling in case it changed the tiling mode
to something nonsensical...which it never did.
Drop all this nonsense.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
These calls look like leftover from fallback texture support first
being added to the st in 8f6d9e12be0be and then later being added
to core mesa in 00e203fe17cbf21.
The piglit test fp-incomplete-tex continues to work with this
change.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
The key is just an unsigned int so there is never any real hashing
done.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Fix PA NextPrim for SIMD8 on SIMD16.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Implement widened clipper for SIMD16.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Quick patch to remove some unused template params to cut down
rasterizer compile time.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cost heuristic.
The individual branches of an if/else/endif construct will be executed
some unknown number of times between 0 and 1 relative to the parent
block. Use some factor in between as weight while approximating the
cost of spill/fill instructions within a conditional if-else branch.
This favors spilling registers used within conditional branches which
are likely to be executed less frequently than registers used at the
top level.
Improves the framerate of the SynMark2 OglCSDof benchmark by ~1.9x on
my SKL GT4e. Should have a comparable effect on other platforms. No
significant regressions.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We check these bitfields when computing the Haswell max GL version.
We need to set them ahead of time, or they won't exist, and all our
checks will fail. That sets the max core profile GL version to 4.2.
This introduces the bizarre situation where asking for a GL context
with version 4.3+ fails, but asking for a GL core profile context
with version <= 4.2 actually promotes you a 4.5 context.
GLX_MESA_query_renderer also reported the bogus 4.2 value.
Now it shows 4.5.
Cc: "17.0" <[email protected]>
Reported-and-tested-by: Rafael Ristovski <[email protected]>
|
|
|
|
|
|
| |
This is already invoked in the following VG_NOACCESS_READ() call.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
Autodisable seems to cause missed rendering in some cases, but
otherwise TS seems to work properly.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Wladimir J. van der Laan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes a performance issue with imported winsys buffers as those are
marked with binding sampler view.
This might require a TS flush on single pipe chips that directly
sample from the rendered buffer, but otherwise seems to work fine.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Wladimir J. van der Laan <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The TS surface gets cleared by a tiled RS fill. If the chip has
more than 1 pixel pipe the size of the TS surface needs to be
aligned so that each pipe address matches a tile start, otherwise
the RS will hang.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Wladimir J. van der Laan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The TS is only valid after it has been initialized by a fast
clear, so it should not be taken into account when blitting
resources that haven't been cleared. Also the blit itself
invalidates the destination TS, as it's not updated and will
retain data from the previous rendering after the blit.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Wladimir J. van der Laan <[email protected]>
|