| Commit message (Collapse) | Author | Age | Files | Lines |
| |
We need to know the local/input/private sizes and others. This is not
complete. We need many others for CURBE setup.
Signed-off-by: Chia-I Wu <[email protected]>
| |
Based on beignet, hardware capabilities, and OpenCL requirements.
Signed-off-by: Chia-I Wu <[email protected]>
| |
They will be used to report compute params or program compute states.
thread_count can also be used for 3DSTATE_VS.
Signed-off-by: Chia-I Wu <[email protected]>
| |
drm_intel_gem_bo_wait() with negative timeout is broken on kernel 3.17.
Signed-off-by: Chia-I Wu <[email protected]>
| |
Allows the driver to advertise DMA-BUF and throttling.
| |
Should trigger CL_INVALID_VALUE if device_list is NULL and num_devices
is greater than zero.
Introduced by e5468dfa523be2a7a0d04bb9efcf8ae780957563
Reported by: EdB
Reviewed-by: Francisco Jerez <[email protected]>
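The check described above can be sketched as follows. This is only an illustration, not the actual clover code: the helper name and the `SKETCH_` constants are hypothetical stand-ins (the numeric values mirror the OpenCL headers), and the mirrored case of a non-NULL list with a zero count is taken from the OpenCL specification rather than from this commit.

```c
#include <stddef.h>

/* Stand-ins for the OpenCL error codes, so the sketch builds without
 * the CL headers. */
enum { SKETCH_CL_SUCCESS = 0, SKETCH_CL_INVALID_VALUE = -30 };

/* Hypothetical helper: validates a (device_list, num_devices) pair. */
static int
sketch_validate_device_list(const void *device_list, unsigned num_devices)
{
   /* NULL list with a non-zero count: the case this fix covers. */
   if (!device_list && num_devices > 0)
      return SKETCH_CL_INVALID_VALUE;

   /* Mirrored case from the spec: a list was given but the count is zero. */
   if (device_list && num_devices == 0)
      return SKETCH_CL_INVALID_VALUE;

   return SKETCH_CL_SUCCESS;
}
```

A NULL list together with a zero count is accepted here, which in the API means "all devices associated with the program".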
| |
Between release 3.2 and 3.3 LLVM stopped aligning the stack properly
under certain conditions (no allocas, but a large number of vectors
causing spills to the stack, and frame pointer omission enabled).
We were already disabling frame-pointer-omission on several build types,
but we now disable it on all build types.
It's not clear whether this affects only 32-bit x86 processes, or if it
can also affect 64-bit x86_64 processes when AVX registers are
available and used. So disable frame-pointer-omission on both
x86/x86_64 to be on the safe side.
See also:
- http://llvm.org/PR21435
Reviewed-by: Roland Scheidegger <[email protected]>
| |
To help recognize what it's supposed to do.
Reviewed-by: Roland Scheidegger <[email protected]>
| |
Need to do a sqrt().
FWIW, the html that Sphinx 1.1.3 generates for the math expressions
looks completely broken.
Reviewed-by: José Fonseca <[email protected]>
| |
Reviewed-by: Charmaine Lee <[email protected]>
| |
To match tgsi_alloc_tokens().
Reviewed-by: Charmaine Lee <[email protected]>
| |
Use the new helper functions in the tgsi_transform.h file to emit
declarations and instructions.
Reviewed-by: Charmaine Lee <[email protected]>
| |
Reviewed-by: Charmaine Lee <[email protected]>
| |
Pass and return tgsi_token buffers instead of pipe_shader_state.
And update softpipe driver (the only user of this function).
Reviewed-by: Charmaine Lee <[email protected]>
| |
Reviewed-by: Charmaine Lee <[email protected]>
| |
Fixes polygon stipple if both DO_PSTIPPLE_IN_DRAW_MODULE and
DO_PSTIPPLE_IN_HELPER_MODULE are zero/off.
Reviewed-by: Charmaine Lee <[email protected]>
| |
This was a regression introduced by
611d66fe4513e53bde052dd2bab95d448c909a2a
Passing a binary program to clBuildProgram() is legal, but passing one
to clCompileProgram() is not.
v2:
- Code cleanups.
Reviewed-by: Francisco Jerez <[email protected]>
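The distinction between the two entry points can be sketched like this. It is a simplified illustration, not the clover implementation: `struct sketch_program`, both helpers, and the `SKETCH_` constants are hypothetical, and the choice of CL_INVALID_OPERATION for the compile case follows the OpenCL specification's wording for a program that has no source.

```c
#include <stdbool.h>

/* Stand-ins for the OpenCL error codes (values mirror the CL headers). */
enum { SKETCH_BUILD_OK = 0, SKETCH_INVALID_OPERATION = -59 };

/* Hypothetical program object: only tracks how it was created. */
struct sketch_program {
   bool has_source; /* false when created from a binary */
};

/* Building a binary program is legal, so there is nothing to reject. */
static int
sketch_check_build(const struct sketch_program *prog)
{
   (void) prog;
   return SKETCH_BUILD_OK;
}

/* Compiling needs source; a binary-only program is rejected. */
static int
sketch_check_compile(const struct sketch_program *prog)
{
   return prog->has_source ? SKETCH_BUILD_OK : SKETCH_INVALID_OPERATION;
}
```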
| |
This factors out the validation that is common with clBuildProgram().
v2:
- Code cleanups.
Reviewed-by: Francisco Jerez <[email protected]>
| |
v2:
- Drop dependency on LLVM >= 3.5.1
- Rename si_create_shader() to si_shader_binary_read()
| |
v2:
- Drop dependency on LLVM >= 3.5.1
| |
This adds a query which allows drivers to access the config
information of a specific function within the LLVM generated ELF
binary. This makes it possible for the driver to handle ELF
binaries with multiple kernels / global functions.
| |
It's annoying with octave. Reported by Michael Burian.
Cc: 10.2 10.3 <[email protected]>
| |
Signed-off-by: Dieter Nützel <[email protected]>
| |
This prevents us from silently overflowing the stack arrays, and allows
arbitrary stack depths.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85454
Cc: [email protected]
Reported-and-Tested-by: Nick Sarnie <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
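One common way to get both properties (no silent overflow, no fixed depth limit) is a heap-backed stack that grows on demand. The sketch below is a generic illustration of that idea, not the code from this commit; `struct sketch_stack` and both functions are hypothetical names, and the doubling growth policy is an assumption.

```c
#include <stdlib.h>

/* Hypothetical growable stack of ints. */
struct sketch_stack {
   int *data;
   size_t size;     /* entries in use */
   size_t capacity; /* entries allocated */
};

/* Push a value, doubling the allocation when it is full.
 * Returns 0 on success, -1 if the allocation fails. */
static int
sketch_stack_push(struct sketch_stack *s, int value)
{
   if (s->size == s->capacity) {
      size_t new_cap = s->capacity ? s->capacity * 2 : 8;
      int *p = realloc(s->data, new_cap * sizeof(*p));
      if (!p)
         return -1;
      s->data = p;
      s->capacity = new_cap;
   }
   s->data[s->size++] = value;
   return 0;
}

/* Pop the top value into *out. Returns 0 on success, -1 if empty. */
static int
sketch_stack_pop(struct sketch_stack *s, int *out)
{
   if (s->size == 0)
      return -1;
   *out = s->data[--s->size];
   return 0;
}
```

A zero-initialized `struct sketch_stack` is a valid empty stack; the caller releases it with `free(s.data)`.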
| |
Fixes 14 ARB_vp tests (which had no lowering done), and should improve
performance of indirect uniform array access in GLSL.
| |
This function is only called when it would return true.
| |
This signal doesn't terminate the program now, it terminates the program
soon. So you have to actually validate the code in the instruction.
| |
Signed-off-by: David Heidelberger <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
| |
I forgot that we cannot emit vertex shader state on a chip without VS.
In such a case, clip_halfz is handled by the Draw module.
| |
Cc: 10.2 10.3 [email protected]
Reviewed-by: Michel Dänzer <[email protected]>
| |
Fixes piglit/polygon-mode-offset.
Cc: 10.2 10.3 [email protected]
Reviewed-by: Michel Dänzer <[email protected]>
| |
Fixes piglit/polygon-mode-offset.
Cc: 10.2 10.3 [email protected]
Reviewed-by: Michel Dänzer <[email protected]>
| |
Caveat: Shaders using UBO/sampler indexing will
not be optimized by SB, due to SB not currently
supporting the necessary CF_INDEX_[01] index
registers.
Signed-off-by: Glenn Kennard <[email protected]>
| |
Requires evergreen/cayman
Signed-off-by: Glenn Kennard <[email protected]>
| |
This enables ARB_conditional_render_inverted.
Signed-off-by: Tobias Klausmann <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
| |
Fanin (merge) nodes require their srcs to be "adjacent" in consecutive
scalar registers. Keep track of instruction neighbors in the
copy-propagation step and avoid eliminating mov's which would cause an
instruction to need multiple distinct left and/or right neighbors.
This lets us not fall on our face when we encounter things like:
1: MOV TEMP[2], IN[0].xyzw
2: TEX OUT[0].xy, TEMP[2], SAMP[0], SHADOW2D
3: MOV TEMP[2].xy, IN[0].yxzz
4: TEX OUT[0].zw, TEMP[2], SAMP[0], SHADOW2D
5: END
Signed-off-by: Rob Clark <[email protected]>
| |
Always insert extra mov's for the tex coord into the fanin. This
simplifies things a bit, and avoids a scenario where multiple sam
instructions can have mutually exclusive inputs to their fanin, for
example:
1: TEX OUT[0].xy, IN[0].xyxx, SAMP[0], 2D
2: TEX OUT[0].zw, IN[0].yxxx, SAMP[0], 2D
The CP pass can always remove the mov's that are not actually needed,
so better to start out with too many mov's in the front end, than not
enough.
Signed-off-by: Rob Clark <[email protected]>
| |
dscis -> noscis
dbypass -> nobypass
a bit more consistent w/ nobin, etc. And IMO a bit more sensible names.
Signed-off-by: Rob Clark <[email protected]>
| |
Kills get added to the outputs list, to ensure they get scheduled. But
they aren't *really* outputs so skip them in the header comment block.
Signed-off-by: Rob Clark <[email protected]>
| |
In order to test compiler changes more easily, spit out the assembled
shader with some header information so that we can know about
inputs/outputs more easily.
See: git://people.freedesktop.org/~robclark/ir3test
In ir3test we have a big collection of tgsi shaders and reference
ir3_compiler outputs. When making compiler changes, regenerate the
compiler outputs and feed to ir3test to compare the new vs reference
shader.
Signed-off-by: Rob Clark <[email protected]>
| |
The last few dwords were skipped if the total number of dwords was not a
multiple of 4. Change the formatting for better readability.
Signed-off-by: Chia-I Wu <[email protected]>
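A dump loop that does not drop the tail can be sketched as below. This is an illustration of the idea only, not the ilo code: the function name, the row layout of a byte offset followed by up to four dwords, and the return value (the number of dwords printed, useful for checking) are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical dumper: prints four dwords per row and returns how many
 * dwords were printed. */
static size_t
sketch_dump_dwords(FILE *fp, const uint32_t *dw, size_t count)
{
   size_t printed = 0;

   for (size_t i = 0; i < count; i += 4) {
      /* The final row may hold fewer than four dwords; clamp instead of
       * skipping it. */
      size_t n = (count - i < 4) ? count - i : 4;

      fprintf(fp, "0x%08zx:", i * 4); /* byte offset of the row */
      for (size_t j = 0; j < n; j++) {
         fprintf(fp, " 0x%08x", (unsigned) dw[i + j]);
         printed++;
      }
      fputc('\n', fp);
   }

   return printed;
}
```

With six dwords this prints one full row and one row of two, where a loop that only handled complete groups of four would stop after the first row.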
| |
Fixes:
- https://bugs.freedesktop.org/show_bug.cgi?id=85377
- http://llvm.org/bugs/show_bug.cgi?id=21365
Reviewed-by: Roland Scheidegger <[email protected]>
| |
So that the order of test messages and gallivm/llvmpipe debug output is
preserved.
Reviewed-by: Roland Scheidegger <[email protected]>
| |
In preparation of ARB_clip_control. Let the driver decide if
it supports pipe_rasterizer_state::clip_halfz being set to true.
v3:
Initially enable on ilo.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Mathias Froehlich <[email protected]>
| |
This allows vc4_opt_cse.c to CSE-away operations involving the same
uniform values.
total instructions in shared programs: 37341 -> 36906 (-1.16%)
instructions in affected programs: 10233 -> 9798 (-4.25%)
total uniforms in shared programs: 10523 -> 10320 (-1.93%)
uniforms in affected programs: 2467 -> 2264 (-8.23%)
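The percentages in the report follow from the raw counts as relative changes. A minimal sketch of that arithmetic, with a hypothetical helper name:

```c
/* Relative change from `before` to `after`, in percent.
 * A negative result means the count went down. */
static double
sketch_percent_change(double before, double after)
{
   return (after - before) / before * 100.0;
}
```

For example, `sketch_percent_change(37341, 36906)` comes out at roughly -1.16, matching the first line of the report.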
| |
This saves a bunch of extra flushes when texsubimaging a whole texture
that's been used for rendering, or subdataing a whole BO. In particular,
this massively reduces the runtime of piglit texture-packed-formats (when
the probes have been moved out of the inner loop).
|