Commit message | Author | Age | Files | Lines
* configure.ac: Use AX_GCC_BUILTIN to check availability of __builtin_bswap32 v2Tom Stellard2014-02-243-1/+174
| | | | | | | v2: - Remove unnecessary AC_SUBST Reviewed-by: Matt Turner <[email protected]>
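A configure-time check like this is usually consumed through a preprocessor guard. The sketch below assumes the check results in a `HAVE___BUILTIN_BSWAP32` macro in the generated config header (an assumption about the macro name, not taken from the commit); `sketch_bswap32` is a hypothetical helper.

```c
#include <stdint.h>

/* Assumption: the build system defines HAVE___BUILTIN_BSWAP32 when the
 * compiler provides the builtin. If it is undefined, the portable
 * fallback below is compiled instead. */
static inline uint32_t
sketch_bswap32(uint32_t v)
{
#ifdef HAVE___BUILTIN_BSWAP32
   return __builtin_bswap32(v);
#else
   /* Portable fallback: move each of the four bytes by hand. */
   return ((v & 0x000000ffu) << 24) |
          ((v & 0x0000ff00u) <<  8) |
          ((v & 0x00ff0000u) >>  8) |
          ((v & 0xff000000u) >> 24);
#endif
}
```

Both branches should give the same result, so callers do not need to know which one was compiled.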
* targets/opencl: resolve undefined symbols at link timeEmil Velikov2014-02-241-0/+1
| | | | | | | | Current automake build does not try to resolve undefined symbols thus we could end up with a broken library. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* gallium/targets: resolve undefined reference to pipe_loader_sw_probe_driEmil Velikov2014-02-243-0/+15
| | | | | | | | | | | | | With the introduction of the pipe_loader_sw_probe_dri helper we require the sw/dri winsys during linking stage despite it being unused by any of the targets. This will cause a minor increase in the resulting library which will be cleaned up via linker options with upcoming patches. v2: Link with libswdri.la only when available. Reported-and-tested-by: Tom Stellard <[email protected]> (v1) Signed-off-by: Emil Velikov <[email protected]>
* configure: correctly report if we're building the sw/xlib winsysEmil Velikov2014-02-241-0/+1
| | | | | | | | | While looking at bug 75356, I've noticed that the presence of x11 egl platform pulls in sw/xlib as "needed" but fails to report so at the end of configure. Tested-by: Tom Stellard <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* pipe-loader: wrap pipe_loader_sw_probe_xlib within HAVE_PIPE_LOADER_XLIBEmil Velikov2014-02-248-7/+41
| | | | | | | | | | | | | | | | | | | The above function implies using the xlib winsys, which has additional library dependencies that should not be forced. Make the software xlib pipe loader optional, thus avoiding all the dependency hell. A user that wishes to use the particular pipe-loader would need to set the following within configure.ac. enable_gallium_xlib_loader=yes v2: - Wrap sw/xlib/xlib_sw_winsys.h to handle compilation on systems lacking X11 headers. Spotted by Christian Prochaska. Tested-by: Tom Stellard <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75356 Signed-off-by: Emil Velikov <[email protected]>
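The wrapping described above might look roughly like the following sketch. Only the `HAVE_PIPE_LOADER_XLIB` macro comes from the commit title; every `sketch_*` name is hypothetical and the structure is not taken from the pipe-loader sources.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pipe-loader device; not the real Mesa struct. */
struct sketch_device { const char *name; };

#ifdef HAVE_PIPE_LOADER_XLIB
/* Compiled only when the xlib loader was enabled at configure time, so the
 * X11-dependent code never reaches builds that lack those dependencies. */
static bool
sketch_probe_xlib(struct sketch_device *dev)
{
   dev->name = "sw/xlib";
   return true;
}
#endif

/* A backend with no extra dependencies is always available. */
static bool
sketch_probe_null(struct sketch_device *dev)
{
   dev->name = "sw/null";
   return true;
}

typedef bool (*sketch_probe_fn)(struct sketch_device *);

/* The optional backend is listed only when it was compiled in. */
static const sketch_probe_fn sketch_backends[] = {
#ifdef HAVE_PIPE_LOADER_XLIB
   sketch_probe_xlib,
#endif
   sketch_probe_null,
};

/* Try each compiled-in backend in order and stop at the first success. */
static bool
sketch_probe(struct sketch_device *dev)
{
   size_t n = sizeof(sketch_backends) / sizeof(sketch_backends[0]);
   for (size_t i = 0; i < n; i++) {
      if (sketch_backends[i](dev))
         return true;
   }
   return false;
}
```

Guarding both the definition and the table entry keeps the optional function from being referenced in builds where it does not exist.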
* targets/gbm: exit gracefully if pipe_loader_drm_probe_fd is not availableEmil Velikov2014-02-241-3/+4
| | | | | | | | | | When one builds without gallium_drm_loader, the above function will not be available, thus we'll segfault in gallium_screen_create due to memory access violation. Tested-by: Tom Stellard <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75335 Signed-off-by: Emil Velikov <[email protected]>
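One way to picture the graceful exit is a creation path that checks for the missing probe before using it. This is a minimal sketch under assumptions: the `sketch_*` names are hypothetical, and a NULL function pointer models a build without the DRM loader, which may differ from how the actual fix detects it.

```c
#include <stddef.h>

/* Hypothetical screen object; not the real gallium structure. */
struct sketch_screen { int fd; };

/* Hypothetical loader hook. NULL models a build without gallium_drm_loader. */
typedef struct sketch_screen *(*sketch_probe_fd_fn)(int fd);

static struct sketch_screen sketch_the_screen;

/* Stand-in probe used when the loader is available. */
static struct sketch_screen *
sketch_fake_probe_fd(int fd)
{
   sketch_the_screen.fd = fd;
   return &sketch_the_screen;
}

/* Return NULL instead of calling through a missing probe, so the caller
 * sees a clean failure rather than a memory access violation. */
static struct sketch_screen *
sketch_screen_create(sketch_probe_fd_fn probe_fd, int fd)
{
   if (probe_fd == NULL)
      return NULL;
   return probe_fd(fd);
}
```

The caller then only needs to handle a NULL screen, which it already has to do for other probe failures.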
* i965: Don't try to use the hardware blitter for multisampled miptrees.Kenneth Graunke2014-02-231-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | The blitter is completely ignorant of MSAA buffer layouts, so any attempt to use BLT paths with MSAA buffers is likely to break spectacularly. In most cases, BLORP handles MSAA blits, so we never hit this bug. Until recently, it also wasn't worth fixing, since Meta couldn't handle MSAA either, so there was nothing to fall back to. But now there is. +143 piglit tests on Broadwell (which doesn't have BLORP support). Surprisingly, three also start failing. Since non-IMS MSAA buffers store samples in successive array slices, using the blitter ought to access sample 0 and ignore the rest, which is apparently good enough for a few not-very-picky Piglit tests. Presumably the meta replacement code is still broken. No Piglit changes on Ivybridge. v2: Move the early return to the top of the function (suggested by Paul). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* freedreno/a3xx/compiler: half-precision outputRob Clark2014-02-236-10/+130
| | | | | | | | | | | | | | | | | | | | Using generic shaders caused a measurable fps drop, which was isolated to use of full precision (vs half precision) output. This is an attempt to regain that lost performance by using half precision solid/blit shaders (when the output format is not float32). Note: for the built-in shaders, I would not expect them to be register starved. And in fact it is the solid frag shader that seems to have the biggest impact. So I suspect you get double the pixel pipe units (or half the cycles) when the output is half precision. So there may be some gain to using half precision output for application shaders as well, even though the rest of register usage is still full precision. But for half precision to work for more complex shaders, we need to deal with some constraints, like cat2 needing same precision for its two src registers. So for now it is not enabled by default except for the built-in shaders. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: add shader variantsRob Clark2014-02-2310-196/+283
| | | | | | | | | Start putting in place infrastructure to deal with multiple shader variants. Initially we'll use this for two sided color (frag) and binning pass (vert) shaders. Possibly needed for others later (such as YUV vs RGB eglImage?). Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: collapse nop's with repeatRob Clark2014-02-232-0/+15
| | | | | | | | | Easier than making more extensive use of rpt, and the more compact shaders seem to bring some bit of performance boost. (Perhaps repeat flag benefits are more than just instruction cache, possibly it saves on instruction decode as well?) Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: drop hand-coded blit/solid shadersRob Clark2014-02-2310-287/+181
| | | | | | | | | Instead in the common code, construct these shaders from TGSI. For now we let a2xx keep its hand-coded shaders, as its compiler isn't quite up to the job yet. All the same it is a net drop in code size and gets rid of special cases. Signed-off-by: Rob Clark <[email protected]>
* freedreno/lowering: cleanup apiRob Clark2014-02-235-24/+138
| | | | | | | | Make things configurable, and tweak the API a bit to avoid an extra tgsi_shader_scan(). Getting closer to something generic which can be moved out of freedreno and shared by other drivers. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: add float 16 and 32bit formatsRob Clark2014-02-231-0/+22
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: resync generated headersRob Clark2014-02-234-4/+20
| | | | Signed-off-by: Rob Clark <[email protected]>
* glx/drisw: use the implemented version of __DRIswrastLoaderExtensionEmil Velikov2014-02-231-5/+6
| | | | | | | | ... over the one provided by the headers. Explicitly set extension members to improve clarity. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
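The idea of advertising the implemented version with explicitly named members can be sketched with C99 designated initializers. All `sketch_*` and `SKETCH_*` names and the version numbers below are hypothetical; this is not the layout from dri_interface.h.

```c
#include <stddef.h>

/* Hypothetical versioned extension header. */
struct sketch_extension { const char *name; int version; };

#define SKETCH_LOADER_NAME    "SKETCH_Loader"
#define SKETCH_LOADER_VERSION 3   /* newest version the header describes */

struct sketch_loader_extension {
   struct sketch_extension base;
   void (*get_info)(int *w, int *h);                  /* since version 1 */
   void (*put_image)(const void *data);               /* since version 1 */
   void (*put_image2)(const void *data, int stride);  /* added in version 3 */
};

static void sketch_get_info(int *w, int *h) { *w = 0; *h = 0; }
static void sketch_put_image(const void *data) { (void)data; }

/* Advertise the version that is actually implemented (1) rather than
 * SKETCH_LOADER_VERSION, and name each member so it is obvious what is set. */
static const struct sketch_loader_extension sketch_loader = {
   .base      = { .name = SKETCH_LOADER_NAME, .version = 1 },
   .get_info  = sketch_get_info,
   .put_image = sketch_put_image,
   /* put_image2 is left NULL, consistent with advertising version 1. */
};

/* A consumer checks the advertised version before touching newer members. */
static int
sketch_can_use_put_image2(const struct sketch_loader_extension *ext)
{
   return ext->base.version >= 3 && ext->put_image2 != NULL;
}
```

If the initializer used `SKETCH_LOADER_VERSION` instead, a later header bump would silently claim support for members this loader never set.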
* glx/dri: use the implemented version of __DRIdamageExtensionEmil Velikov2014-02-231-2/+3
| | | | | | | | ... over the one provided by the headers. Explicitly set extension members to improve clarity. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glx/dri_common: use the implemented version of __DRIsystemTimeExtensionEmil Velikov2014-02-231-3/+4
| | | | | | | | ... over the one provided by the headers. Explicitly set extension members to improve clarity. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glx/dri: use the implemented version of __DRIgetDrawableInfoExtensionEmil Velikov2014-02-231-2/+3
| | | | | | | ... over the one provided by the headers. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* dri_util: use the implemented version of __DRIimageDriverExtensionEmil Velikov2014-02-231-1/+1
| | | | | | | | | | ... over the one provided by the headers. Currently both versions are identical, but that is not guaranteed to be the case in the future. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glx/dri3: set the implemented version of __DRIimageLoaderExtensionEmil Velikov2014-02-231-3/+4
| | | | | | | | | ... over the one provided by the spec. Currently both versions are identical, but that is not guaranteed to be the case in the future. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* gbm: explicitly set __DRIimageLoaderExtension membersEmil Velikov2014-02-231-3/+4
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* egl/wayland: explicitly set __DRIimageLoaderExtension membersEmil Velikov2014-02-231-3/+4
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* drivers/dri: explicitly set __DRI2flushExtension membersEmil Velikov2014-02-232-6/+8
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* gbm: explicitly set __DRIdri2LoaderExtension membersEmil Velikov2014-02-231-4/+5
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glx/dri2: set the implemented version of __DRIdri2LoaderExtensionEmil Velikov2014-02-231-8/+10
| | | | | | | | ... over the version number provided by the headers. Explicitly set extension members to improve clarity. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* dri_interface: note introduction of __DRIdri2LoaderExtension membersEmil Velikov2014-02-231-0/+4
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* dri_interface: note introduction of various __DRItexBufferExtension membersEmil Velikov2014-02-231-0/+4
| | | | | | | | | | Note the member function releaseTexBuffer was added without bumping spec version, and currently no drivers implement it. v2: releaseTexBuffer was introduced by version 3 Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* dri_interface: Note the version introducing __DRIswrastLoaderExtensionRec::putImage2Emil Velikov2014-02-231-0/+2
| | | | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* dri_util: explicitly set __DRIcopySubBufferExtension membersEmil Velikov2014-02-231-2/+3
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* dri_util: explicitly set __DRIswrastExtension members.Emil Velikov2014-02-231-6/+7
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Pass stdout to _mesa_print_ir from st_glsl_to_tgsi.Kenneth Graunke2014-02-221-1/+1
| | | | | | | Fixes the Gallium build since commit 1e3bd9f9a5af90295788c5d71ea27c. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75389 Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Move the remaining driver debug over to stderr.Eric Anholt2014-02-2212-63/+66
| | | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Move compiler debugging output to stderr.Eric Anholt2014-02-2224-248/+258
| | | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add a file argument to the IR printer.Eric Anholt2014-02-2211-118/+127
| | | | | | | | | | | | While we want to be able to print to stdout for glsl_compiler, for debugging drivers we want to be able to dump to stderr because that's where other driver debug (like LIBGL_DEBUG) tends to go, and because some apps actually close stdout to shut up their own messages (such as the X Server, or NWN). Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
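Passing the destination stream as an argument is a small change to the printer's signature. The sketch below uses a hypothetical `sketch_ir` node and `sketch_print_ir` function; it does not reproduce the real IR printer.

```c
#include <stdio.h>

/* Hypothetical IR node; the real GLSL IR is far richer. */
struct sketch_ir { const char *op; int value; };

/* The caller picks the stream: stdout for a standalone compiler, stderr
 * for driver debug output that must survive a closed stdout.
 * Returns the number of characters written, or a negative value on error. */
static int
sketch_print_ir(FILE *f, const struct sketch_ir *ir)
{
   return fprintf(f, "(%s %d)\n", ir->op, ir->value);
}
```

A driver would call `sketch_print_ir(stderr, &ir)` while a command-line tool would pass `stdout`, with no other difference between the two call sites.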
* i965: Refactor debug dumping of GLSL IR.Eric Anholt2014-02-225-27/+29
| | | | | | | | | This was only going to get worse when tessellation shows up, and was causing too much extra duplication in my stderr changes coming up. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel: Remove some dead code I noticed in intel_screen.c.Eric Anholt2014-02-222-8/+0
| | | | | | | | It was present in the initial i915tex import. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Use the object label when available for INTEL_DEBUG=vs,gs,fs output.Eric Anholt2014-02-224-4/+10
| | | | | | | | Note that this requires updated run.py in shader_db. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Use the object label when available for shader_time output.Eric Anholt2014-02-221-5/+8
| | | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* meta: Set some object labels on our meta shaders.Eric Anholt2014-02-222-0/+14
| | | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nv50: make sure to clear _all_ layers of all attachmentsIlia Mirkin2014-02-223-3/+21
| | | | | | | | | | | | | | | Unfortunately there's only one RT_ARRAY_MODE setting for all attachments, so clears were previously truncated to the minimum number of layers any attachment had. Instead set the RT_ARRAY_MODE to 512 (the max number of layers) before doing the clear. This fixes gl-3.2-layered-rendering-clear-color-mismatched-layer-count. Also fix clears of individual layered rt/zeta, in case it ever happens. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Christoph Bumiller <[email protected]> Cc: 10.1 <[email protected]>
* ilo: fix and enable fast depth clearChia-I Wu2014-02-222-9/+38
| | | | | | | | Use tex->bo_format instead of zs->format in ilo_blitter_rectlist_clear_zs() because the latter may be a combined depth/stencil format. hiz_can_clear_zs() is a no-op for GEN7+, but move the GEN check so that the assertions are tested. Finally, call the fast depth clear function from ilo_clear().
* ilo: add slice clear valueChia-I Wu2014-02-225-7/+78
| | | | | It is needed for 3DSTATE_CLEAR_PARAMS, and can also be used to track what value the slice has been cleared to.
* ilo: better readability and doc for texture flagsChia-I Wu2014-02-223-36/+58
| | | | | Improve comments for the flags, and explicitly separate their uses in slice flags and resolve flags.
* ilo: fix for stencil only rectlist opsChia-I Wu2014-02-222-2/+8
| | | | | 3DSTATE_STENCIL_BUFFER inherits some states from 3DSTATE_DEPTH_BUFFER. We need to emit both even if the surface is stencil only.
* ilo: fix a false assertion failure on GEN6Chia-I Wu2014-02-221-4/+12
| | | | Layer offsetting is possible when it is level 0, layer 0.
* ilo: pipe_texture::usage is not a bitfieldChia-I Wu2014-02-221-1/+1
| | | | It happens to work because PIPE_USAGE_STAGING is 0x100.
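Testing an enumerated value with a bitwise AND only works by accident of the constants' values. In the sketch below the enum values are hypothetical and deliberately chosen to share bits (unlike the 0x100 value mentioned above), so the fragile test visibly misfires.

```c
#include <stdbool.h>

/* Hypothetical usage enumeration. These are plain enumerators, not flags,
 * so nothing guarantees that they occupy distinct bits. */
enum sketch_usage {
   SKETCH_USAGE_DEFAULT = 0,
   SKETCH_USAGE_DYNAMIC = 1,
   SKETCH_USAGE_STREAM  = 2,
   SKETCH_USAGE_STAGING = 3,
};

/* Fragile: treats the enumeration as a bitfield. */
static bool
sketch_is_staging_bitwise(enum sketch_usage u)
{
   return (u & SKETCH_USAGE_STAGING) != 0;
}

/* Robust: compares the enumerated value directly. */
static bool
sketch_is_staging(enum sketch_usage u)
{
   return u == SKETCH_USAGE_STAGING;
}
```

With these values the bitwise test reports `SKETCH_USAGE_DYNAMIC` as staging, while the equality test does not.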
* ilo: set ILO_TEXTURE_CPU_WRITE for imported texturesChia-I Wu2014-02-221-3/+10
| | | | | Assume the bo has been written by another process, which will trigger a HiZ resolve.
* nv50/ir/ra: fix SpillCodeInserter::offsetSlot usageChristoph Bumiller2014-02-221-7/+7
| | | | | | We were turning non-memory spill slots into NULL. Cc: 10.1 <[email protected]>
* Revert "i965/fs: Make fs_reg's type an enum for better debugging."Matt Turner2014-02-214-7/+6
| | | | | | | | This reverts commit 5ceadd29b0af835d741bcf09b9622c628e549ae6. I rebased and apparently failed to build test. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75355
* i965/fs: Drop the emit(fs_inst) overload.Kenneth Graunke2014-02-213-25/+17
| | | | | | | | | | | | | | | | | | | | Using this emit function implicitly creates three copies, which is pointlessly inefficient. 1. Code creates the original instruction. 2. Calling emit(fs_inst) copies it into the function. 3. It then allocates a new fs_inst and copies it into that. The second could be eliminated by changing the signature to fs_inst(const fs_inst &) but that wouldn't eliminate the third. Making callers heap allocate the instruction and call emit(fs_inst *) allows us to just use the original one, with no extra copies, and isn't much more of a burden. Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
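The copy counting above can be illustrated with a plain C analogue. The original code is C++ and works on `fs_inst`; the `sketch_*` types and functions here are hypothetical and only mirror the by-value versus by-pointer distinction.

```c
#include <stdlib.h>

/* Hypothetical instruction and instruction list. */
struct sketch_inst { int opcode; struct sketch_inst *next; };
struct sketch_list { struct sketch_inst *head; };

/* By-value variant: passing the struct copies it into the parameter, and
 * the function then copies it again into a freshly allocated node. */
static struct sketch_inst *
sketch_emit_copy(struct sketch_list *l, struct sketch_inst inst)
{
   struct sketch_inst *node = malloc(sizeof(*node));
   if (node == NULL)
      return NULL;
   *node = inst;
   node->next = l->head;
   l->head = node;
   return node;
}

/* Pointer variant: the caller heap-allocates the instruction once and the
 * list links that same object in, with no further copies. */
static struct sketch_inst *
sketch_emit(struct sketch_list *l, struct sketch_inst *inst)
{
   inst->next = l->head;
   l->head = inst;
   return inst;
}
```

With the pointer variant the object the caller built is the one that ends up in the list, which is the property the commit is after.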