path: root/src/gallium
Commit message | Author | Age | Files | Lines
* gallium/docs: update SLT, SGE, SFL, STR opcode docs | Brian Paul | 2014-03-18 | 1 | -10/+10
  To emphasize that the result is floating point 1.0 or 0.0, to match other opcodes like SLE and SEQ.
  Reviewed-by: Roland Scheidegger <[email protected]>
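The point of this documentation change can be sketched in scalar C. This is only an illustration: the function names are hypothetical, and real TGSI opcodes operate per channel on four-component registers. The SFL and STR results shown (constant 0.0 and 1.0) are stated here from memory of the TGSI documentation and should be treated as an assumption.

```c
/* Scalar sketch of the "set on ..." opcodes. The result is a float
 * 1.0 or 0.0, never an integer boolean. All names are hypothetical. */
static float sketch_slt(float a, float b) { return (a < b)  ? 1.0f : 0.0f; }
static float sketch_sge(float a, float b) { return (a >= b) ? 1.0f : 0.0f; }

/* Assumed semantics: SFL always yields 0.0, STR always yields 1.0. */
static float sketch_sfl(void) { return 0.0f; }
static float sketch_str(void) { return 1.0f; }
```

Because the result is a float, it can be fed directly into arithmetic such as a multiply used as a mask.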
* nvc0: Handle user mapped vertex buffer for edgeflag | Maarten Lankhorst | 2014-03-18 | 1 | -2/+7
  Handle mapping edgeflag data similar to the code around it. This fixes a crash in piglit test gl-2.0-edgeflag.
  Signed-off-by: Maarten Lankhorst <[email protected]>
* clover: Fix region size error checking in some buffer transfer commands. | Francisco Jerez | 2014-03-18 | 1 | -5/+16
  Tested-by: Tom Stellard <[email protected]>
* nv50/ir/gk110: add postfactor support for fmul | Ilia Mirkin | 2014-03-18 | 1 | -0/+2
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: set not modifier on first source of logic op | Ilia Mirkin | 2014-03-18 | 1 | -3/+2
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: use shl/shr instead of lshf/rshf so that c[] is supported | Ilia Mirkin | 2014-03-18 | 1 | -17/+6
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: add 64/128-bit fetch/export support | Ilia Mirkin | 2014-03-18 | 2 | -7/+4
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: fix handling of OP_SUB for floating point ops | Ilia Mirkin | 2014-03-18 | 1 | -1/+6
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: presin/preex2 take their source at bit 23 | Ilia Mirkin | 2014-03-18 | 1 | -1/+1
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: add implementations of div u32/s32 | Ilia Mirkin | 2014-03-18 | 2 | -5/+162
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: implement quadop | Ilia Mirkin | 2014-03-18 | 1 | -1/+11
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: fill in mov from predicate | Ilia Mirkin | 2014-03-18 | 1 | -1/+5
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: handle derivAll flag, fix useOffsets for non-txf | Ilia Mirkin | 2014-03-18 | 1 | -4/+8
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: fix setting texture for txd/txf/txq | Ilia Mirkin | 2014-03-18 | 1 | -9/+8
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: add texcsaa implementation | Ilia Mirkin | 2014-03-18 | 1 | -1/+11
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: add pfetch support | Ilia Mirkin | 2014-03-18 | 1 | -1/+9
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: add emit/restart implementations | Ilia Mirkin | 2014-03-18 | 1 | -1/+8
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: add missing break in sched emit | Ilia Mirkin | 2014-03-18 | 1 | -1/+1
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: implement partial txq support | Ilia Mirkin | 2014-03-18 | 1 | -1/+27
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: fill out texture instruction support | Ilia Mirkin | 2014-03-18 | 1 | -13/+20
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: fix control flow opcode emission, add sat flag | Ilia Mirkin | 2014-03-18 | 1 | -22/+18
  Signed-off-by: Ilia Mirkin <[email protected]>
* egl/main: Stop using EGLNative types internally | Chad Versace | 2014-03-17 | 1 | -3/+15
  Internally, much of the EGL code uses EGLNativeDisplayType, EGLNativeWindowType, and EGLNativePixmapType. However, the EGLNative type often does not match the variable's actual type.

  The concept of EGLNative types is a bad match for Linux, as explained below. And the EGL platform extensions don't use EGLNative types at all. Those extensions attempt to solve cross-platform issues by moving the EGL API away from the EGLNative types.

  The core of the problem is that eglplatform.h can define each EGLNative type once only, but Linux supports multiple EGL platforms.

  To work around the problem, Mesa's eglplatform.h contains multiple definitions of each EGLNative type, selected by feature macros. Mesa expects EGL clients to set the feature macro appropriately. But the feature macros don't work when a single codebase must be built with support for multiple EGL platforms, *such as Mesa itself*.

  When building libEGL, autotools chooses the EGLNative typedefs based on the first element of '--with-egl-platforms'. For example, '--with-egl-platforms=x11,drm,wayland' defines the following:

      typedef Display* EGLNativeDisplayType;
      typedef Window EGLNativeWindowType;
      typedef Pixmap EGLNativePixmapType;

  Clearly, this doesn't work well for Wayland and GBM. Mesa works around the problem by casting the EGLNative types to different things in different files.

  For sanity's sake, and to prepare for the EGL platform extensions, this patch removes from egl/main and egl/dri2 all internal use of the EGLNative types. It replaces them with 'void*' and checks each explicit cast with a static assertion.

  Also, the patch touches egl_gallium the minimal amount to keep it compatible with eglapi.h.

  Signed-off-by: Chad Versace <[email protected]>
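The pattern described above, passing native handles around as 'void*' and guarding each explicit cast with a static assertion, might look roughly like the following sketch. The types and function names are hypothetical stand-ins, not Mesa's actual code, and C11 `_Static_assert` is used here in place of whatever assertion macro the patch itself relies on.

```c
/* Hypothetical stand-in for a platform's native display type
 * (for example, what an X11 Display* would be). */
struct sketch_display { int fd; };
typedef struct sketch_display *sketch_native_display;

/* Guard the cast below: the native handle must fit in a void*. */
_Static_assert(sizeof(void *) == sizeof(sketch_native_display),
               "native display handle must be pointer-sized");

/* Internal code takes void* and casts only where the platform is known. */
static int sketch_display_fd(void *native_display)
{
    sketch_native_display dpy = (sketch_native_display) native_display;
    return dpy->fd;
}
```

The assertion costs nothing at run time and fails the build if a platform's handle type ever stops being pointer-sized.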
* winsys/radeon: Store GPU virtual memory addresses of BOs in a hash table | Michel Dänzer | 2014-03-17 | 1 | -48/+26
  This allows retrieving the existing BO and incrementing its reference count, instead of creating a separate winsys representation for it, when the kernel reports that the BO was already assigned a virtual memory address.
  This fixes problems with XWayland using radeonsi and the xf86-video-wlglamor driver, which calls GEM flink outside of the radeon winsys code and creates BOs from the flinked names using the same DRM file descriptor.
  Reviewed-by: Marek Olšák <[email protected]>
* targets/dri-ilo: make the driver installable | Chia-I Wu | 2014-03-16 | 1 | -4/+3
  install-gallium-links.mk fails to create the compat link for ilo_dri.so because it looks for dri_LTLIBRARIES instead of noinst_LTLIBRARIES. Fix this by switching to dri_LTLIBRARIES (and make the driver installable).
  Since pci_id_driver_map.h and the DDX both tell libGL.so to look for "i965", ilo_dri.so will never be loaded even if enabled and installed. The change should not create any more confusion.
  Signed-off-by: Chia-I Wu <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* radeonsi/compute: Fix memory leak | Aaron Watry | 2014-03-15 | 1 | -0/+6
  Free shader buffer object for all kernels when deleting compute state.
  Signed-off-by: Aaron Watry <[email protected]>
* gallivm: optimize repeat linear npot code in the aos int path | Jeff Muizelaar | 2014-03-14 | 1 | -12/+62
  Similar to the other cases, shift some weight/coord calculations to int space. This should be slightly faster (on x86 sse it should actually save one instruction, and generally int instructions are cheaper).
* gallivm: use correct rounding for nearest wrap mode (in the aos int path) | Roland Scheidegger | 2014-03-14 | 1 | -29/+9
  The previous code used coords which were calculated as (int) (f_coord * tex_size * 256) >> 8. This is not only unnecessarily complex but can give the wrong texel due to rounding for negative coords (as an example, after denormalization coords from -1.0 to 0.0 should give -1, but this will give -1 for numbers from -1.0-1/256 to 0.0-1/256).
  Instead, just use ifloor, dropping the shift stuff. Unfortunately, this will most likely be slower: with arch rounding available it shouldn't be too bad (it trades an int shift for a round but also saves an int mul, which is shared by all coords), but otherwise it's a mess.
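The difference between the two conversions can be reproduced with a small scalar sketch. The helpers below are hypothetical and take an already denormalized coordinate (f_coord * tex_size); the real code is generated through gallivm and works on vectors. Note that right-shifting a negative int is implementation-defined in C; the sketch assumes the usual arithmetic shift.

```c
/* Hypothetical floor-to-int helper standing in for ifloor. */
static int sketch_ifloor(float x)
{
    int i = (int) x;                    /* truncates toward zero */
    return (x < (float) i) ? i - 1 : i; /* step down for negative fractions */
}

/* Old scheme: scale by 256, truncate to int, then shift the fraction away. */
static int sketch_old_nearest(float denorm)
{
    return (int) (denorm * 256.0f) >> 8;
}

/* New scheme: plain floor of the denormalized coordinate. */
static int sketch_new_nearest(float denorm)
{
    return sketch_ifloor(denorm);
}
```

For a denormalized coordinate of -0.001 the old scheme truncates -0.256 to 0 and selects texel 0, while the floor-based version selects texel -1 as intended.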
* gallivm: use correct rounding for linear wrap mode (in the aos int path) | Jeff Muizelaar | 2014-03-14 | 1 | -6/+8
  The previous method for converting coords to ints was slightly inaccurate (effectively losing 1 bit from the 8-bit lerp weight). This is probably especially noticeable when trying to draw a pixel-aligned texture.
  As an example, for a 100x100 texture after denormalization the texture coords in this case would turn up as 0.5, 1.5, 2.5, 3.5, 4.5, ... After the mul by 256, conversion to int and 128 subtraction, they end up as 0, 256, 512, 768, 1024, ... which gets us the correct coords/weights of 0/0, 1/0, 2/0, 3/0, 4/0, ...
  But even LSB errors (which are unavoidable) in the input coords may cause these coords/weights to be wrong, e.g. for a coord of 3.49999 we'd get a coord/weight of 2/255 instead.
  Fix this by using round-to-nearest int instead of FPToSi (trunc). Should be equally fast on x86 sse though other archs probably suffer a little.
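The 3.49999 example from this message can be checked with a scalar sketch. All names are hypothetical, the input is an already denormalized coordinate, and the rounding helper adds 0.5 before truncating, which is only a valid round-to-nearest for the non-negative inputs used here.

```c
struct sketch_lerp_pos { int coord; int weight; };

/* Split an 8.8 fixed-point value into texel coord and 8-bit lerp weight
 * after shifting by half a texel (128 in 8.8 fixed point). */
static struct sketch_lerp_pos sketch_split(int fixed)
{
    struct sketch_lerp_pos p;
    fixed -= 128;
    p.coord = fixed >> 8;
    p.weight = fixed & 0xff;
    return p;
}

/* Old behaviour: float to int conversion truncates. */
static struct sketch_lerp_pos sketch_linear_trunc(float denorm)
{
    return sketch_split((int) (denorm * 256.0f));
}

/* New behaviour: round to nearest (non-negative inputs only in this sketch). */
static struct sketch_lerp_pos sketch_linear_round(float denorm)
{
    return sketch_split((int) (denorm * 256.0f + 0.5f));
}
```

With an input of 3.49999 the truncating version yields coord/weight 2/255, while the rounding version yields 3/0, matching the numbers in the commit message.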
* radeonsi: flush the dma ring in si_flush_from_st | Niels Ole Salscheider | 2014-03-14 | 1 | -0/+7
  Signed-off-by: Niels Ole Salscheider <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>
* radeon: Move DMA ring creation to common code | Niels Ole Salscheider | 2014-03-14 | 4 | -31/+32
  Signed-off-by: Niels Ole Salscheider <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>
* nvc0: minor cleanups in stream output handling | Emil Velikov | 2014-03-14 | 1 | -4/+5
  Constify the offsets parameter to silence gcc warning 'assignment from incompatible pointer type' due to function prototype mismatch.
  Use a boolean changed as a shorthand for target != current_target.
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nouveau: honor fread return value in the nouveau_compiler | Emil Velikov | 2014-03-14 | 1 | -2/+2
  There is little point in continuing if fread returns zero, as it indicates that either the file is empty or cannot be read from. Bail out if fread returns zero after closing the file.
  Cc: Ilia Mirkin <[email protected]>
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
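The ordering described here, read, close the file, then bail out on a zero-length read, can be sketched as below. The helper name and signature are hypothetical and are not taken from nouveau_compiler.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to cap bytes from f and always closes f.
 * Returns 0 and stores the byte count on success, -1 if nothing was read
 * (empty file or read error). */
static int sketch_load(FILE *f, char *buf, size_t cap, size_t *out_len)
{
    size_t len = fread(buf, 1, cap, f);

    fclose(f);          /* close first so the early return cannot leak f */
    if (len == 0)
        return -1;      /* nothing to compile, no point in continuing */

    *out_len = len;
    return 0;
}
```

Closing before the check keeps a single cleanup path regardless of whether the read produced any data.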
* nouveau: typecast the prime_fd handle when calling nouveau_bo_set_prime | Emil Velikov | 2014-03-14 | 1 | -1/+1
  Core drm defines that the handle is of type int, while all drivers treat it as uint internally. Typecast the value to silence gcc warning messages and be consistent amongst all drivers.
  Signed-off-by: Emil Velikov <[email protected]>
* nv50: add missing brackets when handling the samplers array | Emil Velikov | 2014-03-14 | 1 | -1/+2
  Commit 3805a864b1d (nv50: assert before trying to out-of-bounds access samplers) introduced a series of asserts as a precaution against a previous illegal memory access.
  However, it failed to brace the loop body within nv50_sampler_state_delete, effectively failing to clear the sampler state and exacerbating the illegal memory access issue.
  Fixes gcc warning "array subscript is above array bounds" and "Nesting level does not match indentation" and "Out-of-bounds read" defects reported by Coverity.
  Cc: "10.1" <[email protected]>
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
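The class of bug fixed here can be illustrated with a simplified sketch. The function, array, and limit below are hypothetical and only mirror the shape of the problem; they are not the nv50 code.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_SAMPLERS 16

/* Clear every slot that still points at the sampler being deleted. */
static void sketch_clear_sampler(void *samplers[], unsigned count,
                                 const void *victim)
{
    for (unsigned i = 0; i < count; ++i) {
        /* Without the braces, only this assert would be the loop body.
         * The `if` below would then run once, after the loop, with
         * i == count: an out-of-bounds read, and nothing gets cleared. */
        assert(count <= SKETCH_MAX_SAMPLERS);
        if (samplers[i] == victim)
            samplers[i] = NULL;
    }
}
```

Adding a statement in front of an unbraced single-statement loop body silently changes what the loop executes, which is why the compiler's indentation warning was a useful hint.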
* r600g: compute memory pool size is given in dw | Niels Ole Salscheider | 2014-03-11 | 1 | -2/+2
  Multiply the dw value by 4 in order to map the complete buffer.
  Reviewed-by: Tom Stellard <[email protected]>
  Signed-off-by: Niels Ole Salscheider <[email protected]>
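As a minimal sketch of the unit conversion involved, assuming a dword (dw) is 4 bytes, a size expressed in dwords has to be scaled before it is used as a byte length for a mapping. The helper name is hypothetical.

```c
#include <stdint.h>

/* Convert a size in dwords (32-bit words) to a size in bytes. */
static uint64_t sketch_dw_to_bytes(uint64_t size_in_dw)
{
    return size_in_dw * 4;
}
```

Passing the dword count directly as a byte length would map only a quarter of the pool.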
* r600g,radeonsi: attempt to fix racy multi-context apps calling BufferData | Marek Olšák | 2014-03-11 | 3 | -14/+18
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75061
  v2: minimize the window where cs_buf != new_buf
* r600g,radeonsi: fix broken buffer download | Marek Olšák | 2014-03-11 | 1 | -1/+1
  Reviewed-by: Michel Dänzer <[email protected]>
* r600g,radeonsi: use a fallback in dma_copy instead of failing | Marek Olšák | 2014-03-11 | 6 | -97/+99
  v2:
  - allow byte-aligned DMA buffer copies on Evergreen
  - fix piglit/texsubimage regression
  - use the fallback for 3D copies (depth > 1) as well
* radeonsi: small cleanup in get_param | Marek Olšák | 2014-03-11 | 1 | -4/+2
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: set correct alignment for texture buffers and constant buffers | Marek Olšák | 2014-03-11 | 1 | -3/+2
  I think these are all equivalent to vertex buffer fetches which should be dword-aligned. Scalar loads are also dword-aligned.
  Reviewed-by: Michel Dänzer <[email protected]>
* r600g, radeonsi: fix primitives-generated query with disabled streamout | Marek Olšák | 2014-03-11 | 11 | -49/+87
  Buffers are disabled by VGT_STRMOUT_BUFFER_CONFIG, but the query only works if VGT_STRMOUT_CONFIG.STREAMOUT_0_EN is enabled.
  This moves VGT_STRMOUT_CONFIG to its own state. The register is set to 1 if either streamout or the primitives-generated query is enabled.
  However, the primitives-emitted query is also incremented, so it's disabled by setting VGT_STRMOUT_BUFFER_SIZE to 0 when there is no buffer bound.
  This fixes piglit:
    ARB_transform_feedback2/counting with pause
    EXT_transform_feedback/primgen-query transform-feedback-disabled
  Reviewed-by: Michel Dänzer <[email protected]>
* r600g,radeonsi: don't add streamout.num_dw_for_end twice | Marek Olšák | 2014-03-11 | 1 | -2/+4
  It's already added in need_cs_space.
  Also don't calculate anything if there are no buffers.
  Reviewed-by: Michel Dänzer <[email protected]>
* r600g,radeonsi: fix MAX_TEXTURE_3D_LEVELS and MAX_TEXTURE_ARRAY_LAYERS limits | Marek Olšák | 2014-03-11 | 2 | -6/+11
  CB_COLORi_VIEW.SLICE_MAX can be at most 2047.
  This fixes the maxlayers piglit test.
  Cc: [email protected]
  Reviewed-by: Michel Dänzer <[email protected]>
* st/dri: flush drawable textures before unreferencing | Marek Olšák | 2014-03-11 | 1 | -0/+8
  This fixes piglit/fbo-sys-blit with fast clear on radeonsi.
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: implement fast color clear | Marek Olšák | 2014-03-11 | 5 | -4/+59
  This works for both multi-sample and single-sample color buffers.
  Reviewed-by: Michel Dänzer <[email protected]>
* r600g: move fast color clear code to a common place | Marek Olšák | 2014-03-11 | 3 | -84/+88
  Reviewed-by: Michel Dänzer <[email protected]>
* r600g,radeonsi: move CMASK register values from r600_surface to r600_texture | Marek Olšák | 2014-03-11 | 6 | -61/+48
  When doing fast clear for single-sample color buffers for the first time, a CMASK buffer has to be allocated and the CMASK state in all pipe_surfaces referencing the color buffer must be updated. Updating all surfaces is kinda silly, so let's move the values to r600_texture instead.
  This is only for Evergreen and later. R600-R700 don't have fast clear.
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: convert the framebuffer state to atom-based | Marek Olšák | 2014-03-11 | 5 | -283/+132
  This looks like r600g. The shared Cayman MSAA code is used here.
  The real motivation for this is that I need the ability to change values of color registers after the framebuffer state is set. The PM4 state cannot be modified easily after it's generated. With this, I can just change r600_surface::cb_color_xxx and set framebuffer.atom.dirty=true and it's done.
  Reviewed-by: Michel Dänzer <[email protected]>
* r600g: move cayman MSAA setup to a common place | Marek Olšák | 2014-03-11 | 6 | -214/+272
  I will use this in radeonsi.
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: move framebuffer-related state to a new struct si_framebuffer | Marek Olšák | 2014-03-11 | 5 | -39/+41
  Reviewed-by: Michel Dänzer <[email protected]>