aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* radeon/uvd: fix overflow error while calculating bit stream buffer sizeIndrajit Das2016-07-041-1/+1
| | | | Reviewed-by: Christian König <[email protected]>
* i965/blorp: Prepare for more than two vertex attributesTopi Pohjolainen2016-07-044-3/+22
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/blorp: Tell vertex fetcher about flat inputsTopi Pohjolainen2016-07-042-8/+30
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/blorp: Add support for flat input bufferTopi Pohjolainen2016-07-041-3/+65
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/blorp: Store input read maskTopi Pohjolainen2016-07-042-0/+2
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/blorp: Rename push constants to inputsTopi Pohjolainen2016-07-045-22/+22
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/blorp: Use core vertex buffer state setupTopi Pohjolainen2016-07-041-48/+14
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/blorp: Split vertex data and element setupTopi Pohjolainen2016-07-041-21/+25
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Unify vertex buffer setupTopi Pohjolainen2016-07-042-29/+46
| | | | | | | | On gen >= 8 one doesn't provide ending address but number of bytes available. This is relative to the given offset. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/draw: Expose vertex buffer state setupTopi Pohjolainen2016-07-042-18/+37
| | | | | | | Also change the interface to use start and end offsets. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* freedreno: fix crash on smaller gpus and higher resolutionsRob Clark2016-07-031-1/+1
| | | | | | | | | Devices with smaller GMEM size need more tiles. On db410c at 2048x1152, glmark2 shadow needed ~330 tiles for fullscreen. Lets bump it up to 512. (Maybe with MRT you could end up needing more, but at that point things are probably going to be painfully slow.) Signed-off-by: Rob Clark <[email protected]>
* i965: don't drop const initializers in vector splittingRob Clark2016-07-021-0/+12
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: add driconf to zero-init unintialized varsRob Clark2016-07-0211-1/+34
| | | | | | | | | | | | | Some games are sloppy.. perhaps because it is defined behavior for DX or perhaps because nv blob driver defaults things to zero. So add driconf param to force uninitialized variables to default to zero. This issue was observed with rust, from steam store. But has surfaced elsewhere in the past. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno/ir3: support glsl linking for cmdline compilerRob Clark2016-07-021-24/+47
| | | | | | | | | | | For .vert/.frag, now multiple can be specified on the cmdline for purposes of linking, and the last one specified is the one that is fed into the ir3 backend (and dumped along the way if --verbose is specified) Without this, varyings in frag shaders would appear as undefined. Signed-off-by: Rob Clark <[email protected]>
* glsl/standalone: initialize MaxUserAssignableUniformLocationsRob Clark2016-07-021-0/+4
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: update valid_buffer_range for SO buffersRob Clark2016-07-021-0/+5
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: support non-user_buffer constsRob Clark2016-07-022-3/+5
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a2xx: move setup/restore cmds into binning passRob Clark2016-07-024-9/+4
| | | | | | | | Rather than doing a separate submit at context create, move these cmds to before first tile, as is done on a3xx/a4xx. Otherwise state can be overwritten by other contexts. Signed-off-by: Rob Clark <[email protected]>
* freedreno: pass index buffer as a pipe_resourceRob Clark2016-07-022-16/+16
| | | | | | This will be useful in a following patch. Signed-off-by: Rob Clark <[email protected]>
* freedreno: switch emit_const_bo() to take prsc'sRob Clark2016-07-024-17/+18
| | | | | | We can push the unwrap of pipe_resource down. Signed-off-by: Rob Clark <[email protected]>
* nv30: Fix "array subscript is below array bounds" compiler warningHans de Goede2016-07-021-2/+1
| | | | | | | | gcc6 does not like the trick where we point to one entry before the array start and then start a while with a pre-increment. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nouveau: Fix a couple of "foo may be used uninitialized' compiler warningsHans de Goede2016-07-022-3/+3
| | | | | | | | | | | | | These are all new false positives with gcc6. In nouveau_compiler.c: gcc6 no longer assumes that passing a pointer to a variable into a function initialises that variable. In nv50_ir_from_tgsi.cpp op and mode are not set if there are 0 enabled dst channels, this never happens, but gcc cannot know this. Signed-off-by: Hans de Goede <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nouveau: Fix gcc6 / c++11 auto_ptr deprecation compiler warningsHans de Goede2016-07-021-0/+4
| | | | | Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nouveau: Add support for SV_WORK_DIMHans de Goede2016-07-028-12/+29
| | | | | | | | Add support for SV_WORK_DIM for nvc0 and nve4. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: Make NVC0_CB_AUX_GRID_INFO take an index argumentHans de Goede2016-07-023-4/+4
| | | | | | | | | This brings it inline with the other macros like NVC0_CB_AUX_UBO_INFO and NVC0_CB_AUX_TEX_INFO. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* clover: Pass work_dim parameter of clEnqueueNDRangeKernel() to driverHans de Goede2016-07-022-0/+8
| | | | | | | | | In order to implement get_work_dim() the driver may need to know the clEnqueueNDRangeKernel() work_dim parameter, so pass it to the driver. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* tgsi: Add WORK_DIM System ValueHans de Goede2016-07-023-0/+10
| | | | | | | | | | | Add a new WORK_DIM SV type, this is will return the grid dimensions (1-4) for compute (opencl) kernels. This is necessary to implement the opencl get_work_dim() function. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* mesa/main: fix error checking logic on CopyImageSubDataAlejandro Piñeiro2016-07-021-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For the case (both src or dst) where we had a texobject, but the texobject target was not the same that the method target, this spec paragraph was appplied: /* Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core * Profile spec says: * * "An INVALID_VALUE error is generated if either name does not * correspond to a valid renderbuffer or texture object according * to the corresponding target parameter." */ But for that case, the correct spec paragraph should be: /* Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core * Profile spec says: * * "An INVALID_ENUM error is generated if either target is * not RENDERBUFFER or a valid non-proxy texture target; * is TEXTURE_BUFFER or one of the cubemap face selectors * described in table 8.18; or if the target does not * match the type of the object." */ specifically the last sentence: "or if the target does not match the type of the object". This patch fixes the error returned (s/INVALID/ENUM) for that case, and moves up the INVALID_VALUE spec paragraph, as that case (invalid texture object) was handled before. Fixes: GL44-CTS.copy_image.target_miss_match Reviewed-by: Nicolai Hähnle <[email protected]>
* st/glsl_to_tgsi: don't increase immediate index by 1.Dave Airlie2016-07-021-1/+1
| | | | | | | | | Immediates are stored into a separate table, and are consolidated, so if we get an immediate we don't need to offset it as the index it has is correct. Cc: "11.2 12.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* st/mesa: get max supported number of image samples from driverIlia Mirkin2016-07-011-1/+5
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nvc0: fix up image support for allowing multiple samplesIlia Mirkin2016-07-017-49/+108
| | | | | | | | | Basically we just have to scale up the coordinates and then add the relevant sample offset. The code to handle this was already largely present from Christoph's earlier attempts to pipe images through back in the dark ages, this just hooks it all up. Signed-off-by: Ilia Mirkin <[email protected]>
* st/mesa: check the texture image level in st_texture_match_imageNicolai Hähnle2016-07-011-0/+3
| | | | | | | | Otherwise, 1x1 images of arbitrarily high level are accepted. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96639#add_comment Cc: 11.2 12.0 <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* st/mesa: an incomplete texture may have a zero-size first imageNicolai Hähnle2016-07-011-0/+3
| | | | | | | | | | | | | | | Fixes a regression introduced by commit 42624ea83 which triggered an assertion in dEQP-GLES2.functional.texture.completeness.cube.not_positive_level_0 While stImage must have a non-zero size as verified by the caller, we also look at the size of the base image in an attempt to make a better guess at the level0 size (this is important when the base image size is odd). However, the base image may have a zero size even when it exists. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96629 Cc: 12.0 <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* st/vdpau: use bicubic filter for scaling(v6.1)Nayan Deshmukh2016-07-013-14/+106
| | | | | | | | | | | | | | | | | | use bicubic filtering as high quality scaling L1. v2: fix a typo and add a newline to code v3: -render the unscaled image on a temporary surface (Christian) -apply noise reduction and sharpness filter on unscaled surface -render the final scaled surface using bicubic interpolation v4: support high quality scaling v5: set dst_area and dst_clip in bicubic filter v6: set buffer layer before setting dst_area v6.1: add PIPE_BIND_LINEAR when creating resource Signed-off-by: Nayan Deshmukh <[email protected]> Reviewed-by: Christian König <[email protected]>
* vl: add a bicubic interpolation filter(v5)Nayan Deshmukh2016-07-013-0/+528
| | | | | | | | | | | | | | | This is a shader based bicubic interpolater which uses cubic Hermite spline algorithm. v2: set dst_area and dst_clip during scaling (Christian) v3: clear the render target before rendering v4: intialize offsets while initializing shaders use a constant buffer to send dst_size to frag shader small changes to reduce calculation in shader v5: send half pixel offset instead of sending dst_size Signed-off-by: Nayan Deshmukh <[email protected]> Reviewed-by: Christian König <[email protected]>
* mesa/st: Use 'struct nir_shader' instead of 'nir_shader'.Vinson Lee2016-07-011-6/+6
| | | | | | | | | | | | | | | Fix this build error with GCC 4.4. CC state_tracker/st_nir_lower_builtin.lo In file included from state_tracker/st_nir_lower_builtin.c:61: state_tracker/st_nir.h:34: error: redefinition of typedef ‘nir_shader’ ../../src/compiler/nir/nir.h:1830: note: previous declaration of ‘nir_shader’ was here Suggested-by: Rob Clark <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96235 Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* docs: update MESA_DEBUG envvar documentation.Alejandro Piñeiro2016-07-011-2/+11
| | | | | | | | | | | | | | | | silent, flush, incomplete_tex and incomplete_fbo flags were not documented (see src/mesa/main.debug.c for more info). FP is not checked anymore. v2 (Brian Paul): * MESA_DEBUG accepts a comma-separated list of parameters. * Clarify how MESA_DEBUG behaves with mesa debug and release builds. * Updated wording. v3: Better wording for one paragraph (Brian Paul) Reviewed-by: Brian Paul <[email protected]>
* i965: intel_texture_barrier reimplementedAlejandro Piñeiro2016-07-011-1/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes: GL44-CTS.texture_barrier_ARB.same-texel-rw-multipass On Haswell, Broadwell and Skylake (note that in order to execute that test, it is needed to override GL and GLSL versions). On gen6 this test was already working without this change. It keeps working after it. This commit replaces the call to brw_emit_mi_flush for gen6+ with two calls to brw_emit_pipe_control_flush: * The first one with RENDER_TARGET_FLUSH and CS_STALL set to initiate a render cache flush after any concurrent rendering completes and cause the CS to stop parsing commands until the render cache becomes coherent with memory. * The second one have TEXTURE_CACHE_INVALIDATE set (and no CS stall) to clean up any stale data from the sampler caches before rendering continues. Didn't touch gen4-5, basically because I don't have a way to test them. More info on commits: 0aa4f99f562a05880a779707cbcd46be459863bf 72473658c51d5e074ce219c1e6385a4cce29f467 Thanks to Curro to help to tracking this down, as the root case was a hw race condition. v2: use two calls to pipe_control_flush instead of a combination of gen7_emit_cs_stall_flush and brw_emit_mi_flush calls (Curro) v3: no need to const cache invalidation (Curro) Reviewed-by: Francisco Jerez <[email protected]>
* nv30: go back to not using viewport validate function for swtnlIlia Mirkin2016-07-012-1/+16
| | | | | | | | | The output of draw requires a null viewport transform, which the regular code is ill-equiped to do. Reinstate the original settings in the render path, and add setting of the viewport clip polygon based on fb width/height (as that is all taken care of by draw). Signed-off-by: Ilia Mirkin <[email protected]>
* nv30: fix viewport clipping settings to be based on viewport, not rtIlia Mirkin2016-07-012-17/+11
| | | | | | | This fixes a ton of "*clip*" dEQP GLES2 tests, as well as triangle-guardband-viewport in piglit. Signed-off-by: Ilia Mirkin <[email protected]>
* gallium/util: check for window cliprects in util_can_blit_via_copy_region()Brian Paul2016-06-301-0/+1
| | | | | | We can't blit with resource_copy_region() if there are window clip rects. Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: Force blend color to 16-byte alignmentChuck Atkins2016-06-301-1/+11
| | | | | | | | | | | | | | | This aligns the 4-element color float array to 16 byte boundaries. This should allow compiler vectorizers to generate better optimizations. Also fixes broken vectorization generated by Intel compiler. v2: Fixed indentation and added a lengthy comment explaining the reason for the alignment. Cc: <[email protected]> Reported-by: Tim Rowley <[email protected]> Tested-by: Tim Rowley <[email protected]> Signed-off-by: Chuck Atkins <[email protected]> Acked-by: Roland Scheidegger <[email protected]>
* swr: Refactor checks for compiler feature flagsChuck Atkins2016-06-302-25/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | Encapsulate the test for which flags are needed to get a compiler to support certain features. Along with this, give various options to try for AVX and AVX2 support. Ideally we want to use specific instruction set feature flags, like -mavx2 for instance instead of -march=haswell, but the flags required for certain compilers are different. This allows, for AVX2 for instance, GCC to use -mavx2 -mfma -mbmi2 -mf16c while the Intel compiler which doesn't support those flags can fall back to using -march=core-avx2. This addresses a bug where the Intel compiler will silently ignore the AVX2 instruction feature flags and then potentially fail to build. v2: Pass preprocessor-check argument as true-state instead of false-state for clarity. v3: Reduce AVX2 define test to just __AVX2__. Additional defines suchas __FMA__, __BMI2__, and __F16C__ appear to be inconsistently defined w.r.t thier availability. v4: Fix C++11 flags being added globally and add more logic to swr_require_cxx_feature_flags Cc: <[email protected]> Reviewed-by: Tim Rowley <[email protected]> Tested-by: Tim Rowley <[email protected]> Signed-off-by: Chuck Atkins <[email protected]>
* st/wgl: make own_mutex() non-staticBrian Paul2016-06-302-4/+7
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* glsl: atomic counters are different than their uniformsAndres Gomez2016-06-301-37/+37
| | | | | | | | | | The linker deals with atomic counters in terms of uniforms but the data structure are called after the atomic counters. Renamed the data structures used in the linker for disambiguation. Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Andres Gomez <[email protected]>
* glsl: count atomic counters correctlyAndres Gomez2016-06-301-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the linker uses the uniform count for the total number of atomic counters. However uniforms don't include the innermost array dimension in their count, but atomic counters are expected to include them. Although the spec doesn't directly state this, it's clear how offsets will be assigned for arrays. From OpenGL 4.2 (Core Profile), page 98: " * Arrays of type atomic_uint are stored in memory by element order, with array element member zero at the lowest offset. The difference in offsets between each pair of elements in the array in basic machine units is referred to as the array stride, and is constant across the entire array. The stride can be queried by calling GetIntegerv with a pname of ATOMIC_COUNTER_- ARRAY_STRIDE after a program is linked." From that it is clear how arrays of atomic counters will interact with GL_MAX_ATOMIC_COUNTER_BUFFER_SIZE. For other kinds of uniforms it's also clear that each entry in an array counts against the relevant limits. Hence, although inferred, this is the expected behavior. Fixes GL44-CTS.arrays_of_arrays_gl.AtomicDeclaration Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Andres Gomez <[email protected]>
* svga: use SVGA3D_vgpu10_BufferCopy() for buffer copiesBrian Paul2016-06-301-4/+28
| | | | | | | | | | So that we do copies host-side rather than in the guest with map/memcpy. Tested with piglit arb_copy_buffer-subdata-sync test and new arb_copy_buffer-intra-buffer-copy test. Reviewed-by: Charmaine Lee <[email protected]> Acked-by: Roland Scheidegger <[email protected]>
* svga: add SVGA3D_vgpu10_BufferCopy()Brian Paul2016-06-302-0/+30
| | | | | Acked-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: flush buffers when mapping for readingBrian Paul2016-06-301-13/+24
| | | | | | | | | | | | | | | With host-side buffer copies (via SVGA3D_vgpu10_BufferCopy()) we have to make sure any pending map-write operations are completed before reading if the buffer is dirty. Otherwise the ReadbackSubResource operation could get stale data from the host buffer. This allows the piglit arb_copy_buffer-subdata-sync test to pass when we start using the SVGA3D_vgpu10_BufferCopy command. v2: check the sbuf->dirty flag in the outer conditional, per Charmaine. Acked-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: enable ARB_copy_image extension in the driverNeha Bhende2016-06-301-1/+2
| | | | | | Reviewed-by: Brian Paul <[email protected]> Acked-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>