path: root/src
Commit message | Author | Age | Files | Lines
* i965: don't drop const initializers in vector splitting | Rob Clark | 2016-07-02 | 1 | -0/+12
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: add driconf to zero-init uninitialized vars | Rob Clark | 2016-07-02 | 11 | -1/+34
    Some games are sloppy, perhaps because this is defined behavior for DX or perhaps because the nv blob driver defaults things to zero. So add a driconf param to force uninitialized variables to default to zero.

    This issue was observed with Rust, from the Steam store, but has surfaced elsewhere in the past.

    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno/ir3: support glsl linking for cmdline compiler | Rob Clark | 2016-07-02 | 1 | -24/+47
    For .vert/.frag, now multiple can be specified on the cmdline for purposes of linking, and the last one specified is the one that is fed into the ir3 backend (and dumped along the way if --verbose is specified)

    Without this, varyings in frag shaders would appear as undefined.

    Signed-off-by: Rob Clark <[email protected]>
* glsl/standalone: initialize MaxUserAssignableUniformLocations | Rob Clark | 2016-07-02 | 1 | -0/+4
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: update valid_buffer_range for SO buffers | Rob Clark | 2016-07-02 | 1 | -0/+5
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: support non-user_buffer consts | Rob Clark | 2016-07-02 | 2 | -3/+5
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a2xx: move setup/restore cmds into binning pass | Rob Clark | 2016-07-02 | 4 | -9/+4
    Rather than doing a separate submit at context create, move these cmds to before first tile, as is done on a3xx/a4xx. Otherwise state can be overwritten by other contexts.

    Signed-off-by: Rob Clark <[email protected]>
* freedreno: pass index buffer as a pipe_resource | Rob Clark | 2016-07-02 | 2 | -16/+16
    This will be useful in a following patch.

    Signed-off-by: Rob Clark <[email protected]>
* freedreno: switch emit_const_bo() to take prsc's | Rob Clark | 2016-07-02 | 4 | -17/+18
    We can push the unwrap of pipe_resource down.

    Signed-off-by: Rob Clark <[email protected]>
* nv30: Fix "array subscript is below array bounds" compiler warning | Hans de Goede | 2016-07-02 | 1 | -2/+1
    gcc6 does not like the trick where we point to one entry before the array start and then start a while with a pre-increment.

    Signed-off-by: Hans de Goede <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
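The idiom behind this warning can be sketched as follows. This is a minimal illustration under stated assumptions, not the nv30 code: `struct entry`, `lookup()` and the `-1` sentinel key are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry; not the nv30 data structure. */
struct entry {
    int key;
    int value;
};

/*
 * Pattern that gcc 6 complains about, shown only as a comment because
 * forming the pointer "table - 1" is already undefined behavior in C:
 *
 *     const struct entry *e = table - 1;
 *     while ((++e)->key != -1) { ... }
 *
 * The loop below visits the same entries but starts at the first one.
 */
int lookup(const struct entry *table, int key, int fallback)
{
    assert(table != NULL);

    /* The table is terminated by an entry whose key is -1. */
    for (const struct entry *e = table; e->key != -1; e++) {
        if (e->key == key)
            return e->value;
    }
    return fallback;
}
```

Starting the pointer inside the array keeps every intermediate pointer value valid, which is what lets the compiler's bounds analysis stay quiet.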
* nouveau: Fix a couple of "foo may be used uninitialized" compiler warnings | Hans de Goede | 2016-07-02 | 2 | -3/+3
    These are all new false positives with gcc6.

    In nouveau_compiler.c: gcc6 no longer assumes that passing a pointer to a variable into a function initialises that variable.

    In nv50_ir_from_tgsi.cpp: op and mode are not set if there are 0 enabled dst channels. This never happens, but gcc cannot know this.

    Signed-off-by: Hans de Goede <[email protected]>
    Acked-by: Ilia Mirkin <[email protected]>
* nouveau: Fix gcc6 / c++11 auto_ptr deprecation compiler warnings | Hans de Goede | 2016-07-02 | 1 | -0/+4
    Signed-off-by: Hans de Goede <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* nouveau: Add support for SV_WORK_DIM | Hans de Goede | 2016-07-02 | 8 | -12/+29
    Add support for SV_WORK_DIM for nvc0 and nve4.

    Signed-off-by: Hans de Goede <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: Make NVC0_CB_AUX_GRID_INFO take an index argument | Hans de Goede | 2016-07-02 | 3 | -4/+4
    This brings it in line with the other macros like NVC0_CB_AUX_UBO_INFO and NVC0_CB_AUX_TEX_INFO.

    Signed-off-by: Hans de Goede <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* clover: Pass work_dim parameter of clEnqueueNDRangeKernel() to driver | Hans de Goede | 2016-07-02 | 2 | -0/+8
    In order to implement get_work_dim() the driver may need to know the clEnqueueNDRangeKernel() work_dim parameter, so pass it to the driver.

    Signed-off-by: Hans de Goede <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* tgsi: Add WORK_DIM System Value | Hans de Goede | 2016-07-02 | 3 | -0/+10
    Add a new WORK_DIM SV type. This will return the grid dimensions (1-4) for compute (opencl) kernels. This is necessary to implement the opencl get_work_dim() function.

    Signed-off-by: Hans de Goede <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* mesa/main: fix error checking logic on CopyImageSubData | Alejandro Piñeiro | 2016-07-02 | 1 | -5/+10
    For the case (both src or dst) where we had a texobject, but the texobject target was not the same as the method target, this spec paragraph was applied:

        /* Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core
         * Profile spec says:
         *
         *     "An INVALID_VALUE error is generated if either name does not
         *     correspond to a valid renderbuffer or texture object according
         *     to the corresponding target parameter."
         */

    But for that case, the correct spec paragraph should be:

        /* Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core
         * Profile spec says:
         *
         *     "An INVALID_ENUM error is generated if either target is
         *     not RENDERBUFFER or a valid non-proxy texture target;
         *     is TEXTURE_BUFFER or one of the cubemap face selectors
         *     described in table 8.18; or if the target does not
         *     match the type of the object."
         */

    specifically the last sentence: "or if the target does not match the type of the object".

    This patch fixes the error returned (INVALID_VALUE becomes INVALID_ENUM) for that case, and moves up the INVALID_VALUE spec paragraph, as that case (invalid texture object) was handled before.

    Fixes: GL44-CTS.copy_image.target_miss_match

    Reviewed-by: Nicolai Hähnle <[email protected]>
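The ordering of the two checks described in this commit can be sketched roughly as below. This is an illustration only, not the Mesa implementation: `enum copy_error`, `struct tex_object_sketch` and `check_copy_image_target()` are hypothetical stand-ins, and the numeric target values used by callers are arbitrary.

```c
#include <stddef.h>

/* Hypothetical stand-ins for GL_NO_ERROR, GL_INVALID_VALUE, GL_INVALID_ENUM. */
enum copy_error {
    COPY_NO_ERROR,
    COPY_INVALID_VALUE,
    COPY_INVALID_ENUM
};

/* Hypothetical, reduced texture object: only the target it was created with. */
struct tex_object_sketch {
    unsigned target;
};

enum copy_error
check_copy_image_target(const struct tex_object_sketch *obj, unsigned target)
{
    /* The name does not correspond to a valid object: INVALID_VALUE. */
    if (obj == NULL)
        return COPY_INVALID_VALUE;

    /* The object exists, but the target does not match its type:
     * INVALID_ENUM, per the last clause of the second spec paragraph. */
    if (obj->target != target)
        return COPY_INVALID_ENUM;

    return COPY_NO_ERROR;
}
```

The point of the sketch is only the mapping from each failure case to its error code; the real function performs many more checks.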
* st/glsl_to_tgsi: don't increase immediate index by 1. | Dave Airlie | 2016-07-02 | 1 | -1/+1
    Immediates are stored into a separate table, and are consolidated, so if we get an immediate we don't need to offset it as the index it has is correct.

    Cc: "11.2 12.0" <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* st/mesa: get max supported number of image samples from driver | Ilia Mirkin | 2016-07-01 | 1 | -1/+5
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* nvc0: fix up image support for allowing multiple samples | Ilia Mirkin | 2016-07-01 | 7 | -49/+108
    Basically we just have to scale up the coordinates and then add the relevant sample offset. The code to handle this was already largely present from Christoph's earlier attempts to pipe images through back in the dark ages, this just hooks it all up.

    Signed-off-by: Ilia Mirkin <[email protected]>
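"Scale up the coordinates and then add the relevant sample offset" can be illustrated with a small sketch. The 2x2 sample layout, the table contents and the names `struct texel_pos`, `sample_offset_4x` and `ms4_texel()` are assumptions made for the example; the actual nvc0 layout and code are not taken from this log.

```c
#include <assert.h>

struct texel_pos {
    int x;
    int y;
};

/* Assumed placement of the 4 samples inside one pixel (2 wide, 2 tall). */
static const struct texel_pos sample_offset_4x[4] = {
    { 0, 0 }, { 1, 0 }, { 0, 1 }, { 1, 1 },
};

/* Map a pixel coordinate plus sample index to a position in the
 * underlying surface that stores every sample as its own texel. */
struct texel_pos ms4_texel(int x, int y, unsigned sample)
{
    assert(sample < 4);

    struct texel_pos p;
    p.x = x * 2 + sample_offset_4x[sample].x;  /* scale, then offset */
    p.y = y * 2 + sample_offset_4x[sample].y;
    return p;
}
```

With this assumed layout, sample 2 of pixel (3, 5) lands at texel (6, 11).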
* st/mesa: check the texture image level in st_texture_match_image | Nicolai Hähnle | 2016-07-01 | 1 | -0/+3
    Otherwise, 1x1 images of arbitrarily high level are accepted.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96639#add_comment
    Cc: 11.2 12.0 <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
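Why a size comparison alone accepts such images can be seen from how mip level sizes are derived. The sketch below is an assumption-laden illustration, not st_texture_match_image() itself: `minify()`, `size_matches()` and `image_matches()` are hypothetical helpers.

```c
#include <stdbool.h>

/* Size of a mip level: halve per level, but never go below 1. */
unsigned minify(unsigned size, unsigned level)
{
    unsigned s = level >= 32 ? 0 : size >> level;
    return s ? s : 1;
}

/* Size-only comparison: every level past the last one still yields 1x1. */
bool size_matches(unsigned base_w, unsigned base_h, unsigned level,
                  unsigned w, unsigned h)
{
    return minify(base_w, level) == w && minify(base_h, level) == h;
}

/* Same comparison with the level itself checked first. */
bool image_matches(unsigned base_w, unsigned base_h, unsigned last_level,
                   unsigned level, unsigned w, unsigned h)
{
    if (level > last_level)
        return false;
    return size_matches(base_w, base_h, level, w, h);
}
```

For a 4x4 texture whose last level is 2, a 1x1 image claiming level 5 passes `size_matches()` but is rejected by `image_matches()`.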
* st/mesa: an incomplete texture may have a zero-size first image | Nicolai Hähnle | 2016-07-01 | 1 | -0/+3
    Fixes a regression introduced by commit 42624ea83 which triggered an assertion in dEQP-GLES2.functional.texture.completeness.cube.not_positive_level_0

    While stImage must have a non-zero size as verified by the caller, we also look at the size of the base image in an attempt to make a better guess at the level0 size (this is important when the base image size is odd). However, the base image may have a zero size even when it exists.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96629
    Cc: 12.0 <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* st/vdpau: use bicubic filter for scaling(v6.1) | Nayan Deshmukh | 2016-07-01 | 3 | -14/+106
    use bicubic filtering as high quality scaling L1.

    v2: fix a typo and add a newline to code
    v3: -render the unscaled image on a temporary surface (Christian)
        -apply noise reduction and sharpness filter on unscaled surface
        -render the final scaled surface using bicubic interpolation
    v4: support high quality scaling
    v5: set dst_area and dst_clip in bicubic filter
    v6: set buffer layer before setting dst_area
    v6.1: add PIPE_BIND_LINEAR when creating resource

    Signed-off-by: Nayan Deshmukh <[email protected]>
    Reviewed-by: Christian König <[email protected]>
* vl: add a bicubic interpolation filter(v5) | Nayan Deshmukh | 2016-07-01 | 3 | -0/+528
    This is a shader based bicubic interpolator which uses cubic Hermite spline algorithm.

    v2: set dst_area and dst_clip during scaling (Christian)
    v3: clear the render target before rendering
    v4: initialize offsets while initializing shaders
        use a constant buffer to send dst_size to frag shader
        small changes to reduce calculation in shader
    v5: send half pixel offset instead of sending dst_size

    Signed-off-by: Nayan Deshmukh <[email protected]>
    Reviewed-by: Christian König <[email protected]>
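For readers unfamiliar with the technique, a CPU-side sketch of bicubic interpolation built on a cubic Hermite spline follows. It assumes the Catmull-Rom choice of tangents, which this log does not confirm for the vl shader, and the names `cubic_hermite()` and `bicubic()` are made up for the example.

```c
/* 1-D cubic Hermite interpolation between p1 and p2 for t in [0, 1],
 * using Catmull-Rom tangents (an assumption): the slope at p1 is
 * (p2 - p0) / 2 and the slope at p2 is (p3 - p1) / 2. */
float cubic_hermite(float p0, float p1, float p2, float p3, float t)
{
    float a = -0.5f * p0 + 1.5f * p1 - 1.5f * p2 + 0.5f * p3;
    float b = p0 - 2.5f * p1 + 2.0f * p2 - 0.5f * p3;
    float c = -0.5f * p0 + 0.5f * p2;

    /* Horner evaluation of a*t^3 + b*t^2 + c*t + p1. */
    return ((a * t + b) * t + c) * t + p1;
}

/* 2-D bicubic: interpolate each of the 4 rows along x, then
 * interpolate the 4 row results along y. p[row][column] is the 4x4
 * neighbourhood around the sample position. */
float bicubic(float p[4][4], float tx, float ty)
{
    float rows[4];

    for (int j = 0; j < 4; j++)
        rows[j] = cubic_hermite(p[j][0], p[j][1], p[j][2], p[j][3], tx);

    return cubic_hermite(rows[0], rows[1], rows[2], rows[3], ty);
}
```

The spline passes through `p1` at `t = 0` and `p2` at `t = 1`, and it reproduces linear ramps exactly, which makes it easy to sanity-check.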
* mesa/st: Use 'struct nir_shader' instead of 'nir_shader'. | Vinson Lee | 2016-07-01 | 1 | -6/+6
    Fix this build error with GCC 4.4.

      CC state_tracker/st_nir_lower_builtin.lo
    In file included from state_tracker/st_nir_lower_builtin.c:61:
    state_tracker/st_nir.h:34: error: redefinition of typedef ‘nir_shader’
    ../../src/compiler/nir/nir.h:1830: note: previous declaration of ‘nir_shader’ was here

    Suggested-by: Rob Clark <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96235
    Signed-off-by: Vinson Lee <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* i965: intel_texture_barrier reimplemented | Alejandro Piñeiro | 2016-07-01 | 1 | -1/+20
    Fixes GL44-CTS.texture_barrier_ARB.same-texel-rw-multipass on Haswell, Broadwell and Skylake (note that executing that test requires overriding the GL and GLSL versions). On gen6 this test was already working without this change, and it keeps working after it.

    This commit replaces the call to brw_emit_mi_flush for gen6+ with two calls to brw_emit_pipe_control_flush:

     * The first one with RENDER_TARGET_FLUSH and CS_STALL set, to initiate a render cache flush after any concurrent rendering completes and cause the CS to stop parsing commands until the render cache becomes coherent with memory.

     * The second one with TEXTURE_CACHE_INVALIDATE set (and no CS stall), to clean up any stale data from the sampler caches before rendering continues.

    Didn't touch gen4-5, basically because I don't have a way to test them.

    More info on commits:
     0aa4f99f562a05880a779707cbcd46be459863bf
     72473658c51d5e074ce219c1e6385a4cce29f467

    Thanks to Curro for helping to track this down, as the root cause was a hw race condition.

    v2: use two calls to pipe_control_flush instead of a combination of gen7_emit_cs_stall_flush and brw_emit_mi_flush calls (Curro)
    v3: no need for const cache invalidation (Curro)

    Reviewed-by: Francisco Jerez <[email protected]>
* nv30: go back to not using viewport validate function for swtnl | Ilia Mirkin | 2016-07-01 | 2 | -1/+16
    The output of draw requires a null viewport transform, which the regular code is ill-equipped to do. Reinstate the original settings in the render path, and add setting of the viewport clip polygon based on fb width/height (as that is all taken care of by draw).

    Signed-off-by: Ilia Mirkin <[email protected]>
* nv30: fix viewport clipping settings to be based on viewport, not rt | Ilia Mirkin | 2016-07-01 | 2 | -17/+11
    This fixes a ton of "*clip*" dEQP GLES2 tests, as well as triangle-guardband-viewport in piglit.

    Signed-off-by: Ilia Mirkin <[email protected]>
* gallium/util: check for window cliprects in util_can_blit_via_copy_region() | Brian Paul | 2016-06-30 | 1 | -0/+1
    We can't blit with resource_copy_region() if there are window clip rects.

    Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: Force blend color to 16-byte alignment | Chuck Atkins | 2016-06-30 | 1 | -1/+11
    This aligns the 4-element color float array to 16 byte boundaries. This should allow compiler vectorizers to generate better optimizations. Also fixes broken vectorization generated by Intel compiler.

    v2: Fixed indentation and added a lengthy comment explaining the reason for the alignment.

    Cc: <[email protected]>
    Reported-by: Tim Rowley <[email protected]>
    Tested-by: Tim Rowley <[email protected]>
    Signed-off-by: Chuck Atkins <[email protected]>
    Acked-by: Roland Scheidegger <[email protected]>
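One portable way to express this kind of alignment is C11 `alignas`, sketched below. This is not the change made in the commit, whose mechanism is not shown in this log; `struct blend_color_sketch` and `is_aligned_16()` are hypothetical names.

```c
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical stand-in for a blend color state structure. */
struct blend_color_sketch {
    /* Four floats are 16 bytes; aligning them to 16 bytes lets a
     * vectorizer load the whole color with one aligned SIMD access. */
    alignas(16) float color[4];
};

/* Returns nonzero when the pointer sits on a 16-byte boundary. */
int is_aligned_16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}
```

Because the member carries the alignment requirement, every instance of the structure inherits it, whether it lives on the stack or inside another structure.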
* swr: Refactor checks for compiler feature flags | Chuck Atkins | 2016-06-30 | 1 | -2/+2
    Encapsulate the test for which flags are needed to get a compiler to support certain features. Along with this, give various options to try for AVX and AVX2 support. Ideally we want to use specific instruction set feature flags, like -mavx2 for instance instead of -march=haswell, but the flags required for certain compilers are different. This allows, for AVX2 for instance, GCC to use -mavx2 -mfma -mbmi2 -mf16c while the Intel compiler which doesn't support those flags can fall back to using -march=core-avx2.

    This addresses a bug where the Intel compiler will silently ignore the AVX2 instruction feature flags and then potentially fail to build.

    v2: Pass preprocessor-check argument as true-state instead of false-state for clarity.
    v3: Reduce AVX2 define test to just __AVX2__. Additional defines such as __FMA__, __BMI2__, and __F16C__ appear to be inconsistently defined w.r.t. their availability.
    v4: Fix C++11 flags being added globally and add more logic to swr_require_cxx_feature_flags

    Cc: <[email protected]>
    Reviewed-by: Tim Rowley <[email protected]>
    Tested-by: Tim Rowley <[email protected]>
    Signed-off-by: Chuck Atkins <[email protected]>
* st/wgl: make own_mutex() non-static | Brian Paul | 2016-06-30 | 2 | -4/+7
    Reviewed-by: Jose Fonseca <[email protected]>
* glsl: atomic counters are different than their uniforms | Andres Gomez | 2016-06-30 | 1 | -37/+37
    The linker deals with atomic counters in terms of uniforms, but the data structures are named after the atomic counters. Renamed the data structures used in the linker for disambiguation.

    Reviewed-by: Timothy Arceri <[email protected]>
    Signed-off-by: Andres Gomez <[email protected]>
* glsl: count atomic counters correctly | Andres Gomez | 2016-06-30 | 1 | -4/+10
    Currently the linker uses the uniform count for the total number of atomic counters. However uniforms don't include the innermost array dimension in their count, but atomic counters are expected to include them.

    Although the spec doesn't directly state this, it's clear how offsets will be assigned for arrays.

    From OpenGL 4.2 (Core Profile), page 98:

        " * Arrays of type atomic_uint are stored in memory by element order, with array element member zero at the lowest offset. The difference in offsets between each pair of elements in the array in basic machine units is referred to as the array stride, and is constant across the entire array. The stride can be queried by calling GetIntegerv with a pname of ATOMIC_COUNTER_ARRAY_STRIDE after a program is linked."

    From that it is clear how arrays of atomic counters will interact with GL_MAX_ATOMIC_COUNTER_BUFFER_SIZE.

    For other kinds of uniforms it's also clear that each entry in an array counts against the relevant limits.

    Hence, although inferred, this is the expected behavior.

    Fixes GL44-CTS.arrays_of_arrays_gl.AtomicDeclaration

    Reviewed-by: Timothy Arceri <[email protected]>
    Signed-off-by: Andres Gomez <[email protected]>
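The difference between the two counts can be made concrete with a small sketch. It rests on one reading of the message, namely that an array of arrays contributes one uniform entry per outer element while every element is an atomic counter; `uniform_entries()` and `atomic_counters()` are hypothetical helpers, not linker functions.

```c
/* Number of uniform entries for an array type: the product of all
 * dimensions except the innermost one (assumed reading of the commit). */
unsigned uniform_entries(const unsigned *dims, unsigned num_dims)
{
    unsigned n = 1;

    for (unsigned i = 0; i + 1 < num_dims; i++)
        n *= dims[i];
    return n;
}

/* Number of atomic counters: every element counts, so the innermost
 * dimension is included in the product. */
unsigned atomic_counters(const unsigned *dims, unsigned num_dims)
{
    unsigned n = 1;

    for (unsigned i = 0; i < num_dims; i++)
        n *= dims[i];
    return n;
}
```

For a declaration shaped like `atomic_uint a[2][3]`, the first count is 2 while the second is 6, and it is the second that should be compared against the atomic counter limits.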
* svga: use SVGA3D_vgpu10_BufferCopy() for buffer copies | Brian Paul | 2016-06-30 | 1 | -4/+28
    So that we do copies host-side rather than in the guest with map/memcpy.

    Tested with piglit arb_copy_buffer-subdata-sync test and new arb_copy_buffer-intra-buffer-copy test.

    Reviewed-by: Charmaine Lee <[email protected]>
    Acked-by: Roland Scheidegger <[email protected]>
* svga: add SVGA3D_vgpu10_BufferCopy() | Brian Paul | 2016-06-30 | 2 | -0/+30
    Acked-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: flush buffers when mapping for reading | Brian Paul | 2016-06-30 | 1 | -13/+24
    With host-side buffer copies (via SVGA3D_vgpu10_BufferCopy()) we have to make sure any pending map-write operations are completed before reading if the buffer is dirty. Otherwise the ReadbackSubResource operation could get stale data from the host buffer.

    This allows the piglit arb_copy_buffer-subdata-sync test to pass when we start using the SVGA3D_vgpu10_BufferCopy command.

    v2: check the sbuf->dirty flag in the outer conditional, per Charmaine.

    Acked-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: enable ARB_copy_image extension in the driver | Neha Bhende | 2016-06-30 | 1 | -1/+2
    Reviewed-by: Brian Paul <[email protected]>
    Acked-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: try blitting with copy region in more cases | Brian Paul | 2016-06-30 | 1 | -1/+7
    We previously could do blits with util_resource_copy_region() when doing 'loose' format checking. Also do blits with util_resource_copy_region() when the blit src/dst formats (not the underlying resources) exactly match. Needed for GL_ARB_copy_image.

    Acked-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: use copy_region_vgpu10() for region copies when possible | Brian Paul | 2016-06-30 | 1 | -4/+37
    v2: remove extra svga_define_texture_level() call, per Charmaine.

    Acked-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: use vgpu10 CopyRegion command when possible | Neha Bhende | 2016-06-30 | 1 | -2/+147
    Do texture->texture copies host-side with this command when possible. Use the previous software fallback otherwise.

    Reviewed-by: Brian Paul <[email protected]>
    Acked-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: set render target flag for snorm surfaces | Brian Paul | 2016-06-30 | 1 | -0/+10
    We don't normally support rendering to SNORM surfaces, but with GL_ARB_copy_image we can copy to them if we treat them as typeless and use a UNORM surface view.

    Acked-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: add new svga_format_is_uncompressed_snorm() helper | Brian Paul | 2016-06-30 | 2 | -0/+24
    Acked-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: adjust sampler view format for RGBX | Brian Paul | 2016-06-30 | 1 | -1/+5
    We previously handled the case of a RGBX sampler view of a RGBA surface. Add the reverse case too. For GL_ARB_copy_image.

    Acked-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: adjust render target view format for RGBX | Brian Paul | 2016-06-30 | 1 | -1/+13
    For GL_ARB_copy_image we may be asked to create an RGBA view of a RGBX surface. Use an RGBX view format for that case.

    Acked-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: don't advertise support for R32G32B32_UINT/SINT surface formats | Neha Bhende | 2016-06-30 | 1 | -2/+2
    We want to be able to copy between different 32-bit, 3-channel surface formats for GL_ARB_copy_image but since we don't support R32G32B32_FLOAT for textures (it's not blendable and wouldn't work for render to texture) we can't support 32-bit, 3-channel integer formats. The state tracker will choose 4-channel formats instead.

    Fixes the piglit arb_copy_image-format test for several cases.

    Note: This change may need to be revisited if/when the texture_view extension is enabled in the driver.

    Reviewed-by: Brian Paul <[email protected]>
    Acked-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: use untyped surface formats in most cases | Brian Paul | 2016-06-30 | 1 | -4/+7
    This allows us to do copies between different, but compatible, surface formats such as RGBA8_UNORM, RGBA8_SINT, RGBA8_UINT, etc. for GL_ARB_copy_image.

    Acked-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Charmaine Lee <[email protected]>
* gallium/util: add tight_format_check param to util_can_blit_via_copy_region() | Brian Paul | 2016-06-30 | 2 | -11/+30
    The VMware driver will use this for implementing GL_ARB_copy_image.

    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Charmaine Lee <[email protected]>
* gallium/util: simplify a few things in util_can_blit_via_copy_region() | Brian Paul | 2016-06-30 | 1 | -12/+8
    Since only the src box can have negative dims for flipping, just comparing the src/dst box sizes is enough to detect flips.

    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Charmaine Lee <[email protected]>
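The reasoning in this message can be sketched with a reduced 2-D box. The structure and function names below are hypothetical, and the sketch assumes, as the message states, that destination dimensions are never negative.

```c
#include <stdbool.h>

/* Hypothetical, reduced box: the real gallium box also has z and depth. */
struct box_sketch {
    int x, y;
    int width, height;  /* a negative src dimension requests a flip */
};

/* A blit can only be replaced by a plain copy when it neither scales
 * nor flips. With dst dimensions assumed non-negative, a flipped src
 * (negative width or height) can never equal the dst size, so a single
 * equality test rejects both scaling and flipping. */
bool sizes_allow_copy(const struct box_sketch *src,
                      const struct box_sketch *dst)
{
    return src->width == dst->width && src->height == dst->height;
}
```

A horizontally flipped 4x4 source, with width -4, fails the comparison against a 4x4 destination without any separate sign check.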
* gallium/util: new util_try_blit_via_copy_region() function | Brian Paul | 2016-06-30 | 2 | -15/+32
    Pulled out of the util_try_blit_via_copy_region() function. Subsequent changes build on this.

    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Charmaine Lee <[email protected]>