path: root/src/gallium
Commit message | Author | Age | Files | Lines
* freedreno/a5xx: align height to GMEM | Rob Clark | 2017-10-02 | 1 | -1/+5
| | | | | | | | | | | | | | | | | Similar to the way width/pitch alignment works, it seems like we need to do similar for height. Otherwise the BLIT from system memory to GMEM can over-fetch beyond the end of the buffer, triggering a fault. I'm not sure if there is a better solution yet. Possibly we could fall back to pre-a5xx style DRAW packets for cases where BLIT might over-fetch. (We in theory have that problem already with rendering to higher mipmap levels, although fortunately those tend to use GMEM bypass.) This fixes issues reported with glamor. Reported-by: [email protected] Cc: 17.2 <[email protected]> Signed-off-by: Rob Clark <[email protected]>
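As a rough illustration of the height alignment described in this commit (a sketch, not the driver's actual code), the height can be rounded up to a multiple of some alignment in the same way a pitch would be. The helper name `align_up` and the alignment value used in the example are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not taken from the Mesa source): round value up
 * to the next multiple of alignment. alignment must be a non-zero power
 * of two. */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}
```

With an assumed alignment of 16, `align_up(1080, 16)` gives 1088, so a BLIT that fetches whole aligned rows stays inside the padded allocation.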
* radeonsi: adjust clip discard based on line width / point size | Nicolai Hähnle | 2017-10-02 | 3 | -11/+27
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: remove si_context::{scissor_enabled,clip_halfz} | Nicolai Hähnle | 2017-10-02 | 3 | -26/+24
| | | | | | They are just copies of the rasterizer state. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: simplify the signature of si_update_vs_writes_viewport_index | Nicolai Hähnle | 2017-10-02 | 3 | -7/+6
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move current_rast_prim into si_context | Nicolai Hähnle | 2017-10-02 | 6 | -15/+11
| | | | | | v2: rebase fixes Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move and rename scissor and viewport state and functions | Nicolai Hähnle | 2017-10-02 | 10 | -182/+184
| | | | | | v2: change GET_MAX_SCISSOR to SI_MAX_SCISSOR Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: remove si_apply_scissor_bug_workaround | Nicolai Hähnle | 2017-10-02 | 2 | -19/+0
| | | | | | It only affects pre-SI chips. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move r600_viewport.c to si_viewport.c | Nicolai Hähnle | 2017-10-02 | 3 | -2/+2
| | | | | | | This is purely a file-move + #include fixup + build system changes. Other cleanups will follow in subsequent commits. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix maximum advertised point size / line width | Nicolai Hähnle | 2017-10-02 | 2 | -8/+3
| | | | | | | | | | The hardware registers store the half-size/width in 12.4 fixed point format, so 8192 is the maximum. Fixes dEQP-GLES3.functional.rasterization.* Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>
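The limit in this commit follows from the arithmetic of the format: 12 integer bits plus 4 fractional bits give a largest raw value of 0xFFFF, i.e. a half-size of 4095.9375 and a full size just under 8192. A minimal sketch of that encoding, assuming an unsigned 16-bit field and a saturating conversion (the function names are hypothetical and not taken from radeonsi):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoder: store half of a full point size / line width as
 * unsigned 12.4 fixed point, saturating at the 16-bit limit. */
static uint16_t encode_half_size_12_4(float full_size)
{
    assert(full_size >= 0.0f);
    float scaled = full_size * 0.5f * 16.0f; /* half-size, 4 fractional bits */
    if (scaled >= 65535.0f)
        return 0xFFFF;
    return (uint16_t)scaled;
}

/* Hypothetical decoder: recover the full size from the raw register value. */
static float decode_full_size_12_4(uint16_t raw)
{
    return (float)raw / 16.0f * 2.0f;
}
```

Under these assumptions a size of 1.0 encodes to 8, and any size of 8192 or more saturates at 0xFFFF, which decodes back to 8191.875.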
* radeonsi: deduce rast_prim correctly for tessellation point mode | Nicolai Hähnle | 2017-10-02 | 1 | -3/+6
| | | | | | | | Together with the previous patches, this fixes dEQP-GLES31.functional.primitive_bounding_box.wide_points.* Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: don't discard points and lines | Nicolai Hähnle | 2017-10-02 | 2 | -2/+26
| | | | | | | | | This is a bit conservative, but a more precise solution requires access to the rasterizer state. This is something to tackle after the fork between r600 and radeonsi. Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move current_rast_prim to r600_common_context | Nicolai Hähnle | 2017-10-02 | 5 | -9/+13
| | | | | | | | | | We'll use it in the scissors / clip / guardband state. v2: avoid a performance regression on r600 when applied to (pre-fork) stable branches Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_FORMAT_R10G10B10X2_UNORM | Nicolai Hähnle | 2017-10-02 | 4 | -0/+10
| | | | Reviewed-by: Marek Olšák <[email protected]>
* freedreno: fix PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE | Rob Clark | 2017-10-02 | 2 | -4/+5
| | | | | | | | Fixes an assert in fd_acc_query_register_provider() about query provider not already registered. Fixes: 3f6b3d9d ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE") Signed-off-by: Rob Clark <[email protected]>
* radeonsi: fix a regression in integer cube map handling | Nicolai Hähnle | 2017-10-02 | 1 | -8/+26
| | | | | | | | | | | | A recent commit fixed the case of 8888 integer cube maps, which need the workaround of replacing the data format with USCALED/SSCALED. However, this broke the case of non-8888 integer cube maps; those still need the fix of shifting the texture coordinates. Fixes KHR-GL45.texture_gather.plain-gather-int-cube-array and similar. Fixes: 6fb0c1013b35 ("radeonsi: workaround for gather4 on integer cube maps") Reviewed-by: Marek Olšák <[email protected]>
* amd/common: move ac_build_phi from radeonsi | Nicolai Hähnle | 2017-10-02 | 1 | -17/+3
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: don't use the template keyword | Marek Olšák | 2017-09-30 | 1 | -7/+7
| | | | | | for C++ editors Reviewed-by: Brian Paul <[email protected]>
* gallium/vl: don't use the template keyword | Marek Olšák | 2017-09-30 | 1 | -14/+14
| | | | | | for C++ editors Reviewed-by: Brian Paul <[email protected]>
* radeonsi/uvd: clean up si_video_buffer_create | Benedikt Schemmer | 2017-09-30 | 1 | -30/+17
| | | | | | V2: remove code duplication and one unnecessary variable, minor whitespace fix Signed-off-by: Marek Olšák <[email protected]>
* radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bb | Marek Olšák | 2017-09-30 | 1 | -3/+8
| | | | | Tested-by: Benedikt Schemmer <[email protected]> Reviewed-by: Christian König <[email protected]>
* gallium: add new LOD opcode | Roland Scheidegger | 2017-09-30 | 5 | -5/+74
| | | | | | | | | | The operation performed is all the same as LODQ, but with the usual differences between dx10 and GL texture opcodes, that is separate resource and sampler indices (plus result swizzling, and setting z/w channels to zero). Reviewed-by: Jose Fonseca <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* st/va: add dst rect to avoid scale on deint | Leo Liu | 2017-09-29 | 1 | -6/+6
| | | | | | | | | | | For 1080p video transcode, the height will be scaled to 1088 when deint to progressive buffer. Set dst rect to make sure no scale. Fixes: 3ad8687 "st/va: use new vl_compositor_yuv_deint_full() to deint" Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Acked-by: Andy Furniss <[email protected]>
* radeonsi: emit DLDEXP and DFRACEXP TGSI opcodes | Nicolai Hähnle | 2017-09-29 | 2 | -1/+26
| | | | | | | | | Note: this causes spurious regressions in some current piglit tests, because the tests incorrectly assume that there is no denorm support for doubles. I'm going to send out a fix for those tests as well. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: emit LDEXP opcode | Nicolai Hähnle | 2017-09-29 | 2 | -1/+3
| | | | | | | | The LLVM intrinsic has existed for a long time. The current name was established in LLVM 3.9. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* gallium: add LDEXP TGSI instruction and corresponding cap | Nicolai Hähnle | 2017-09-29 | 20 | -3/+50
| | | | | Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* tgsi: infer that dst[1] of DFRACEXP is an integer | Nicolai Hähnle | 2017-09-29 | 5 | -6/+9
| | | | | Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* gallivm: add support for TGSI instructions with two outputs | Nicolai Hähnle | 2017-09-29 | 3 | -1/+31
| | | | | Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* gallivm: add dst register index to lp_build_tgsi_context::emit_store | Nicolai Hähnle | 2017-09-29 | 6 | -20/+27
| | | | | Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* tgsi: clarify the semantics of DFRACEXP | Nicolai Hähnle | 2017-09-29 | 4 | -22/+20
|
| The status quo is quite the mess:
|
| 1. tgsi_exec will do a per-channel computation, and store the dst[0] result (significand) correctly for each channel. The dst[1] result (exponent) will be written to the first bit set in the writemask. So per-component calculation only works partially.
| 2. r600 will only do a single computation. It will replicate the exponent but not the significand.
| 3. The docs pretend that there's per-component calculation, but even get dst[0] and dst[1] confused.
| 4. Luckily, st_glsl_to_tgsi only ever emits single-component instructions, and kind-of assumes that everything is replicated, generating this for the dvec4 case:
|
|      DFRACEXP TEMP[0].xy, TEMP[1].x, CONST[0][0].xyxy
|      DFRACEXP TEMP[0].zw, TEMP[1].y, CONST[0][0].zwzw
|      DFRACEXP TEMP[2].xy, TEMP[1].z, CONST[0][1].xyxy
|      DFRACEXP TEMP[2].zw, TEMP[1].w, CONST[0][1].zwzw
|
| Settle on the simplest behavior, which is single-component calculation with replication, document it, and adjust tgsi_exec and r600.
|
| Reviewed-by: Marek Olšák <[email protected]>
| Tested-by: Dieter Nützel <[email protected]>
* tgsi: fix the documentation of DLDEXP | Nicolai Hähnle | 2017-09-29 | 1 | -1/+1
| | | | | | | | | Sourcing the exponent for the zw destination pair from Z is consistent with both tgsi_exec and gallivm. In practice, st_glsl_to_tgsi always generates per-channel instructions anyway. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* tgsi: infer that DLDEXP's second source has an integer type | Nicolai Hähnle | 2017-09-29 | 4 | -7/+11
| | | | | Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* r600: cleanup set_occlusion_query_state | Nicolai Hähnle | 2017-09-29 | 3 | -14/+3
| | | | | | | | | | | This fixes a warning caused by the fork (note the change in the function signature): ../../../../../mesa-src/src/gallium/drivers/r600/r600_state_common.c: In function ‘r600_init_common_state_functions’: ../../../../../mesa-src/src/gallium/drivers/r600/r600_state_common.c:2974:36: warning: assignment from incompatible pointer type [-Wincompatible-pointer-types] rctx->b.set_occlusion_query_state = r600_set_occlusion_query_state; Reviewed-by: Marek Olšák <[email protected]>
* r300: add missing case PIPE_SHADER_CAP_INT64_ATOMICS | Nicolai Hähnle | 2017-09-29 | 1 | -0/+1
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix border color translation for integer textures | Nicolai Hähnle | 2017-09-29 | 3 | -29/+60
| | | | | | | | | | This fixes the extremely unlikely case that an application uses 0x80000000 or 0x3f800000 as border color for an integer texture and helps in the also, but perhaps slightly less, unlikely case that 1 is used as a border color. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: clamp border colors for upgraded depth textures | Nicolai Hähnle | 2017-09-29 | 1 | -59/+60
| | | | | | | | | | | | | The hardware does this automatically for unorm formats, but we need to do it manually for unorm depth formats that have been upgraded to Z32_FLOAT. Fixes dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth and others. Fixes: d4d9ec55c589 ("radeonsi: implement TC-compatible HTILE") Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: clamp depth comparison value only for fixed point formats | Nicolai Hähnle | 2017-09-29 | 6 | -14/+53
| | | | | | | | | | | | | | | | | | | The hardware usually does this automatically. However, we upgrade depth to Z32_FLOAT to enable TC-compatible HTILE, which means the hardware no longer clamps the comparison value for us. The only way to tell in the shader whether a clamp is required seems to be to communicate an additional bit in the descriptor table. While VI has some unused bits in the resource descriptor, those bits have unfortunately all been used in gfx9. So we use an unused bit in the sampler state instead. Fixes dEQP-GLES3.functional.texture.shadow.2d.linear.equal_depth_component32f and many other tests in dEQP-GLES3.functional.texture.shadow.* Fixes: d4d9ec55c589 ("radeonsi: implement TC-compatible HTILE") Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi/gfx9: fix geometry shaders without output vertices | Nicolai Hähnle | 2017-09-29 | 1 | -3/+5
| | | | | | | | | | | Not that those are super common or useful, but hey! Fun corner cases of the API... Fixes dEQP-GLES31.functional.geometry_shading.emit.* Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: move descriptor logs to after corresponding draw/compute packet | Nicolai Hähnle | 2017-09-29 | 2 | -8/+6
| | | | | | | | It has to happen after descriptor uploads since otherwise we'll print out the wrong GPU list / incorrectly claim descriptor corruption. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* amd/common: remove ac_shader_abi::chip_class | Nicolai Hähnle | 2017-09-29 | 1 | -2/+0
| | | | | | | Redundant with the recently added ac_llvm_context::chip_class. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: fix a comment | Nicolai Hähnle | 2017-09-29 | 1 | -1/+1
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* svga: add missing PIPE_SHADER_CAP_INT64_ATOMICS switch cases | Brian Paul | 2017-09-28 | 1 | -0/+2
| | | | | | Silences a compiler warning. Reviewed-by: Roland Scheidegger <[email protected]>
* svga: trivial whitespace clean-ups in svga_screen.c | Brian Paul | 2017-09-28 | 1 | -11/+13
|
* gallium/util: use new util_vasprintf() function | Brian Paul | 2017-09-28 | 1 | -1/+2
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* svga: start advertising PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION | Neha Bhende | 2017-09-28 | 1 | -1/+3
| | | | | | | | | Since our driver supports arb_provoking_vertex, we can start advertising PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION. Fixes the ./clipflat and ./arb-provoking-vertex-render piglit tests. Tested with piglit and glretrace on Hw 11 and Hw 13. Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* etnaviv: optimize RS transfers | Lucas Stach | 2017-09-28 | 1 | -4/+25
| | | | | | | | | | | | | Currently we are blitting the whole resource when the RS is used to de-/tile a resource. This can be very inefficient for large resources where the transfer is only changing a small part of the resource (happens a lot with glTexSubImage2D). Optimize this by only blitting the tile aligned subregion of the resource, which the transfer is going to change. Signed-off-by: Lucas Stach <[email protected]> Reviewed-By: Wladimir J. van der Laan <[email protected]>
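The idea of blitting only the tile-aligned subregion can be sketched as follows: expand the transfer box outward to the nearest tile boundaries and blit that box instead of the whole resource. This is an illustration under assumptions, not the etnaviv implementation; `struct box2d`, `tile_align_box`, and the tile dimensions are hypothetical names and parameters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 2D region in pixels. */
struct box2d {
    uint32_t x, y, width, height;
};

/* Expand box to the smallest enclosing region whose edges fall on tile
 * boundaries: origin rounded down, far edges rounded up. */
static struct box2d tile_align_box(struct box2d box, uint32_t tile_w, uint32_t tile_h)
{
    assert(tile_w != 0 && tile_h != 0);

    uint32_t x0 = box.x - box.x % tile_w;
    uint32_t y0 = box.y - box.y % tile_h;
    uint32_t x1 = box.x + box.width;
    uint32_t y1 = box.y + box.height;

    x1 += (tile_w - x1 % tile_w) % tile_w;
    y1 += (tile_h - y1 % tile_h) % tile_h;

    struct box2d out = { x0, y0, x1 - x0, y1 - y0 };
    return out;
}
```

For example, with assumed 4x4 tiles a transfer box at (5, 9) of size 10x3 expands to a box at (4, 8) of size 12x4, which is far smaller than a full-resource blit for a large texture.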
* etnaviv: add resource subregion copy | Lucas Stach | 2017-09-28 | 2 | -0/+32
| | | | | | | | This is useful if we only need to copy part of a larger resource, mostly when using the RS engine to de-/tile on pipe transfers. Signed-off-by: Lucas Stach <[email protected]> Reviewed-By: Wladimir J. van der Laan <[email protected]>
* etnaviv: support tile aligned RS blits | Lucas Stach | 2017-09-28 | 1 | -8/+78
| | | | | | The RS can blit arbitrary tile aligned subregions of a resource by adjusting the buffer offset. Signed-off-by: Lucas Stach <[email protected]> Reviewed-By: Wladimir J. van der Laan <[email protected]>
* st/va: use pipe transfer_map to map upload buffer | Leo Liu | 2017-09-28 | 1 | -3/+9
| | | | | | | | | | The function pipe_buffer_map() is only for linear pipe buffer, with height as 0, and it's not for any 2D textures. Signed-off-by: Leo Liu <[email protected]> Cc: [email protected] Cc: Mark Thompson <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/docs: add reference links for resource_create method | Gwan-gyeong Mun | 2017-09-28 | 1 | -2/+2
| | | | | | | It adds reference links for arguments usage and bind of resource_create(). Signed-off-by: Mun Gwan-gyeong <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/docs: fix a reference link for get_paramf | Gwan-gyeong Mun | 2017-09-28 | 1 | -1/+1
| | | | | | Previously, the get_paramf link pointed to the same target as get_param. This changes the reference link to PIPE_CAPF_*. Signed-off-by: Mun Gwan-gyeong <[email protected]> Reviewed-by: Marek Olšák <[email protected]>