path: root/src
Commit message | Author | Age | Files | Lines
* nir/algebraic: Simplify fsat of fsign | Ian Romanick | 2018-10-09 | 1 | -0/+1
| This allows us to not support fsign.sat in the Intel compiler backend,
| and that will simplify some later changes.
|
| No shader-db changes on any Intel platform.
|
| Signed-off-by: Ian Romanick <[email protected]>
| Reviewed-by: Thomas Helland <[email protected]>
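The simplification is possible because fsign(x) can only produce -1, 0, or 1, so saturating it to [0, 1] leaves 1 for positive inputs and 0 for everything else. The sketch below checks that numerically. The helper names are hypothetical stand-ins, not NIR's definitions, NaN is ignored, and the exact replacement expression used by the commit is not shown in this log.

```python
def fsign(x: float) -> float:
    """Sign of x as -1.0, 0.0, or 1.0 (NaN is not handled in this sketch)."""
    return float((x > 0.0) - (x < 0.0))


def fsat(x: float) -> float:
    """Saturate: clamp x to the range [0.0, 1.0]."""
    return min(max(x, 0.0), 1.0)


def fsat_of_fsign(x: float) -> float:
    """Assumed simplified form: 1.0 for positive x, otherwise 0.0."""
    return 1.0 if x > 0.0 else 0.0


# The two forms agree on negative, zero, tiny, and large inputs.
for x in (-2.5, -0.0, 0.0, 1e-30, 3.0):
    assert fsat(fsign(x)) == fsat_of_fsign(x)
```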
* nir/algebraic: sign(x)*x*x is abs(x)*x | Ian Romanick | 2018-10-09 | 1 | -0/+5
| shader-db results:
|
| All Gen7+ platforms had similar results. (Skylake shown)
| total instructions in shared programs: 15106023 -> 15105981 (<.01%)
| instructions in affected programs: 300 -> 258 (-14.00%)
| helped: 6
| HURT: 0
| helped stats (abs) min: 7 max: 7 x̄: 7.00 x̃: 7
| helped stats (rel) min: 14.00% max: 14.00% x̄: 14.00% x̃: 14.00%
| 95% mean confidence interval for instructions value: -7.00 -7.00
| 95% mean confidence interval for instructions %-change: -14.00% -14.00%
| Instructions are helped.
|
| total cycles in shared programs: 566050327 -> 566050075 (<.01%)
| cycles in affected programs: 2826 -> 2574 (-8.92%)
| helped: 6
| HURT: 0
| helped stats (abs) min: 40 max: 44 x̄: 42.00 x̃: 42
| helped stats (rel) min: 8.89% max: 8.94% x̄: 8.92% x̃: 8.92%
| 95% mean confidence interval for cycles value: -44.30 -39.70
| 95% mean confidence interval for cycles %-change: -8.95% -8.88%
| Cycles are helped.
|
| No changes on Gen6 or earlier.
|
| Signed-off-by: Ian Romanick <[email protected]>
| Reviewed-by: Thomas Helland <[email protected]>
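The identity in the title follows from sign(x)*x being abs(x), which saves one multiply. A small numeric check, using a hypothetical `fsign` helper rather than NIR's opcode and ignoring NaN and infinities, might look like this:

```python
def fsign(x: float) -> float:
    """Sign of x as -1.0, 0.0, or 1.0 (NaN is not handled in this sketch)."""
    return float((x > 0.0) - (x < 0.0))


def original(x: float) -> float:
    """sign(x) * x * x, evaluated left to right."""
    return fsign(x) * x * x


def simplified(x: float) -> float:
    """abs(x) * x, the replacement named in the commit title."""
    return abs(x) * x


# sign(x) * x is exactly abs(x), so both forms round identically.
for x in (-3.5, -1.0, 0.0, 0.25, 7.0):
    assert original(x) == simplified(x)
```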
* nir: Add helper functions to get the instruction that generated a nir_src | Ian Romanick | 2018-10-09 | 1 | -0/+23
| Signed-off-by: Ian Romanick <[email protected]>
| Reviewed-by: Thomas Helland <[email protected]>
* svga: change svga_destroy_shader_variant() to return void | Brian Paul | 2018-10-09 | 5 | -23/+6
| svga_destroy_shader_variant() itself flushes and retries the command if
| there's a failure. So no need for the callers to do it. Other callers
| of the function were already ignoring the return value.
|
| This also fixes a corner-case double-free reported by Coverity (and
| reported by Dave Airlie).
|
| Tested with various OpenGL apps.
|
| Reviewed-by: Charmaine Lee <[email protected]>
* meson: Don't build glsl compiler tests unless OpenGL is enabled | Dylan Baker | 2018-10-09 | 2 | -2/+2
| Since there are no other users of the glsl compiler.
|
| Reviewed-by: Eric Engestrom <[email protected]>
* meson: Only build gallium state tracker tests with shared_glapi | Dylan Baker | 2018-10-09 | 1 | -1/+1
| This has always been a requirement, it's just somehow been missed in the
| meson build.
|
| Reviewed-by: Eric Engestrom <[email protected]>
* meson: only build clapi tests when OpenGL is being built | Dylan Baker | 2018-10-09 | 2 | -2/+2
| Otherwise building just vulkan (among other things) will build these
| tests, pull in a bunch of stuff they shouldn't, and potentially fail to
| compile.
|
| Reviewed-by: Eric Engestrom <[email protected]>
* nvc0: fix blitting red to srgb8_alpha | Ilia Mirkin | 2018-10-09 | 1 | -0/+4
| For some reason the 2d engine can't handle this. Red formats get special
| treatment there, so perhaps related.
|
| Fixes dEQP-GLES3 tests of the form:
| dEQP-GLES3.functional.fbo.blit.conversion.r{8,16f,32f}_to_srgb8_alpha8
|
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Karol Herbst <[email protected]>
| Cc: [email protected]
* nv50,nvc0: guard against zero-size blits | Ilia Mirkin | 2018-10-09 | 2 | -0/+14
| The current state tracker can generate these sometimes. Fixing this is
| more involved, and due to some integer math we can generate
| divisions-by-zero.
|
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Karol Herbst <[email protected]>
| Cc: [email protected]
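The guard described above amounts to rejecting an empty blit before any arithmetic divides by one of its dimensions. The following is only a sketch of that pattern: `blit_scale` and its parameters are hypothetical, and the checks the driver actually performs are not shown in this log.

```python
from typing import Optional, Tuple


def blit_scale(src_w: int, src_h: int,
               dst_w: int, dst_h: int) -> Optional[Tuple[float, float]]:
    """Return per-axis scale factors, or None when the blit is empty."""
    # Guard first: a zero-size destination would divide by zero below,
    # and a zero-size source has nothing to copy.
    if src_w == 0 or src_h == 0 or dst_w == 0 or dst_h == 0:
        return None
    return (src_w / dst_w, src_h / dst_h)
```

A caller would treat `None` as "skip the blit" rather than as an error.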
* nv50,nvc0: mark RGBX_UINT formats as renderable | Ilia Mirkin | 2018-10-09 | 1 | -4/+4
| This helps st/mesa avoid some (apparently) buggy fallbacks. Specifically
| the CopyTexSubImage fallback tries to read texture A as RGBA_FLOAT and
| write back that data into the target format, which fails for integer
| formats which have no appropriate logic to do the conversion.
|
| Since integer formats don't blend, there's no harm in the fact that the
| "A" component gets written anyways.
|
| Fixes, among others:
| https://www.khronos.org/registry/webgl/sdk/tests/conformance2/textures/canvas/tex-2d-rgb8ui-rgb_integer-unsigned_byte.html
|
| Signed-off-by: Ilia Mirkin <[email protected]>
| Cc: [email protected]
* radv: add missing meson c++ visibility arguments | Eric Engestrom | 2018-10-09 | 1 | -0/+1
| Fixes: 6f3aee40f90d725653b6 "radv: using tls to store llvm related info and speed up compiles (v10)"
| Cc: Dave Airlie <[email protected]>
| Signed-off-by: Eric Engestrom <[email protected]>
| Reviewed-by: Dylan Baker <[email protected]>
* gbm: Add GBM_FORMAT_ARGB1555 support | Michel Dänzer | 2018-10-09 | 1 | -0/+4
| Reviewed-by: Marek Olšák <[email protected]>
* st/dri: Handle BGRA5551 format | Michel Dänzer | 2018-10-09 | 1 | -0/+13
| Reviewed-by: Marek Olšák <[email protected]>
* freedreno/a5xx+a6xx: fix LRZ pitch alignment | Rob Clark | 2018-10-08 | 1 | -1/+1
| Both RB_2D_DST_SIZE.PITCH (a6xx) and RB_MRT[n].PITCH (a5xx) need
| alignment to 64.
|
| Signed-off-by: Rob Clark <[email protected]>
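Aligning a pitch to 64 means rounding it up to the next multiple of 64. A minimal sketch of the usual power-of-two round-up is below; the `align` helper is hypothetical, and the unit of the pitch (pixels or bytes) is not stated in the commit message.

```python
def align(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    # The mask trick below only works for power-of-two alignments.
    assert alignment > 0 and alignment & (alignment - 1) == 0
    return (value + alignment - 1) & ~(alignment - 1)


# A pitch of 100 is rounded up to the next multiple of 64.
pitch = align(100, 64)
```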
* freedreno/a6xx: add LRZ support | Rob Clark | 2018-10-08 | 8 | -132/+104
| As with a5xx, hidden behind FD_MESA_DEBUG=lrz due to being paranoid
| about z-fighting issues with some games (in particular, this was
| observed with 0ad on a5xx.. but I think the proper solution to enable
| this by default is to figure out how to do driver specific driconf
| options).
|
| Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers | Rob Clark | 2018-10-08 | 7 | -38/+120
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: add helper for various CP_EVENT_WRITE | Rob Clark | 2018-10-08 | 5 | -38/+30
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: remove unused fxns | Rob Clark | 2018-10-08 | 2 | -19/+0
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: remove fd6_shader_stateobj | Rob Clark | 2018-10-08 | 3 | -23/+10
| Earlier gens already got this cleanup, but a6xx was still off on a
| branch then.
|
| Signed-off-by: Rob Clark <[email protected]>
* glsl: fix array assignments of a swizzled vector | Ilia Mirkin | 2018-10-08 | 1 | -3/+10
| This happens in situations where we might do
|
|   vec.wzyx[i] = ...
|
| The swizzle would get effectively ignored because of the interaction
| between how ir_assignment->set_lhs works and overwriting the write_mask.
|
| There are two cases, one where i is a constant, and another where i is
| variable. We have to be extra-careful in both cases.
|
| Fixes the following WebGL test:
| https://www.khronos.org/registry/webgl/sdk/tests/conformance2/glsl3/vector-dynamic-indexing-swizzled-lvalue.html
|
| And the new piglit tests:
| swizzled-writemask-indexing-nonconst.shader_test
| swizzled-writemask-indexing.shader_test
|
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Cc: [email protected]
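To make the expected semantics concrete: indexing a swizzled lvalue picks an element of the swizzle, and that element names the component of the underlying vector to write. The sketch below models this on a plain list. `assign_swizzled` is a hypothetical helper for illustration and says nothing about how the GLSL IR implements the fix.

```python
COMPONENTS = "xyzw"


def assign_swizzled(vec: list, swizzle: str, i: int, value: float) -> None:
    """Perform vec.<swizzle>[i] = value on a 4-component vector in place."""
    # Element i of the swizzled view names a component of the original
    # vector; ignoring the swizzle would write vec[i] instead.
    target = COMPONENTS.index(swizzle[i])
    vec[target] = value


v = [0.0, 0.0, 0.0, 0.0]
assign_swizzled(v, "wzyx", 0, 5.0)  # writes v.w, not v.x
```

With the swizzle ignored, the same statement would have written `v.x`, which is the bug the commit describes.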
* radv: tidy up radv_pipeline_init_multisample_state() | Samuel Pitoiset | 2018-10-08 | 1 | -19/+16
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: always set PA_SC_MODE_CNTL_1.OUT_OF_ORDER_WATER_MARK | Samuel Pitoiset | 2018-10-08 | 1 | -2/+2
| It probably has no effect without out of order rasterization anyway.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set DB_EQAA.INCOHERENT_EQAA_READS | Samuel Pitoiset | 2018-10-08 | 1 | -1/+1
| My attempt was to set this field instead of duplicating one.
|
| Fixes: 6cfa321c39 ("radv: add potential missing fields for DB_EQAA")
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* i965: fallback RGBX to RGBA in glEGLImageTargetRenderbufferStorageOES | Chystiakov, Dmytro | 2018-10-08 | 1 | -26/+37
| In the same fashion as is done for glEGLImageTextureTarget2D.
|
| v2: share the fallback which sets baseformat and internalformat
|     correctly which makes both of the tests pass (Tapani)
|
| Fixes android.hardware.nativehardware.cts.AHardwareBufferNativeTests:
|  #SingleLayer_ColorTest_GpuColorOutputCpuRead_R8G8B8X8_UNORM
|  #SingleLayer_ColorTest_GpuColorOutputIsRenderable_R8G8B8X8_UNORM
|
| Signed-off-by: Tapani Pälli <[email protected]>
| Reviewed-by: Gurchetan Singh <[email protected]>
* glsl: do not attempt assignment if operand type not parsed correctly | Tapani Pälli | 2018-10-08 | 1 | -0/+6
| v2: check types of both operands (Ian)
|
| Cc: [email protected]
| Signed-off-by: Tapani Pälli <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108012
* util/u_queue: add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY | Marek Olšák | 2018-10-06 | 4 | -2/+20
| Initial version discussed with Rob Clark under a different patch name.
| This approach leaves his driver unaffected.
* radeonsi: fix a typo at CS_PARTIAL_FLUSH | Marek Olšák | 2018-10-06 | 1 | -1/+1
| harmless
* ac: add ac_build_round | Marek Olšák | 2018-10-06 | 4 | -6/+20
|
* ac: correct PKT3_COPY_DATA definitions | Marek Olšák | 2018-10-06 | 7 | -15/+22
|
* ac: simplify LLVM alloca helpers | Marek Olšák | 2018-10-06 | 1 | -7/+4
|
* ac: define all address spaces properly | Marek Olšák | 2018-10-06 | 5 | -14/+16
|
* gallivm: Make it possible to disable some optimization shortcuts in release builds | Gert Wollny | 2018-10-06 | 4 | -21/+32
| For testing it is of interest that all tests of dEQP pass, e.g. to test
| virglrenderer on a host only providing software rendering like in a CI.
| Hence make it possible to disable certain optimizations that make tests fail.
|
| While we are there also add some documentation to the flags to make it clear
| that this is opt-out.
|
| Setting the environment variable "GALLIVM_PERF=no_filter_hacks" can be used
| to make the following tests pass in release mode:
|
|   dEQP-GLES2.functional.texture.mipmap.2d.affine.*_linear_*
|   dEQP-GLES2.functional.texture.mipmap.cube.generate.*
|   dEQP-GLES2.functional.texture.vertex.2d.filtering.*_mipmap_linear_*
|   dEQP-GLES2.functional.texture.vertex.2d.wrap.*
|
| Related: https://bugs.freedesktop.org/show_bug.cgi?id=94957
|
| v2: rename optimization disabling flag to 'safemath' and also move the
|     nopt flag to the perf flags.
|
| v3: rename flag "safemath" to "no_filter_hacks" since safemath is usually
|     associated with floating point operations (Roland)
|
| Signed-off-by: Gert Wollny <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* virgl: Pass resource size and transfer offsets | Tomeu Vizoso | 2018-10-06 | 4 | -28/+208
| Pass the size of a resource when creating it so a backing can be kept in
| the other side. Also pass the required offset to transfer commands.
|
| This moves vtest closer to how virtio-gpu works, making it more useful
| for testing.
|
| v2: - Use new messages for creation and transfers, as changing the
|       behavior of the existing messages would be messy given that we
|       don't want to break compatibility with older servers.
|
| v3: - Use correct strides: The resource corresponding to the output
|       display might have a different line stride than the IOVs, so when
|       reading back to this resource take the resource stride and the
|       IOV stride into account.
|
| v4: Fix transfer size calculation (Andrey Simiklit)
|
| v5: Add comment about transfer size value in the PUT command (Gurchetan).
|
| Add a comment about the size correction for transfers for reading and
| writing the resource. Fixing this by correctly evaluating the size
| upfront will need some work also on the virglrenderer side.
|
| Signed-off-by: Tomeu Vizoso <[email protected]> (v2)
| Signed-off-by: Gert Wollny <[email protected]>
| Reviewed-by: Gurchetan Singh <[email protected]>
* virgl, vtest: Correct the transfer size calculation | Gert Wollny | 2018-10-06 | 1 | -1/+3
| The transfer size used in virglrenderer refers to uint32_t, so one must
| add 3 and then divide by 4 instead of adding 3/4 which is a no-op with
| integers.
|
| Fixes: b3b82fe8ea virgl/vtest: add vtest driver
| Signed-off-by: Gert Wollny <[email protected]>
| Reviewed-by: Gurchetan Singh <[email protected]>
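The arithmetic slip described above is easy to reproduce: with integer division, `3 / 4` is 0, so adding it changes nothing, whereas adding 3 before dividing rounds a byte count up to whole 32-bit words. A small sketch follows. The function names are hypothetical, and the buggy form is reconstructed from the commit message rather than copied from the driver.

```python
def dwords_buggy(size_bytes: int) -> int:
    """Form described as wrong: 3 // 4 evaluates to 0, so nothing is added."""
    return size_bytes + 3 // 4


def dwords_for(size_bytes: int) -> int:
    """Number of 32-bit words needed to hold size_bytes, rounded up."""
    return (size_bytes + 3) // 4


# 5 bytes need 2 words; the buggy form just returns the byte count.
words = dwords_for(5)
unchanged = dwords_buggy(5)
```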
* util: Make xmlconfig.c build on Solaris without d_type in dirent (v2) | Alan Coopersmith | 2018-10-05 | 1 | -0/+8
| v2: check for lstat() failing
|
| Fixes: 04bdbbcab3c "xmlconfig: read more config files from drirc.d/"
| Signed-off-by: Alan Coopersmith <[email protected]>
| Reviewed-by: Roland Mainz <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* radeonsi: optimizing SET_CONTEXT_REG for shaders vgt_vertex_reuse | Sonny Jiang | 2018-10-05 | 4 | -2/+18
| Signed-off-by: Sonny Jiang <[email protected]>
| Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: optimizing SET_CONTEXT_REG for shaders Tessellation | Sonny Jiang | 2018-10-05 | 4 | -5/+26
| Signed-off-by: Sonny Jiang <[email protected]>
| Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: optimizing SET_CONTEXT_REG for shaders PS | Sonny Jiang | 2018-10-05 | 3 | -14/+60
| Signed-off-by: Sonny Jiang <[email protected]>
| Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: optimizing SET_CONTEXT_REG for shaders VS | Sonny Jiang | 2018-10-05 | 3 | -33/+77
| Signed-off-by: Sonny Jiang <[email protected]>
| Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: optimizing SET_CONTEXT_REG for shaders GS | Sonny Jiang | 2018-10-05 | 4 | -24/+154
| Signed-off-by: Sonny Jiang <[email protected]>
| Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: optimize and allow reg > 31 in radeon_opt_set_context_reg functions | Marek Olšák | 2018-10-05 | 1 | -22/+12
| reg_saved will have 64 bits, and (1 << reg) where reg > 31 has undefined
| behavior. (1ull << reg) would be correct for 64 bits. This commit shifts
| the other way in order to merge the conditions.
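"Shifting the other way" can be read as moving the saved-register mask right and testing its low bit, instead of shifting a constant 1 left by `reg`. The sketch below shows only that logic. Python integers are unbounded, so it cannot reproduce the C undefined behavior, and `is_saved` is a hypothetical name, not a radeonsi function.

```python
def is_saved(saved_mask: int, reg: int) -> bool:
    """Test bit `reg` by shifting the mask right rather than shifting 1 left."""
    # In C, (1 << reg) with a 32-bit int is undefined for reg > 31;
    # shifting the 64-bit mask right avoids building that constant at all.
    return ((saved_mask >> reg) & 1) == 1


# Bit 40 lies beyond the range a 32-bit (1 << reg) could express.
mask = 1 << 40
high_bit_set = is_saved(mask, 40)
```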
* radeonsi: optimizing SET_CONTEXT_REG for shaders ES | Sonny Jiang | 2018-10-05 | 5 | -10/+37
| Signed-off-by: Sonny Jiang <[email protected]>
| Signed-off-by: Marek Olšák <[email protected]>
* spirv: mark variables decorated with XfbBuffer as always active | Samuel Pitoiset | 2018-10-05 | 1 | -0/+1
| Otherwise, they are removed during NIR linking or in some lowering
| passes.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
* nir/alu_to_scalar: Use ssa_for_alu_src in hand-rolled expansions | Jason Ekstrand | 2018-10-04 | 1 | -15/+18
| The ssa_for_alu_src helper will correctly handle swizzles and other
| source modifiers for you. The expansions for unpack_half_2x16,
| pack_uvec2_to_uint, and pack_uvec4_to_uint were all broken with regards
| to swizzles. The brokenness of unpack_half_2x16 was causing rendering
| errors in Rise of the Tomb Raider on Intel ever since c11833ab24dcba26
| which added an extra copy propagation to the optimization pipeline and
| caused us to start seeing swizzles where we hadn't seen any before.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107926
| Fixes: 9ce901058f3d "nir: Add lowering of nir_op_unpack_half_2x16."
| Fixes: 9b8786eba955 "nir: Add lowering support for packing opcodes."
| Tested-by: Alex Smith <[email protected]>
| Tested-by: Józef Kucia <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* glsl/linker: Check the subroutine associated functions names | Vadym Shovkoplias | 2018-10-04 | 1 | -0/+40
| From Section 6.1.2 (Subroutines) of the GLSL 4.00 specification
|
|     "A program will fail to compile or link if any shader
|      or stage contains two or more functions with the same
|      name if the name is associated with a subroutine type."
|
| v2:
|  - error out earlier (Tapani)
|  - style fixes (Iago)
|
| Fixes:
|  * no-overloads.vert
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108109
| Signed-off-by: Vadym Shovkoplias <[email protected]>
| Reviewed-by: Iago Toral Quiroga <[email protected]>
| Reviewed-by: Tapani Pälli <[email protected]>
* virgl: Negotiate version with vtest server | Tomeu Vizoso | 2018-10-04 | 3 | -0/+64
| Check if server supports version negotiation by sending a
| PING_PROTOCOL_VERSION message right before a dummy RESOURCE_BUSY_WAIT.
| If we don't get a reply for the first, we know the server doesn't
| support it.
|
| If it does support it, we can query the max protocol version supported
| by the server and fall back if needed.
|
| v2: - Send a new message to negotiate the protocol version, checking if
|       the server supports this message by immediately sending a busy
|       wait message. (Dave Airlie)
|
| v3: - Send a zero-arg command PING_PROTOCOL_VERSION so we actually keep
|       compatibility with older servers. (Code by Dave Airlie)
|
| Signed-off-by: Tomeu Vizoso <[email protected]>
| Reviewed-by: Gurchetan Singh <[email protected]>
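The probing scheme in this message can be sketched with an in-memory stand-in for the server. Everything below is an illustrative assumption except the two message names taken from the commit text: `FakeServer`, the `PROTOCOL_VERSION` query name, the client's maximum version of 2, and the use of 0 for "no negotiation" are all hypothetical, and the real wire format is not shown in this log.

```python
CLIENT_MAX_VERSION = 2  # hypothetical value for this sketch


class FakeServer:
    """Stand-in for a vtest server; max_version=None models an old server."""

    def __init__(self, max_version=None):
        self.max_version = max_version

    def handle(self, command: str):
        """Return the reply to a command, or None when no reply is sent."""
        if command == "RESOURCE_BUSY_WAIT":
            return "BUSY_WAIT_REPLY"  # every server answers this one
        if self.max_version is None:
            return None  # assumed: old servers stay silent on unknown commands
        if command == "PING_PROTOCOL_VERSION":
            return "PING_REPLY"
        if command == "PROTOCOL_VERSION":  # hypothetical query name
            return self.max_version
        return None


def negotiate(server: FakeServer) -> int:
    """Return the protocol version both sides can use (0 = no negotiation)."""
    # Send the ping, then a dummy busy wait that is always answered, so the
    # client never blocks waiting on a server that ignores the ping.
    ping_reply = server.handle("PING_PROTOCOL_VERSION")
    server.handle("RESOURCE_BUSY_WAIT")
    if ping_reply is None:
        return 0  # only the busy wait came back: old server
    # The server understood the ping, so query its maximum version and
    # fall back to the highest version both sides support.
    return min(CLIENT_MAX_VERSION, server.handle("PROTOCOL_VERSION"))
```

The dummy busy wait is what lets the client distinguish "no reply to the ping" from "reply not yet arrived": the first answer it reads is either the ping reply or the busy-wait reply.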
* intel: aubinator: Fix memory leaks | Sagar Ghuge | 2018-10-04 | 1 | -0/+25
| Signed-off-by: Sagar Ghuge <[email protected]>
| Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: construct correct xml filename | Sagar Ghuge | 2018-10-04 | 1 | -8/+7
| Construct the correct gen xml filename when we try to load the hardware
| xml description from a given path.
|
| v2: remove temporary variable (Francesco Ansanelli)
|
| Signed-off-by: Sagar Ghuge <[email protected]>
| Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: Avoid freeing invalid pointer | Sagar Ghuge | 2018-10-04 | 1 | -5/+13
| v2: Free ctx.spec if error while reading genxml (Lionel Landwerlin)
|
| v3: Handle case where genxml is empty (Lionel Landwerlin)
|
| Signed-off-by: Sagar Ghuge <[email protected]>
| Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: add gen_spec_init method | Sagar Ghuge | 2018-10-04 | 1 | -16/+35
| Initialize the gen_spec instance properly when loading the hardware xml
| description from a specific directory to avoid a segmentation fault.
|
| v2: correct function definition (Lionel Landwerlin)
|
| Signed-off-by: Sagar Ghuge <[email protected]>
| Reviewed-by: Lionel Landwerlin <[email protected]>