Commit message | Author | Age | Files | Lines
* docs: add news item and link release notes for 13.0.6/17.0.2 | Emil Velikov | 2017-03-20 | 2 | -0/+14
| | | | Signed-off-by: Emil Velikov <[email protected]>
* docs: add sha256 checksums for 17.0.2 | Emil Velikov | 2017-03-20 | 1 | -1/+2
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit 9b66351f5b274f3d79cb2c48afa3b2fcc2bf3442)
* docs: add release notes for 17.0.2 | Emil Velikov | 2017-03-20 | 1 | -0/+184
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit 373d88a7117150de984510453e1c30a455987686)
* docs: add sha256 checksums for 13.0.6 | Emil Velikov | 2017-03-20 | 1 | -1/+2
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit 879d24c49727cfc6c62cbd5bca58efad4c914e40)
* docs: add release notes for 13.0.6 | Emil Velikov | 2017-03-20 | 1 | -0/+286
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit fcef88d13a9ebdcadc6a878e9284c55651785301)
* anv/genX: Solve the vkCreateGraphicsPipelines crash | Xu,Randy | 2017-03-20 | 1 | -2/+2
| | | | | | | | | | | | The crash is due to NULL pColorBlendState, which is legal if the pipeline has rasterization disabled or if the subpass of the render pass the pipeline is created against does not use any color attachments. Test: Sample subpasses from LunarG can run without crash Signed-off-by: Xu,Randy <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Cc: "17.0 13.0" <[email protected]>
* radv: fix logic for when to flush on multiple CS emission | Dave Airlie | 2017-03-20 | 1 | -8/+8
| | | | | | | | | The current condition always evaluated to true, but we only want to flush on the first submit. Rename the variable to do_flush, and only emit on the first iteration. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* spirv: Implement IsInf using an integer comparison | Jason Ekstrand | 2017-03-20 | 1 | -1/+1
| | | | | | | | | | | Since we already do fabs on the one source, we're guaranteed to get positive infinity if we get any infinity at all. Since +inf only has one IEEE 754 representation, we can use an integer comparison and avoid all of the ordered/unordered issues. Cc: Dave Airlie <[email protected]> Reviewed-by: Elie Tournier <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
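The reasoning in this message can be illustrated outside the compiler. The following is a minimal Python sketch, not the SPIR-V/NIR code from the commit: it takes the absolute value first and then compares the raw 32-bit pattern against the single encoding of +inf. The helper names `float_bits` and `is_inf` are hypothetical and exist only for this illustration.

```python
import struct

# The one and only IEEE 754 binary32 bit pattern for +infinity.
POS_INF_BITS = 0x7F800000


def float_bits(x: float) -> int:
    """Return the 32-bit pattern of x stored as an IEEE 754 binary32 value."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def is_inf(x: float) -> bool:
    """Hypothetical helper: integer comparison after taking the absolute value.

    abs() clears the sign bit, so -inf becomes +inf. NaN patterns have a
    non-zero mantissa and therefore never equal POS_INF_BITS, which is why
    no ordered/unordered float comparison is needed.
    """
    return float_bits(abs(x)) == POS_INF_BITS
```

Because the comparison is on integers, NaN inputs need no special handling: they simply fail the equality test.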
* radv/meta: fix image clears for r4g4 format. | Dave Airlie | 2017-03-20 | 1 | -0/+8
| | | | | | | This just uses an 8-bit clear and packs the values. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* Revert "radv: fallback to an in-memory cache when no pipline cache is provided" | Dave Airlie | 2017-03-20 | 3 | -13/+6
| | | | | | | | | | | This reverts commit 2845a108a9a8bd4b0e6e9b590c976452fb99eb10. This breaks VK-GL-CTS randomly. ./deqp-vk --deqp-case=dEQP-VK.texture.filtering.3d.formats.r4g4b4a4* bounces around here from 6/6 to 3/6 or 4/6 to hanging. Signed-off-by: Dave Airlie <[email protected]>
* mesa: disable glthread when glNewList() is called | Timothy Arceri | 2017-03-20 | 1 | -1/+1
| | | | | | | glNewList() swaps dispatch tables, and we don't have anything in place to handle that in glthread. Tested-by: Michel Dänzer <[email protected]>
* radv: fix primitive reset index emission | Dave Airlie | 2017-03-20 | 1 | -1/+1
| | | | | | | | | | This was meant to be checking the index type to get the correct index not the last emitted one. This fixes: dEQP-VK.pipeline.input_assembly.primitive_restart.index_type_uint32.triangle_strip_with_adjacency Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: "13.0 17.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* util/disk_cache: check rename result | Grazvydas Ignotas | 2017-03-20 | 1 | -2/+6
| | | | | | | | | I haven't seen this causing problems in practice, but for correctness we should also check if rename succeeded to avoid breaking accounting and leaving a .tmp file behind. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* util/disk_cache: delete .tmp if target exists | Grazvydas Ignotas | 2017-03-20 | 1 | -1/+3
| | | | | | | | | | | At the time of the target file check, the .tmp file is already created and the file lock is held, so we should remove the .tmp, like in other error paths. With this, piglit no longer leaves a large number of empty .tmp files behind, which waste directory entries and may interfere with eviction. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
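The two disk_cache fixes above share one idea: every path out of the "publish the .tmp file" step must either rename the file or delete it. Here is a minimal Python sketch of that pattern, not the Mesa C code. The function name `publish`, the file names, and the `demo` helper are hypothetical, and the sketch assumes the caller already holds whatever lock protects the entry, as the commit message describes.

```python
import os
import tempfile


def publish(tmp_path: str, final_path: str) -> bool:
    """Move tmp_path into place as final_path and never leave tmp_path behind.

    Returns True only if this call created final_path.
    Assumption: the caller holds the lock, so the existence check is not racy.
    """
    if os.path.exists(final_path):
        # The target is already there: drop our temporary copy.
        os.unlink(tmp_path)
        return False
    try:
        os.rename(tmp_path, final_path)
    except OSError:
        # The rename failed: clean up so size accounting is not updated
        # for a file that was never published.
        os.unlink(tmp_path)
        return False
    return True


def demo():
    """Publish the same entry twice and report what is left on disk."""
    with tempfile.TemporaryDirectory() as d:
        tmp = os.path.join(d, "entry.tmp")
        final = os.path.join(d, "entry")
        results = []
        for _ in range(2):
            with open(tmp, "wb") as f:
                f.write(b"payload")
            results.append(publish(tmp, final))
        return results, os.path.exists(tmp), os.path.exists(final)
```

In Python a failed rename raises `OSError`, whereas the C code checks a return value. The cleanup obligation is the same in both cases.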
* util/disk_cache: fix stored_keys index | Grazvydas Ignotas | 2017-03-20 | 1 | -2/+2
| It seems there is a bug because:
| - 20 bytes are compared, but only 1 byte stored_keys step is used
| - entries can overlap each other by 19 bytes
| - index_mmap is ~1.3M in size, but only first 64K is used
|
| With this fix for Deus Ex:
| - startup time (from launch to Feral logo): ~38s -> ~16s
| - disk_cache_has_key() hit rate: ~50% -> ~96%
|
| Signed-off-by: Grazvydas Ignotas <[email protected]>
| Reviewed-by: Timothy Arceri <[email protected]>
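The effect of the wrong stride can be reproduced with a toy index. The sketch below is in Python and is only a model of the bug described above, not the disk_cache code. The 20-byte key size comes from the commit message, while the slot count (65536, which gives a 1,310,720-byte table and so matches the "~1.3M" figure) and the way a slot is derived from the key are assumptions made for the illustration.

```python
KEY_SIZE = 20            # bytes compared per key, per the commit message
INDEX_ENTRIES = 0x10000  # assumed slot count: 65536 * 20 bytes is about 1.3M


def slot_of(key: bytes) -> int:
    """Hypothetical slot selection: low bits of the first 4 key bytes."""
    return int.from_bytes(key[:4], "little") % INDEX_ENTRIES


def offset_buggy(key: bytes) -> int:
    # A 1-byte step: neighbouring slots overlap by 19 bytes and only the
    # first ~64K of the table is ever touched.
    return slot_of(key) * 1


def offset_fixed(key: bytes) -> int:
    # A KEY_SIZE step: every slot owns its own 20 bytes.
    return slot_of(key) * KEY_SIZE


def put(table: bytearray, key: bytes, offset_fn) -> None:
    off = offset_fn(key)
    table[off:off + KEY_SIZE] = key


def has(table: bytearray, key: bytes, offset_fn) -> bool:
    off = offset_fn(key)
    return bytes(table[off:off + KEY_SIZE]) == key


def new_table() -> bytearray:
    return bytearray(INDEX_ENTRIES * KEY_SIZE)


# Two keys that land in adjacent slots (0 and 1).
KEY_A = (0).to_bytes(4, "little") + b"A" * 16
KEY_B = (1).to_bytes(4, "little") + b"B" * 16
```

With the 1-byte step, storing `KEY_B` clobbers most of `KEY_A`, so a later lookup of `KEY_A` misses. That is the kind of spurious miss that would depress the `disk_cache_has_key()` hit rate reported in the message.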
* nv30: create uploader after pipe->screen is set | Ilia Mirkin | 2017-03-19 | 1 | -6/+6
| | | | | | Fixes crashes after recent upload rework. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: enable TEX_LZ and TXF_LZ | Ilia Mirkin | 2017-03-18 | 3 | -4/+17
| | | | | | | | | There should be minimal gain, if any, for nvc0, but nv50 may end up noticing more often that the lod argument is uniform. This, in turn, will remove the need for some unnecessary transformations, which were being hit due to the checks being done pre-ssa. Signed-off-by: Ilia Mirkin <[email protected]>
* st/mesa: set result writemask based on ir type | Ilia Mirkin | 2017-03-18 | 1 | -0/+1
| | | | | | | | | | This prevents textureQueryLevels, which maps as LODQ, from ending up with a xyzw writemask, which is illegal. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100061 Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>
* nvc0/ir: treat FMA like MAD for operand propagation | Karol Herbst | 2017-03-18 | 1 | -0/+1
| Helps mainly Feral-ported games, due to their use of fma()
|
| shader-db changes:
| total instructions in shared programs : 3901147 -> 3842505 (-1.50%)
| total gprs used in shared programs    : 471258 -> 467359 (-0.83%)
| total local used in shared programs   : 27405 -> 27361 (-0.16%)
| total bytes used in shared programs   : 35749888 -> 35214176 (-1.50%)
|
|          local    gpr   inst  bytes
| helped      17   1829   4091   4091
| hurt         4     44      3      3
|
| Signed-off-by: Karol Herbst <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
| Cc: [email protected]
* util/disk_cache: pass predicate functions file stats directly (v4) | Alan Swanson | 2017-03-18 | 1 | -34/+21
| Since switching to LRU eviction, the only user of these predicate
| functions resolves the directory entry stats itself, so pass them
| directly. This saves calling fstat and strlen twice (and the expensive
| strlen is skipped entirely if the access time is newer).
|
| v2: Update for empty cache dir detection changes
| v3: Fix passing string length to predicate with the +1 for NUL
|     termination and also pass sb as pointer
| v4: Missed ampersand for passing sb as pointer
|
| Reviewed-by: Grazvydas Ignotas <[email protected]>
| Acked-by: Timothy Arceri <[email protected]>
* glsl: use set for copy propagation kills | Timothy Arceri | 2017-03-18 | 1 | -37/+28
| | | | | | | | | | | | | | | | | Previously each time we saw a variable we just created a duplicate entry in the list. This is particularly bad for loops, where we add everything twice; throw nested loops into the mix and the list was growing exponentially. This stops the glsl-vs-unroll-explosion test, which has 16 nested loops, from reaching the test's mem usage limit in this pass. The test now hits the mem limit in opt_copy_propagation_elements() instead. I suspect this was also part of the reason this pass can be so slow with some shaders. Reviewed-by: Thomas Helland <[email protected]>
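The growth described in this message can be modelled in a few lines. The following Python sketch is a deliberately simplified model, not the GLSL pass itself: it assumes that each loop body is visited twice and that every visit records a kill for each variable it sees. All function names are hypothetical.

```python
def record_kills(depth, variables, kills, add):
    """Record a kill for every variable, then visit the nested loop twice."""
    for v in variables:
        add(kills, v)
    if depth > 0:
        # Assumption of this model: a loop body is processed twice.
        record_kills(depth - 1, variables, kills, add)
        record_kills(depth - 1, variables, kills, add)


def kills_with_list(depth, variables):
    """Old behaviour in this model: every sighting appends a duplicate."""
    kills = []
    record_kills(depth, variables, kills, lambda k, v: k.append(v))
    return kills


def kills_with_set(depth, variables):
    """New behaviour in this model: a set keeps one entry per variable."""
    kills = set()
    record_kills(depth, variables, kills, lambda k, v: k.add(v))
    return kills
```

In this model the list holds `len(variables) * (2 ** (depth + 1) - 1)` entries, so 16 nested loops multiply the entry count by 131071, while the set never grows past `len(variables)`.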
* st/dri: wait for thread to finish before unbinding context | Timothy Arceri | 2017-03-18 | 1 | -0/+3
| | | | | | | | | | Fixes a bunch of piglit crashes that hit an assert() when trying to delete the framebuffer. The assert() was triggered because WinSysDrawBuffer was set to NULL before glDeleteFramebuffers() was called. Tested-by: Michel Dänzer <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* glsl: don't leak memory when trying to count loop iterations | Timothy Arceri | 2017-03-18 | 1 | -2/+3
| | | | | | Suggested-by: Damian Dixon <[email protected]> Reviewed-by: Elie Tournier <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99789
* genxml: Make MI_STORE_DATA_IMM have a single 64-bit data field | Jason Ekstrand | 2017-03-17 | 6 | -12/+6
| | | | | | This is way more convenient than having two separate dword fields. Reviewed-By: Lionel Landwerlin <[email protected]>
* anv: Turn on inherited queries | Jason Ekstrand | 2017-03-17 | 1 | -1/+1
| | | | | | | It all just works since it's just a hardware register so we might as well turn it on. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Implement pipeline statistics queries | Ilia Mirkin | 2017-03-17 | 4 | -12/+226
| | | | | | | | In the end, pipeline statistics queries look a lot like occlusion queries only with between 1 and 11 begin/end pairs being generated instead of just the one. Reviewed-By: Lionel Landwerlin <[email protected]>
* anv: Disable VF statistics for blorp and SOL memcpy | Jason Ekstrand | 2017-03-17 | 4 | -3/+18
| In order to get accurate statistics, we need to disable statistics for
| blits, clears, and the surface state memcpy at the top of each secondary
| command buffer. There are two possible approaches to this:
|
| 1) Disable before the blit/memcpy and re-enable afterwards
|
| 2) Move emitting 3DSTATE_VF_STATISTICS from initialization and make it
|    part of pipeline state and then just disable statistics before blits
|    and memcpy operations.
|
| Emitting 3DSTATE_VF_STATISTICS should be fairly cheap so it doesn't
| really matter which path we take. We choose the second option as it's
| more consistent with the way the rest of the statistics are enabled and
| disabled.
|
| Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/pipeline: Enable clipper statistics | Jason Ekstrand | 2017-03-17 | 1 | -0/+1
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* genxml: s/Clipper Statistics Enable/Statistics Enable/ | Jason Ekstrand | 2017-03-17 | 5 | -5/+5
| | | | | | | It's in 3DSTATE_CLIP, so it doesn't really need the extra detail. This matches what we do for VS, FS, etc. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/query: Rework store_query_result | Jason Ekstrand | 2017-03-17 | 1 | -15/+24
| | | | | | | | The new version is a nice GPU parallel to cpu_write_query_result and it nicely handles things like dealing with 32 vs. 64-bit offsets in the destination buffer. Reviewed-By: Lionel Landwerlin <[email protected]>
* anv/query: Break GPU query calculation into a helper | Jason Ekstrand | 2017-03-17 | 1 | -12/+18
| | | | Reviewed-By: Lionel Landwerlin <[email protected]>
* genxml: Add pipeline statistics registers on gen7+ | Jason Ekstrand | 2017-03-17 | 4 | -0/+176
| | | | Reviewed-By: Lionel Landwerlin <[email protected]>
* anv/query: Add a helper for writing a query pool result | Jason Ekstrand | 2017-03-17 | 1 | -16/+17
| | | | Reviewed-By: Lionel Landwerlin <[email protected]>
* anv/query: Use a variable-length slot size | Jason Ekstrand | 2017-03-17 | 2 | -28/+33
| | | | | | | | Not all queries are the same. Even the two queries we support today require a different amount of data per slot. Once we introduce pipeline statistics queries, the size will vary wildly. Reviewed-By: Lionel Landwerlin <[email protected]>
* anv/query: Move the available bits to the front | Jason Ekstrand | 2017-03-17 | 2 | -28/+19
| | | | | | | | We're about to make slots variable-length and always having the available bits at the front makes certain operations substantially easier once we do that. Reviewed-By: Lionel Landwerlin <[email protected]>
* anv/query: Let 32-bit values wrap | Jason Ekstrand | 2017-03-17 | 1 | -2/+0
| | | | | | | | | | | | | | | From the Vulkan 1.0.39 Specification: "If VK_QUERY_RESULT_64_BIT is not set and the result overflows a 32-bit value, the value may either wrap or saturate." So we can either clamp or wrap. Wrapping is both easier and what the user gets if they use vkCmdCopyQueryPoolResults and we should be consistent. We could make vkCmdCopyQueryPoolResults clamp but it's annoying and ends up burning extra batch for something the spec clearly doesn't require. Reviewed-By: Lionel Landwerlin <[email protected]>
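The two behaviours the specification allows differ only in how an oversized 64-bit result is reduced to 32 bits. Here is a small Python sketch of the distinction. The helper names are hypothetical and are not driver functions.

```python
UINT32_MAX = 0xFFFFFFFF


def wrap32(value: int) -> int:
    """Keep the low 32 bits, as a plain 32-bit store of a 64-bit result would."""
    return value & UINT32_MAX


def saturate32(value: int) -> int:
    """Clamp to the largest 32-bit value instead of wrapping."""
    return min(value, UINT32_MAX)
```

Wrapping needs no comparison at all, which is consistent with the message's point that clamping on the GPU copy path would cost extra batch space.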
* radeonsi: add new polaris12 pci id | Alex Deucher | 2017-03-17 | 1 | -0/+1
| | | | | | Reviewed-by: Marek Olšák <[email protected]> Cc: 17.0 <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* gallium/radeon: formalize that create_batch_query doesn't need pipe_context | Marek Olšák | 2017-03-17 | 3 | -13/+12
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* gallium/radeon: formalize that create_query doesn't need pipe_context | Marek Olšák | 2017-03-17 | 3 | -32/+32
| | | | | | for threaded gallium Reviewed-by: Timothy Arceri <[email protected]>
* gallium/radeon: reference pipe_resource in pipe_transfer | Marek Olšák | 2017-03-17 | 2 | -2/+5
| | | | | | for threaded gallium Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: compile all TGSI compute shaders asynchronously | Marek Olšák | 2017-03-17 | 1 | -44/+81
| | | | | | required by threaded gallium Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: require that compiler threads are enabled | Marek Olšák | 2017-03-17 | 2 | -11/+13
| | | | | | | threaded gallium can't use pipe_context's LLVM target machine, because create_shader_selector can be called from a non-driver thread. Reviewed-by: Timothy Arceri <[email protected]>
* trace: remove leftover assertions after pipe_resource wrapping removal | Marek Olšák | 2017-03-17 | 1 | -6/+0
|
* gallium/u_upload: make the first persistent mapping unsynchronized | Marek Olšák | 2017-03-17 | 1 | -0/+1
| | | | This is simpler for drivers.
* anv/device: init timestampPeriod from devinfo | Robert Bragg | 2017-03-17 | 1 | -3/+1
| | | | | | | | | | Now that there's a timebase_scale in gen_device_info which is effectively the 'period' this switches anv_GetPhysicalDeviceProperties to using this common device info to initialize the timestampPeriod device limit. Signed-off-by: Robert Bragg <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Allow a per gen timebase scale factor | Robert Bragg | 2017-03-17 | 6 | -27/+114
| Prior to Skylake the Gen HW timestamps were driven by a 12.5MHz clock
| with the convenient property of being able to scale by an integer (80)
| to nanosecond units.
|
| For Skylake the frequency is 12MHz or a scale factor of 83.333333
|
| This updates gen_device_info to track a floating point timebase_scale
| factor and makes corresponding _queryobj.c changes to no longer assume a
| scale factor of 80 works across all gens.
|
| Although the gen6_ code could have been left alone, the changes keep
| the code more comparable, and it now shares a few utility functions for
| scaling raw timestamps and calculating deltas. The utility for
| calculating deltas takes into account 32-bit or 36-bit overflow
| depending on the current kernel version.
|
| Note: this leaves the timestamp handling of ARB_query_buffer_object
| untouched, which continues to use an incorrect scale of 80 on Skylake
| for now. This is more awkward to solve since the scaling is currently
| done using a very limited uint64 ALU available to the command parser
| that doesn't support multiply or divide, where it's already taking a
| large number of instructions just to effectively multiply by 80.
|
| This fixes piglit arb_timer_query-timestamp-get on Skylake.
|
| v2: (Ken) Update timebase_scale for platforms past Skylake/Broxton too.
|
| Signed-off-by: Robert Bragg <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
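The arithmetic behind the scale factors and the overflow-aware delta is easy to check by hand. The Python sketch below works through it under stated assumptions and is not the i965 code: the function names are hypothetical, and rounding to the nearest nanosecond is a choice made for the illustration rather than something the commit message specifies.

```python
def timebase_scale(clock_hz: float) -> float:
    """Nanoseconds per timestamp tick for a given clock frequency."""
    return 1e9 / clock_hz


def raw_to_ns(raw_ticks: int, scale: float) -> int:
    """Scale a raw timestamp to nanoseconds (rounded; an assumption here)."""
    return round(raw_ticks * scale)


def raw_delta(start: int, end: int, bits: int) -> int:
    """Elapsed ticks between two raw timestamps of the given counter width.

    Masking the difference handles a single wrap of a 32-bit or 36-bit
    counter between the two samples.
    """
    return (end - start) & ((1 << bits) - 1)
```

One second on a 12MHz clock is 12,000,000 ticks. Scaling that by 80 yields 960,000,000 ns, which is 4% short, while scaling by 1e9 / 12e6 yields the expected 1,000,000,000 ns.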
* anv/device: Remove a use of a compound literal | Jason Ekstrand | 2017-03-17 | 1 | -1/+1
| | | | | | | | Older versions of GCC don't like compound literals in static const variable declarations because they don't think it's an actual constant value. Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: bounds checks while concatenating sysfs paths | Robert Bragg | 2017-03-17 | 1 | -11/+32
| | | | | | | | | | | | | | This adds some missing return value checks for all uses of snprintf in brw_performance_query.c. This also switches a use of strncpy + strncat for snprintf for consistency and to avoid the chance of the strncpy leaving an unterminated string in the dest buffer if the src is too long. This issue with strncpy was picked up by Coverity. CID: 1402201 Signed-off-by: Robert Bragg <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* mesa: automake: add all headers to the tarball. | Emil Velikov | 2017-03-17 | 1 | -0/+2
| | | | | Fixes: d8d81fbc316 ("mesa: Add infrastructure for a worker thread to process GL commands.") Signed-off-by: Emil Velikov <[email protected]>
* mapi: automake: add all python scripts to EXTRA_DIST | Emil Velikov | 2017-03-17 | 1 | -0/+3
| | | | | | | Otherwise it'll be missing in the tarball and make distcheck will fail. Fixes: 05dd4a1104e ("glapi: Generate GL API marshalling code from the XML.") Signed-off-by: Emil Velikov <[email protected]>