Commit message | Author | Age | Files | Lines
* anv: Add valid_bufer_usage to the memory type metadata | Jason Ekstrand | 2017-05-23 | 2 | -8/+26
    Instead of returning valid types as just a number, we now walk the list and check the buffer's usage against the usage flags we store in the new anv_memory_type structure. Currently, valid_buffer_usage == ~0.
    Reviewed-by: Nanley Chery <[email protected]>
    Cc: "17.1" <[email protected]>
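The walk described in the commit above can be sketched roughly as follows. This is only an illustration, not the driver's code: `memory_type_sketch` and `compatible_type_bits` are hypothetical names, and the subset test (every requested usage bit must be allowed by the type) is an assumption about how the flags are compared.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-type metadata mentioned in the commit. */
struct memory_type_sketch {
    uint32_t valid_buffer_usage; /* usage flags this memory type accepts */
};

/* Build a bitmask with bit i set when memory type i accepts every bit in usage. */
static uint32_t compatible_type_bits(const struct memory_type_sketch *types,
                                     uint32_t type_count, uint32_t usage)
{
    uint32_t bits = 0;
    for (uint32_t i = 0; i < type_count && i < 32; i++) {
        if ((types[i].valid_buffer_usage & usage) == usage)
            bits |= 1u << i;
    }
    return bits;
}
```

With valid_buffer_usage set to ~0 for every type, as the commit message says is currently the case, every type passes the check and the mask has one bit per type.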
* anv: Determine the type of mapping based on type metadata | Jason Ekstrand | 2017-05-23 | 2 | -7/+7
    Before, we were just comparing the type index to 0. Now we actually look the type up in the table and check its properties to determine what kind of mapping we want to do.
    Reviewed-by: Nanley Chery <[email protected]>
    Cc: "17.1" <[email protected]>
* anv: Set up memory types and heaps during physical device init | Jason Ekstrand | 2017-05-23 | 2 | -44/+77
    Reviewed-by: Nanley Chery <[email protected]>
    Cc: "17.1" <[email protected]>
* anv: Predicate 48bit support on gen >= 8 | Jason Ekstrand | 2017-05-23 | 1 | -1/+6
    This doesn't matter right now since it only affects whether or not we set the kernel bit but, if we ever do anything else based on it, we'll want it to be correct per-gen.
    Reviewed-by: Nanley Chery <[email protected]>
    Cc: "17.1" <[email protected]>
* anv/image: Get rid of the memset(aux, 0, sizeof(aux)) hack | Jason Ekstrand | 2017-05-23 | 1 | -28/+0
    Up until now, we've been memsetting the auxiliary surface to 0 at BindImageMemory time to ensure that it is properly initialized. However, this isn't correct because apps are allowed to freely alias memory between different images and buffers so long as they properly track whether or not a particular image is valid and, if it isn't, transition from UNINITIALIZED to something else before using it. We now implement those transitions so we can drop the hack.
    Reviewed-by: Nanley Chery <[email protected]>
    Cc: "17.1" <[email protected]>
* anv: Handle transitioning depth from UNDEFINED to other layouts | Jason Ekstrand | 2017-05-23 | 2 | -19/+19
    Reviewed-by: Nanley Chery <[email protected]>
    Cc: "17.1" <[email protected]>
* anv: Handle color layout transitions from the UNINITIALIZED layout | Jason Ekstrand | 2017-05-23 | 3 | -2/+108
    This causes dEQP-VK.api.copy_and_blit.resolve_image.partial.* to start failing due to test bugs. See CL 1031 for a test fix.
    Reviewed-by: Nanley Chery <[email protected]>
    Cc: "17.1" <[email protected]>
* st/nine: Fix a regression and syntax cleanup | Axel Davy | 2017-05-24 | 4 | -19/+16
    A few cleanups, in particular properly initializing the new pipe_draw_info fields.
    This should fix the regression caused by 330d0607ed60fd3edca192e54b4246310f06652f.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101088
    Signed-off-by: Axel Davy <[email protected]>
    Tested-by: Edmondo Tommasina <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* mesa: Remove GL_APPLE_vertex_array_object stubs | Ian Romanick | 2017-05-23 | 4 | -26/+4
    Mark the functions 'exec="skip"' in the XML instead. libGL will still have the functions, but the driver won't try to use them.
    I verified that this commit works with piglit's 'object-namespace-pollution glClear vertex-array' on x64 with a driver built from mesa-12.0.3 tag. In fairness, this test also works with a libGL built from 7927d03.
    I believe it continues to work because on non-Windows platforms we generate some extra, dummy dispatch functions that can be used when a driver requests a function unknown to libGL. This was done to provide some "forward" compatibility with drivers that need more functions. This doesn't work on Windows because the Windows calling convention is for the callee to clean up the stack. That's the theory anyway.
    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* gallium/radeon: pipe AMDGPU_INFO_NUM_VRAM_CPU_PAGE_FAULTS into gallium HUD | Marek Olšák | 2017-05-23 | 5 | -2/+16
    Reviewed-by: Samuel Pitoiset <[email protected]>
* freedreno/ir3: switch to NIR by default | Rob Clark | 2017-05-23 | 2 | -16/+2
    Now that we lower vars to regs, we no longer regress for anything that does complex dereferences. (With tgsi, derefs are already lowered before tgsi_to_nir, but not with glsl_to_nir.) In fact, bypassing tgsi actually fixes a few things.
    So make NIR the default (finally!)
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: lower arrays to regs | Rob Clark | 2017-05-23 | 2 | -150/+185
    Instead of using load/store_var intrinsics, which can have complex derefs in the case of multi-dimensional arrays, lower these to regs and handle the direct/indirect loads in get_src() and stores in put_dst().
    This should let us switch to using nir by default.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add put_dst() | Rob Clark | 2017-05-23 | 1 | -0/+24
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: code-motion | Rob Clark | 2017-05-23 | 1 | -55/+55
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix cmdline compiler | Rob Clark | 2017-05-23 | 1 | -2/+0
    standalone_compiler_cleanup() frees the glsl types, among other things, so it needs to come after nir->ir3. But since we exit after dumping the disassembly, it is easier to just not call it at all.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add missing nir_opt_copy_prop_vars() pass | Rob Clark | 2017-05-23 | 1 | -0/+1
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: need different compiler options for a5xx | Rob Clark | 2017-05-23 | 4 | -5/+28
    vertex_id_zero_based differs.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: remove copapasta from a4xx | Rob Clark | 2017-05-23 | 1 | -2/+1
    Won't ever hit this w/ a420 gpu, so this is dead code. Need to get astc working to know whether to rip this out entirely or not.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: only support SSBOs with nir | Rob Clark | 2017-05-23 | 1 | -0/+3
    tgsi_to_nir does not support them. Note that compute shaders already force nir.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: add some missing texture formats | Rob Clark | 2017-05-23 | 1 | -51/+51
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: provoking vertex | Rob Clark | 2017-05-23 | 6 | -44/+40
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers | Rob Clark | 2017-05-23 | 6 | -55/+64
    Signed-off-by: Rob Clark <[email protected]>
* nir/lower-atomics-to-ssbo: remove atomic_uint arrays too | Rob Clark | 2017-05-23 | 1 | -1/+9
    Maybe there is a better way to do this. But by the time we get to assigning uniform locs, we want the atomic_uint's to all be gone, otherwise we assert in st_glsl_attrib_type_size().
    Signed-off-by: Rob Clark <[email protected]>
* nir/lower-atomics-to-ssbo: fix num_components | Rob Clark | 2017-05-23 | 1 | -0/+5
    Fixes some piglits like arb_shader_atomic_counters-active-counters.
    Signed-off-by: Rob Clark <[email protected]>
* radeon: pass flags that can change shaders to disk_cache_create() | Timothy Arceri | 2017-05-23 | 1 | -1/+2
    I wasn't sure if I should filter the flags so that we only use flags that actually change the shader output. To avoid manual updates we just pass in everything for now.
    Reviewed-by: Eduardo Lima Mitev <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* util/disk_cache: add new driver_flags param to cache keys | Timothy Arceri | 2017-05-23 | 5 | -15/+23
    This will be used for things such as adding driver-specific environment variables to the key, allowing us to set environment vars that change the shader without the driver ignoring them when it finds existing shaders in the cache.
    Reviewed-by: Eduardo Lima Mitev <[email protected]>
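The idea of folding driver flags into the cache key can be sketched as below. This is a minimal illustration under stated assumptions: FNV-1a is used only as a compact stand-in for whatever hash the cache really uses, and `fnv1a64` and `cache_key_sketch` are hypothetical names rather than functions from the disk cache.

```c
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a over a byte range, continuing from an existing hash state.
 * Chosen for brevity only (assumption, not the cache's real hash). */
static uint64_t fnv1a64(uint64_t hash, const unsigned char *bytes, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        hash ^= bytes[i];
        hash *= 0x100000001b3ULL; /* FNV-1a 64-bit prime */
    }
    return hash;
}

/* Key depends on both the shader bytes and the driver flags, so changing a
 * flag produces a different key even when the shader source is unchanged. */
static uint64_t cache_key_sketch(const void *shader, size_t shader_len,
                                 uint64_t driver_flags)
{
    unsigned char flag_bytes[8];
    for (int i = 0; i < 8; i++)
        flag_bytes[i] = (unsigned char)(driver_flags >> (8 * i));

    uint64_t hash = 0xcbf29ce484222325ULL; /* FNV-1a 64-bit offset basis */
    hash = fnv1a64(hash, shader, shader_len);
    return fnv1a64(hash, flag_bytes, sizeof(flag_bytes));
}
```

Because the flags are part of the key, a shader compiled under one set of flags is never returned for a lookup made under another, which is the behaviour the commit message asks for.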
* u_format_test: Ignore S3TC errors. | Jose Fonseca | 2017-05-22 | 1 | -0/+25
    This prevents spurious failures when libtxc-dxtn-s2tc is installed.
    Note: lp_test_format doesn't need any change since we were already ignoring S3TC failures there.
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Rhys Kidd <[email protected]>
* docs: Document ASTC extension support for SKL and BXT | Nanley Chery | 2017-05-22 | 1 | -2/+2
    v2: Remove the '+' after bxt
    Reviewed-by: Anuj Phogat <[email protected]>
    Signed-off-by: Nanley Chery <[email protected]>
* i965: Enable ASTC HDR for Broxton | Nanley Chery | 2017-05-22 | 1 | -0/+3
    This platform passes the following GLES3 tests: ES3-CTS.functional.texture.compressed.astc.endpoint_value_hdr_cem_*
    Reviewed-by: Anuj Phogat <[email protected]>
    Signed-off-by: Nanley Chery <[email protected]>
* intel/isl: Add ASTC HDR to format lists and helpers | Nanley Chery | 2017-05-22 | 3 | -2/+58
    Reviewed-by: Anuj Phogat <[email protected]>
    Signed-off-by: Nanley Chery <[email protected]>
* radv: Add compute HTILE fast clear. | Bas Nieuwenhuizen | 2017-05-22 | 1 | -1/+93
    Not really what the fast depth clear does, no matter whether you use EXPCLEAR or not. Seems the fast clear using the DB HW always touches the main buffer.
    Signed-off-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Use correct clear words for HTILE. | Bas Nieuwenhuizen | 2017-05-22 | 1 | -4/+13
    Did some RE'ing what several HTILE words give when read from a descriptor with HTILE compression enabled. Seems to align with -pro usage for D16 too.
    Signed-off-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Add queue masks for htile usage determination. | Bas Nieuwenhuizen | 2017-05-22 | 4 | -20/+41
    Signed-off-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Specify semantics of HTILE layout helpers. | Bas Nieuwenhuizen | 2017-05-22 | 3 | -3/+20
    And correct implementation to specify only what we support.
    Signed-off-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Don't use a separate can_expclear. | Bas Nieuwenhuizen | 2017-05-22 | 5 | -40/+11
    We never use EXPCLEAR clears.
    Signed-off-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* mesa: GL_ARB_shader_subroutine is not optional in core profile | Ian Romanick | 2017-05-22 | 9 | -49/+3
       text    data   bss     dec    hex filename
    7038459  235248 37280 7310987 6f8e8b 32-bit i965_dri.so before
    7038227  235248 37280 7310755 6f8da3 32-bit i965_dri.so after
    6681438  303400 50608 7035446 6b5a36 64-bit i965_dri.so before
    6681254  303400 50608 7035262 6b597e 64-bit i965_dri.so after
    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* drirc: Add allow_glsl_builtin_variable_redeclaration for Dead Island Riptide Definitive Edition | Benedikt Schemmer | 2017-05-22 | 1 | -0/+4
    Signed-off-by: Marek Olšák <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* gallium/radeon: add a query for monitoring Gallium thread load | Marek Olšák | 2017-05-22 | 2 | -0/+13
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx9: compile shaders with +xnack | Marek Olšák | 2017-05-22 | 1 | -6/+7
    so that LLVM doesn't allocate SGPRs where XNACK is.
    Cc: 17.1 <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* vc4: Remove dead code in vc4_dump_surface_msaa() | Rhys Kidd | 2017-05-22 | 1 | -6/+0
    Coverity caught the use of dead code copy-paste for found_colors[] and num_found_colors.
    CID: 1341850
    Signed-off-by: Rhys Kidd <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* egl/wayland: verify event queue was allocated | Lionel Landwerlin | 2017-05-22 | 1 | -1/+1
    We've already verified that 'window' wasn't NULL; I'm guessing this allocation error is about the newly created queue.
    CID: 1409754
    Fixes: 03dd9a88b0b ("egl/wayland: Use per-surface event queues")
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Daniel Stone <[email protected]>
* mesa: add APPLE_vertex_array_object stubs | Timothy Arceri | 2017-05-22 | 7 | -1/+53
    APPLE_vertex_array_object support was removed in 7927d0378fc7. However, it turns out we can't remove the functions because this can cause issues when libglapi is used together with DRI drivers built prior to said commit.
    Fixes: 7927d0378fc ("mesa: drop APPLE_vertex_array_object support")
    Reviewed-by: Emil Velikov <[email protected]>
* glsl: set mask via initialisation list rather than in constructor body | Timothy Arceri | 2017-05-22 | 1 | -3/+1
    Potentially more efficient as it may avoid the struct being initialised twice.
    Also add var to the initialisation list while we are here.
    Reviewed-by: Samuel Pitoiset <[email protected]>
* ralloc: Use strnlen() inside of strncat() | Vladislav Egorov | 2017-05-22 | 1 | -6/+1
    If the str is long or isn't null-terminated, strlen() could take a lot of time or even crash. I don't know why it was used in the first place, maybe for platforms without strnlen(), but strnlen() is already used inside of ralloc_strndup(), so this change should not additionally break anything.
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
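A bounded append in the spirit of this change might look like the sketch below. It is not ralloc code: `bounded_len` and `append_bounded` are hypothetical helpers built on plain realloc(), and `bounded_len` is a memchr()-based equivalent of POSIX strnlen() used here only so the example compiles in strict ISO C.

```c
#include <stdlib.h>
#include <string.h>

/* Length of str, reading at most n bytes (the behaviour of strnlen()). */
static size_t bounded_len(const char *str, size_t n)
{
    const char *end = memchr(str, '\0', n);
    return end ? (size_t)(end - str) : n;
}

/* Append at most n bytes of str to the heap-allocated string *dest.
 * Returns 1 on success, 0 if the reallocation fails (*dest is untouched). */
static int append_bounded(char **dest, const char *str, size_t n)
{
    size_t old_len = strlen(*dest);
    size_t add_len = bounded_len(str, n); /* never scans past n bytes */
    char *grown = realloc(*dest, old_len + add_len + 1);
    if (grown == NULL)
        return 0;
    memcpy(grown + old_len, str, add_len);
    grown[old_len + add_len] = '\0';
    *dest = grown;
    return 1;
}
```

The point of the bounded scan is that a source buffer without a terminating null byte is never read beyond the n bytes the caller promised, whereas strlen() would keep going.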
* glcpp: Skip unnecessary line continuations removal | Vladislav Egorov | 2017-05-22 | 1 | -2/+8
    The overwhelming majority of shaders don't use line continuations. In my shader-db, only shaders from the Talos Principle and Serious Sam used them, less than 1% of all shaders. Optimize for this case and don't do any copying if no line continuation was found.
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
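The fast path can be illustrated with a small pre-scan. This is a sketch only, assuming a continuation is a backslash immediately followed by a newline or carriage return; `has_line_continuation` is a hypothetical name, not a glcpp function.

```c
#include <string.h>

/* Return 1 if src contains a backslash directly followed by '\n' or '\r',
 * otherwise 0. A caller could skip the copy-and-strip step when this is 0. */
static int has_line_continuation(const char *src)
{
    const char *p = src;
    while ((p = strchr(p, '\\')) != NULL) {
        if (p[1] == '\n' || p[1] == '\r')
            return 1;
        p++; /* lone backslash: keep searching after it */
    }
    return 0;
}
```

For the common shader with no backslashes at all, the scan is a single strchr() pass and the original buffer can be used unchanged.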
* glcpp: Avoid unnecessary strcmp() | Vladislav Egorov | 2017-05-22 | 1 | -5/+9
    strcmp() is slow. Initiate comparison with "__LINE__" or "__FILE__" only if the identifier starts with '_', which is rare.
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
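The first-character filter described above can be sketched like this. The enum and `classify_identifier` are hypothetical names for illustration; only the "__LINE__" and "__FILE__" strings come from the commit message.

```c
#include <string.h>

enum builtin_macro { MACRO_NONE, MACRO_LINE, MACRO_FILE };

/* Only pay for strcmp() when the identifier starts with '_'. */
static enum builtin_macro classify_identifier(const char *id)
{
    if (id[0] != '_')
        return MACRO_NONE; /* cheap rejection for the common case */
    if (strcmp(id, "__LINE__") == 0)
        return MACRO_LINE;
    if (strcmp(id, "__FILE__") == 0)
        return MACRO_FILE;
    return MACRO_NONE;
}
```

Most identifiers in a shader fail the single-byte test, so the two string comparisons run only for the rare names with a leading underscore.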
* main: Move hashLockMutex/hashUnlockMutex to header and inline | Thomas Helland | 2017-05-22 | 2 | -45/+40
    Reviewed-by: Timothy Arceri <[email protected]>
* main: Use _mesa_HashLock/UnlockMutex consistently | Thomas Helland | 2017-05-22 | 1 | -13/+10
    This is shorter and easier on the eyes. At the same time, this also ensures that we are always asserting that the table pointer is not NULL. Previously that was not done in all situations.
    Reviewed-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* util: Change the pointer hashing function | Thomas Helland | 2017-05-22 | 1 | -1/+2
    Use our knowledge that pointers are at least 4 byte aligned to remove the useless digits. Then shift by 6, 10, and 14 bits and add this to the original pointer, effectively folding in the entropy of the higher bits of the pointer into a 4-bit section. Stopping at 14 means we can add the entropy from 18 bits, or at least a 600Kbyte section of memory. Assuming that ralloc allocates from a linearly allocated heap less than this, we can make a very efficient pointer hashing function for our use case.
    Even if we are not on an architecture that is 4 byte aligned, there is still a high chance that the thing we are allocating is at least 8 bytes in size, so even then we will have entropy into the third bit.
    The 4 bit increment on the shifts is chosen rather arbitrarily; if we had chosen a 3 bit increment we would need to add another xor to cover a decently sized memory pool. Increasing it to 5 bits would spread our entropy more, possibly hurting us with more collisions on hash tables of size less than 32. With a hash table of size 16 there are a max of 11 entries, and we can assume that with such a small table collisions are not that painful.
    This allows us to hash the whole 32 or 64 bit pointer at once, instead of running FNV1a, looping through each byte and doing increments, decrements, muls, and xors on every byte.
    This cuts _mesa_hash_data from 1.5% on profiles, to making _mesa_hash_pointer show up with a 0.09% share. Collisions on insertion actually seem to be ever so slightly lower with this hash function, as found by printing a loop counter and sorting the data. perf stat shows a 1.5% reduction in instruction count, and a 5% reduction in stalled cycles. Shader-db runtime goes from 225 to 220 seconds.
    No instruction-count changes in shader-db, but there are some minor changes in cycle-count that are likely caused by nir walking a set in some of its passes, and this causing a different ordering. That might eventually lead to a difference in register allocation. However, the effect is a net positive:
    total cycles in shared programs: 24739550 -> 24738482 (-0.00%)
    cycles in affected programs: 374468 -> 373400 (-0.29%)
    helped: 178
    HURT: 49
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
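The shift-and-fold scheme described in this message can be sketched as follows. Two points are assumptions rather than facts from the log: the message says the shifted values are "added", but it also mentions "another xor", so this sketch combines them with XOR; and `hash_pointer_sketch` is a hypothetical name.

```c
#include <stdint.h>

/* Fold higher pointer bits down onto the low bits.
 * >> 2 drops the two alignment bits that are always zero for 4-byte-aligned
 * pointers; >> 6, >> 10 and >> 14 fold in successively higher 4-bit groups. */
static uint32_t hash_pointer_sketch(const void *pointer)
{
    uintptr_t num = (uintptr_t)pointer;
    return (uint32_t)((num >> 2) ^ (num >> 6) ^ (num >> 10) ^ (num >> 14));
}
```

Compared with a byte-wise hash such as FNV-1a, this touches the pointer value once with a handful of shifts and XORs, which is the saving the profile numbers above are attributed to.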
* vulkan/wsi/wayland: Fix proxy wrappers for swapchain recreation | Philipp Zabel | 2017-05-20 | 1 | -3/+10
    Before the swapchain event queue is destroyed, all proxy objects that reference it must be dropped. Otherwise we risk a use-after-free if a frame callback event or buffer release events are received afterwards.
    This happens when an application destroys and recreates a swapchain in FIFO mode between two frames without using the VkSwapchainCreateInfoKHR::oldSwapchain mechanism to keep the old swapchain until after the next redraw.
    Fixes: 5034c615582a ("vulkan/wsi/wayland: Use proxy wrappers for swapchain")
    Signed-off-by: Philipp Zabel <[email protected]>
    Reviewed-by: Daniel Stone <[email protected]>
    Cc: [email protected]