Commit message (Author, Age, Files, Lines)
* swr/rast: Allow gather of floats from fetch shader with 2-4GB offsets (Tim Rowley, 2017-09-06, 2 files, -1/+7)
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* radv: fix error code when resizing the upload BO (Samuel Pitoiset, 2017-09-06, 1 file, -1/+1)
| | | | | | | malloc() failures are unrelated to the device memory. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* mesa/st/st_glsl_to_tgsi_temprename.cpp: Fix compilation with MSVC (Gert Wollny, 2017-09-06, 1 file, -1/+9)
| | | | | | | | | | If <windows.h> is included then max is a macro that clashes with std::numeric_limits::max, hence undefine it. For some reason the struct access_record is not recognized outside the anonymous namespace, so make it a class. The patch was successfully tested on AppVeyor. Reviewed-by: Nicolai Hähnle <[email protected]>
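
The macro clash described in this commit can be reproduced without MSVC. The following is a minimal sketch, not the code from the patch: the `max` macro is defined by hand as a stand-in for what `<windows.h>` provides, and `largest_int()` is a hypothetical helper used only for illustration.

```cpp
#include <cassert>
#include <limits>

// Stand-in for the function-like macro that <windows.h> defines when
// NOMINMAX is not set (defined by hand so the sketch builds anywhere).
#define max(a, b) (((a) > (b)) ? (a) : (b))

// Without this #undef, the call in largest_int() would be treated as an
// invocation of the two-argument macro and fail to compile.
#ifdef max
#undef max
#endif

// Hypothetical helper, not part of the patch.
inline int largest_int()
{
   return std::numeric_limits<int>::max();
}
```

Other commonly used ways around the same clash are defining `NOMINMAX` before including `<windows.h>`, or writing the call as `(std::numeric_limits<int>::max)()` so that the function-like macro is not expanded.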
* mesa/st: glsl_to_tgsi: tie in new temporary register merge approach (Gert Wollny, 2017-09-06, 1 file, -50/+16)
| This patch replaces the old register lifetime estimation and rename
| mapping evaluation with the new one.
|
| To compare the performance of the current and the new implementation,
| shader-db was run in a single thread.
|
| -----------------------------------------------------------
|                         old            new(std::sort)
| ---------------- time ./run -j1 shaders --------------------
| real                    5.80s          5.75s
| user                    5.75s          5.70s
| sys                     0.05s          0.05s
| ---- valgrind --tool=callgrind --dump-instr=yes ------------
| merge                   0.08%          0.18%
| estimate lifetime       0.02%          0.11%
| evaluate mapping        (incl=0.3%)    0.04%
| apply mapping           0.03%          0.02%
| --- perf (approximate because of statistic sampling) ----
| merge (total)           0.09%          0.16%
| estimate lifetime       0.03%          0.10%
| evaluate mapping        (incl=0.02%)   0.04%
| apply mapping           0.04%          0.04%
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/st: glsl_to_tgsi: Add test set for evaluation of rename mapping (Gert Wollny, 2017-09-06, 1 file, -0/+169)
| | | | | | | The patch adds tests for the register rename mapping evaluation and combined life time estimation and renaming. Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/st: glsl_to_tgsi: add register rename mapping evaluator (Gert Wollny, 2017-09-06, 3 files, -5/+137)
| | | | | | | | | | | | | | | The remapping evaluator first sorts the temporary registers in ascending order based on their first life time instruction, and then uses a binary search to find merge candidates. For the initial sorting it uses std::sort because qsort is quite slow in comparison. By removing the define USE_STL_SORT in src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp one can enable the alternative code path that uses qsort. Registers that are not written to are not considered for renaming since in glsl_to_tgsi_visitor::renumber_registers they are eliminated anyway. Reviewed-by: Nicolai Hähnle <[email protected]>
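
The sort-then-binary-search idea from this commit message can be sketched roughly as follows. This is a simplified illustration under assumptions, not the code added by the patch: `RegLifetime`, `evaluate_renames()` and the greedy merge policy are hypothetical, and a result of -1 means "not renamed".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record: a register and the instruction range in which it is live.
struct RegLifetime {
   int reg;
   int begin;
   int end;
};

// Returns rename[r] = register that r can be merged into, or -1.
std::vector<int> evaluate_renames(std::vector<RegLifetime> regs, int nregs)
{
   std::vector<int> rename(nregs, -1);
   std::vector<bool> merged(regs.size(), false);

   // Step 1: sort ascending by the first instruction of each life time.
   std::sort(regs.begin(), regs.end(),
             [](const RegLifetime &a, const RegLifetime &b) {
                return a.begin < b.begin;
             });

   for (std::size_t t = 0; t < regs.size(); ++t) {
      if (merged[t])
         continue;
      std::size_t from = t + 1;
      for (;;) {
         // Step 2: binary search for the first register whose life time
         // starts after the (possibly extended) life time of the target.
         auto it = std::upper_bound(
            regs.begin() + from, regs.end(), regs[t].end,
            [](int end, const RegLifetime &l) { return end < l.begin; });
         // Skip candidates that were already merged into an earlier target.
         while (it != regs.end() && merged[it - regs.begin()])
            ++it;
         if (it == regs.end())
            break;
         assert(it->reg >= 0 && it->reg < nregs);
         rename[it->reg] = regs[t].reg;
         merged[it - regs.begin()] = true;
         regs[t].end = it->end;          // the target now covers both ranges
         from = (it - regs.begin()) + 1;
      }
   }
   return rename;
}
```

With the life times r0 = [0,3], r1 = [2,5], r2 = [4,6] and r3 = [7,9], this sketch merges r2 and r3 into r0 and leaves r1 alone, because r1 overlaps r0 and nothing that starts after r1 ends is still unmerged.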
* mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker (Gert Wollny, 2017-09-06, 6 files, -4/+1483)
| | | | | | This patch adds a set of unit tests for the new lifetime tracker. Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/st: glsl_to_tgsi: implement new temporary register lifetime tracker (Gert Wollny, 2017-09-06, 3 files, -0/+943)
| | | | | | | | | | | | | | | | | | This patch adds a class for tracking the life times of temporary registers in the glsl to tgsi translation. The algorithm runs in three steps: First, in order to minimize the number of needed memory allocations the program is scanned to evaluate the number of scopes. Then, the program is scanned a second time to record the important register access time points: first and last reads and writes and their link to the execution scope (loop, if/else branch, switch case). In the third step for each register the actual minimal life time is evaluated. In addition, when compiled in debug mode (i.e. NDEBUG is not defined) the shaders and estimated temporary life times can be logged to stderr by setting the environment variable GLSL_TO_TGSI_RENAME_DEBUG. Reviewed-by: Nicolai Hähnle <[email protected]>
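
To illustrate only the "record access time points" part of the description, here is a heavily simplified sketch for straight-line code. It deliberately ignores execution scopes (loops, branches, switch cases), which are the hard part the real tracker handles. `Access`, `Lifetime` and `estimate_lifetimes()` are hypothetical names, and the one-access-per-instruction model is an assumption made for brevity.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model: instruction i performs exactly one register access.
struct Access {
   int reg;
   bool is_write;
};

// begin/end are instruction indices; -1 means "never written".
struct Lifetime {
   int begin = -1;
   int end = -1;
};

std::vector<Lifetime> estimate_lifetimes(const std::vector<Access> &prog,
                                         int nregs)
{
   std::vector<Lifetime> lt(nregs);
   for (int line = 0; line < static_cast<int>(prog.size()); ++line) {
      const Access &a = prog[line];
      assert(a.reg >= 0 && a.reg < nregs);
      Lifetime &l = lt[a.reg];
      if (a.is_write && l.begin < 0)
         l.begin = line;              // first write opens the life time
      if (!a.is_write && l.begin >= 0)
         l.end = line;                // latest read after the first write
   }
   for (Lifetime &l : lt) {
      // A register that is written but never read lives only at its write.
      if (l.begin >= 0 && l.end < l.begin)
         l.end = l.begin;
   }
   return lt;
}
```

Registers that are never written keep the (-1, -1) marker, so a later renaming pass can skip them.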
* mesa/st: glsl_to_tgsi move some helper classes to extra files (Gert Wollny, 2017-09-06, 4 files, -287/+368)
| To prepare the implementation of a temp register lifetime tracker, some of
| the classes are moved into separate header/implementation files to make
| them accessible from other files.
|
| Specifically these are:
|
|   class st_src_reg;
|   class st_dst_reg;
|   class glsl_to_tgsi_instruction;
|   struct rename_reg_pair;
|
|   int swizzle_for_type(const glsl_type *type, int component);
|
| as inline:
|
|   bool is_resource_instruction(unsigned opcode);
|   unsigned num_inst_dst_regs(const glsl_to_tgsi_instruction *op);
|   unsigned num_inst_src_regs(const glsl_to_tgsi_instruction *op);
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* st_glsl_to_tgsi: rewrite rename registers to use array fully. (Dave Airlie, 2017-09-06, 1 file, -29/+26)
| | | | | | | | | | Instead of having to search the whole array, just use the whole thing and store a valid bit in there with the rename. Removes this from the profile on some of the fp64 tests Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
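
The "valid bit in the array" approach described here can be sketched as a table indexed directly by the old register number, so applying a rename is a constant-time lookup instead of a search. `RenameEntry` and `apply_rename()` are hypothetical names chosen for this illustration and are not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical table entry: one slot per temporary register.
struct RenameEntry {
   bool valid = false;   // true if this register has a rename
   int new_reg = 0;      // only meaningful when valid is set
};

// Look up the rename for 'reg' by indexing; no scan over the table.
inline int apply_rename(const std::vector<RenameEntry> &renames, int reg)
{
   assert(reg >= 0 && static_cast<std::size_t>(reg) < renames.size());
   return renames[reg].valid ? renames[reg].new_reg : reg;
}
```

The trade-off is that the table needs one entry per temporary even when few registers are renamed, in exchange for removing the per-lookup search.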
* radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug (Nicolai Hähnle, 2017-09-06, 5 files, -24/+85)
| | | | | | | | | | | | | | | | | | | When the HS wave is empty, the hardware writes the LS VGPRs starting at v0 instead of v2. Workaround by shifting them back into place when necessary. For simplicity, this is always done in the LS prolog. According to the hardware team, this will be fixed in future chips, so take that into account already. Note that this is not a bug fix, as the bug was already worked around by commit 166823bfd26 ("radeonsi/gfx9: add a temporary workaround for a tessellation driver bug"). This change merely replaces the workaround by one that should be better. v2: add workaround code to shader only when necessary v3: clarify the prefer_mono comment Reviewed-by: Marek Olšák <[email protected]>
* ac/debug: take ASIC generation into account when printing registers (Nicolai Hähnle, 2017-09-06, 2 files, -107/+177)
| | | | | | | | | | | There were some overlapping changes in gfx9 especially in the CB/DB blocks which made register dumps rather misleading. The split is along the lines of the header files, so we'll print VI-only fields on SI and CI, for example, but we won't print GFX9 fields on SI/CI/VI, and we won't print SI/CI/VI fields on GFX9. Acked-by: Marek Olšák <[email protected]>
* amd/common: pass chip_class to ac_dump_reg (Nicolai Hähnle, 2017-09-06, 3 files, -60/+75)
| | | | Acked-by: Marek Olšák <[email protected]>
* ac/sid_tables: add FieldTable object (Nicolai Hähnle, 2017-09-06, 1 file, -30/+85)
| | | | | | | | Automatically re-use table entries like StringTable and IntTable do. This allows us to get rid of the "fields_owner" logic, and simplifies the next change. Acked-by: Marek Olšák <[email protected]>
* ac/sid_tables: remove unused variable varname_values (Nicolai Hähnle, 2017-09-06, 1 file, -1/+0)
| | | | Acked-by: Marek Olšák <[email protected]>
* radeonsi/gfx9: always flush DB metadata on framebuffer changes (Nicolai Hähnle, 2017-09-06, 3 files, -4/+14)
| | | | | | | This fixes GL45-CTS.shader_image_load_store.basic-glsl-earlyFragTests. Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>
* util/ralloc: set prev-pointers correctly in ralloc_adopt (Nicolai Hähnle, 2017-09-06, 1 file, -1/+3)
| | | | | | | | | | | | Found by inspection. I'm not aware of any actual failures caused by this, but a precise sequence of ralloc_adopt and ralloc_free should be able to cause problems. v2: make the code slightly clearer (Eric) Reviewed-by: Eric Engestrom <[email protected]>
* mesa/main: Fix GetTextureImage error reporting (Iago Toral Quiroga, 2017-09-06, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | GetTex*Image should return INVALID_ENUM if target is not valid, however, GetTextureImage does not receive a target, and instead should return INVALID_OPERATION if the effective target is not valid. From the OpenGL 4.6 core profile spec, section 8.11 Texture Queries: "An INVALID_OPERATION error is generated by GetTextureImage if the effective target is not one of TEXTURE_1D, TEXTURE_2D, TEXTURE_3D, TEXTURE_1D_ARRAY, TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY, TEXTURE_RECTANGLE, or TEXTURE_CUBE_MAP (for GetTextureImage only)." Fixes: KHR-GL45.direct_state_access.textures_image_query_errors Reviewed-by: Samuel Pitoiset <[email protected]>
* egl: remove unused 'Screens' array from _egl_display (Tapani Pälli, 2017-09-06, 1 file, -1/+0)
| | | | | | | | | This was used by EGL_MESA_screen_surface that has been removed in commit 7a58262e58d8edac3308777def0950032628edee. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* Revert "radv: disable support for VEGA for now." (Dave Airlie, 2017-09-06, 1 file, -5/+0)
| | | | | | | | | | | | This reverts commit 611076a41aac3095a82dff2432943d7f8d429822. With the two previous commits, Vega should no longer be unstable. It doesn't pass CTS, but it can do a complete run, and games shouldn't hang anymore, so bring it back online. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: "17.2" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: set descriptor up for base_mip to level range. (Dave Airlie, 2017-09-06, 1 file, -1/+4)
| | | | | | | | | | This is required on GFX9, fixes a bug in Talos where all the mipmaps overlay each other. Just pushing this as well as it fixes Talos. Cc: "17.2" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: disable 1d/2d linear optimisation on gfx9. (Dave Airlie, 2017-09-06, 1 file, -3/+4)
| | | | | | | | | | | | | This causes hangs in some of the CTS tests with a 2d 1536x2 texture. This fixes hangs with: dEQP-VK.pipeline.image.suballocation.sampling_type.combined.view_type.1d_array.format.r4g4b4a4_unorm_pack16.count_1.size.512x1_array_of_3 If we re-enable it, make sure these don't regress. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: "17.2" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: fix buffer size on gfx9. (Dave Airlie, 2017-09-06, 1 file, -1/+1)
| | | | | | | | | | The VI sizing only applies to VI. This fixes: dEQP-VK.image.image_size.buffer.* Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Fix vkCopyImage with both depth and stencil aspects. (Bas Nieuwenhuizen, 2017-09-06, 1 file, -99/+107)
| | | | | Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <[email protected]>
* mesa/mtypes: repack gl_sampler_object. (Dave Airlie, 2017-09-06, 1 file, -1/+1)
| | | | | | | | Reduces the size from 160 to 152 bytes. Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa/mtypes: repack gl_texture_object. (Dave Airlie, 2017-09-06, 1 file, -5/+5)
| | | | | | | | Reduces the size from 1144 to 1128 bytes. Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa/mtypes: repack gl_shader_program_data. (Dave Airlie, 2017-09-06, 1 file, -3/+3)
| | | | | | | | This reduces the size from 144 bytes to 128 bytes. Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa/mtypes: reorganise gl_shader (Dave Airlie, 2017-09-06, 1 file, -4/+5)
| | | | | | | | | | This reduces the size from 200 to 182 bytes. Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa/mtypes: repack display list structs. (Dave Airlie, 2017-09-06, 1 file, -3/+2)
| | | | | | | | This reduces each of these by 8 bytes. Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa/mtypes: reduce size of gl_sync_object. (Dave Airlie, 2017-09-06, 1 file, -1/+1)
| | | | | | | | Drops from 40->32 bytes. Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa/mtypes: reorg vertex/fragment program state. (Dave Airlie, 2017-09-06, 1 file, -6/+6)
| | | | | | | | Reduces both of these by 8 bytes. Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa/bindless: reorder gl_bindless_image gl_bindless_sampler. (Dave Airlie, 2017-09-06, 1 file, -6/+6)
| | | | | | | | This makes these use 16 bytes instead of 24 bytes. Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
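
The size reductions in this series of repacking commits come from reordering members so that the compiler inserts less alignment padding. The sketch below shows the effect on a made-up struct. The field names and types are hypothetical and do not reproduce the actual `gl_bindless_sampler` layout.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical "before" layout: the 8-byte member sits between two
// 1-byte members, so padding is needed both before it and at the end.
struct BindlessOriginal {
   bool bound;
   std::uint64_t handle;
   std::uint8_t unit;
};

// Hypothetical "after" layout: the widest member comes first and the two
// small members share the space that used to be padding.
struct BindlessRepacked {
   std::uint64_t handle;
   std::uint8_t unit;
   bool bound;
};

static_assert(sizeof(BindlessRepacked) <= sizeof(BindlessOriginal),
              "reordering must not make the struct larger");
```

On a typical 64-bit ABI where `std::uint64_t` is 8-byte aligned, the first layout occupies 24 bytes and the second 16 bytes. Exact numbers depend on the platform, which is why the sketch only asserts the relative ordering.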
* radv: fix a memleak when compiling the GS copy shader (Samuel Pitoiset, 2017-09-05, 1 file, -0/+2)
| | | | | | | | Found by inspection. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* svga: move index buffer bind flag assertion (Charmaine Lee, 2017-09-05, 1 file, -3/+3)
| | | | | | | | | | | The buffer bind flags can be promoted in svga_buffer_handle(), so move the assertion after it. This has already been done for vertex buffer in commit 6b4bf7e8be, but it misses the one for index buffer. Fixes assertion running WarThunder. Reviewed-by: Neha Bhende <[email protected]>
* svga: avoid emitting redundant SetShaderResources and SetVertexBuffers (Charmaine Lee, 2017-09-05, 2 files, -18/+116)
| | | | | | | | | | | | | Minor performance improvement in avoiding binding the same shader resource or the same vertex buffer for the same slot. Tested with MTT glretrace. v2: Per Brian's suggestion, add a helper function to do vertex buffer comparison. v3: Change the helper function to vertex_buffers_equal(). Reviewed-by: Brian Paul <[email protected]>
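
The redundancy check described here can be sketched as a small cache of the last binding per slot. Only the helper name `vertex_buffers_equal()` comes from the commit message. Its signature, the `VertexBuffer` fields, `BindCache` and `bind_vertex_buffer()` are assumptions made for this illustration and are not the svga driver's actual types.

```cpp
#include <array>
#include <cassert>

// Hypothetical description of one vertex buffer binding.
struct VertexBuffer {
   const void *resource;
   unsigned stride;
   unsigned offset;
};

// Field-by-field comparison of two bindings (signature is an assumption).
inline bool vertex_buffers_equal(const VertexBuffer &a, const VertexBuffer &b)
{
   return a.resource == b.resource && a.stride == b.stride &&
          a.offset == b.offset;
}

// Hypothetical per-context cache of what is currently bound in each slot.
struct BindCache {
   std::array<VertexBuffer, 4> bound{};
   unsigned emitted = 0;   // number of bind commands actually issued
};

// Returns true if a bind command was emitted, false if it was redundant.
inline bool bind_vertex_buffer(BindCache &cache, unsigned slot,
                               const VertexBuffer &vb)
{
   assert(slot < cache.bound.size());
   if (vertex_buffers_equal(cache.bound[slot], vb))
      return false;        // same buffer in the same slot: skip the command
   cache.bound[slot] = vb;
   ++cache.emitted;
   return true;
}
```

Binding the same buffer twice to slot 0 emits one command. Changing only the offset counts as a different binding and emits a second one.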
* spirv: Add support for the HelperInvocation builtin (Jason Ekstrand, 2017-09-05, 1 file, -1/+4)
| | | | | | | | I have no idea how this got missed but it's been missing since forever. Cc: [email protected] Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* loader/dri3: Use client local back to front blit in copySubBuffer if available (Thomas Hellstrom, 2017-09-05, 1 file, -9/+7)
| | | | | | | | | | | | | The copySubBuffer functionality always attempted a server side blit from back to fake front if a fake front was present, and we weren't displaying on a remote GPU. Now that we always have local blit capability on modern drivers, first attempt a local blit, and only if that fails, try the server blit. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Axel Davy <[email protected]>
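
The new ordering (local blit first, server blit only as a fallback) amounts to the control flow sketched below. The two callbacks stand in for the real blit paths. `BlitPath` and `copy_back_to_fake_front()` are hypothetical names and not part of the loader's API.

```cpp
#include <cassert>
#include <functional>

// Which path ended up performing the copy (hypothetical, for illustration).
enum class BlitPath { Local, Server, Failed };

// Try the client-local blit first; fall back to the server-side blit only
// if the local one is unavailable or reports failure.
inline BlitPath copy_back_to_fake_front(const std::function<bool()> &local_blit,
                                        const std::function<bool()> &server_blit)
{
   if (local_blit && local_blit())
      return BlitPath::Local;
   if (server_blit && server_blit())
      return BlitPath::Server;
   return BlitPath::Failed;
}
```

An empty `std::function` models a path that is not available at all, so a caller without local blit support still reaches the server path.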
* radeonsi/gfx9: implement primitive binning (Marek Olšák, 2017-09-05, 10 files, -7/+489)
| | | | | | | This increases performance, but it was tuned for Raven, not Vega. We don't know yet how Vega will perform, hopefully not worse. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add more state flags into si_state_dsa (Marek Olšák, 2017-09-05, 2 files, -1/+23)
| | | | | | | 3 flags for primitive binning, 2 flags for out-of-order rasterization (but that will be done some other time) Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: don't use BREAK_BATCH and FLUSH_DFSM if DFSM is disabled (Marek Olšák, 2017-09-05, 2 files, -3/+4)
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* vbo: fix build errors on android (Tapani Pälli, 2017-09-05, 1 file, -1/+1)
| incompatible pointer to integer conversion assigning to
| 'GLintptr' (aka 'int') from 'const char *' [-Werror,-Wint-conversion]
|     offset = indices;
|            ^ ~~~~~~~
|
| Fixes: 2d93b462b4d ("vbo: fix offset in minmax cache key")
| Signed-off-by: Tapani Pälli <[email protected]>
| Reviewed-by: Charmaine Lee <[email protected]>
| Reviewed-by: Emil Velikov <[email protected]>
* docs: add news item and link release notes for 17.2.0 (Emil Velikov, 2017-09-04, 2 files, -0/+8)
| | | | Signed-off-by: Emil Velikov <[email protected]>
* docs: add sha256 checksums for 17.2.0 (Emil Velikov, 2017-09-04, 1 file, -1/+2)
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit b4473dd5191878249ccb53f40407206f1e57fa6f)
* docs: Update 17.2.0 release notes (Emil Velikov, 2017-09-04, 1 file, -2/+149)
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit f5925b2897308530c64e1abf44ebc1ee0e017ada)
* radeonsi: eliminate PS color outputs when colormask kills them (Marek Olšák, 2017-09-04, 3 files, -0/+6)
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: sort DBG shader flags according to pipe_shader_type (Marek Olšák, 2017-09-04, 4 files, -35/+17)
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: ensure cache flushes happen before SET_PREDICATION packets (Nicolai Hähnle, 2017-09-04, 3 files, -9/+18)
| | | | | | | | The data is read when the render_cond_atom is emitted, so we must delay emitting the atom until after the flush. Fixes: 0fe0320dc074 ("radeonsi: use optimal packet order when doing a pipeline sync") Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix ARB_transform_feedback_overflow_query on <= VI (Nicolai Hähnle, 2017-09-04, 3 files, -1/+12)
| | | | | | | | The result written by the shader workaround needs to be written back, or the CP may read stale data. Fixes: 78476cfe071a ("radeonsi: enable ARB_transform_feedback_overflow_query") Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix compute shader state dumping (Nicolai Hähnle, 2017-09-04, 1 file, -6/+11)
| | | | | Fixes: 420c438589c8 ("radeonsi: log draw and compute state into log context") Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add an assertion that only two-dimensional constant references are used (Nicolai Hähnle, 2017-09-04, 1 file, -2/+3)
| | | | | | | | v2: remove some redundant checks Acked-by: Roland Scheidegger <[email protected]> (v1) Tested-by: Dieter Nützel <[email protected]> (v1) Reviewed-by: Timothy Arceri <[email protected]>