Commit message (Author, Age, Files, Lines)
* llvmpipe, tgsi: hook up dx10 gather4 opcode (Roland Scheidegger, 2017-09-07, 2 files, -8/+25)
| | | | | | | | | Trivial. We already support tg4 for legacy tex opcodes, so the actual texture sampling code already handles it. (Just like TG4, we don't handle additional capabilities and always sample red channel.) Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe, draw: increase shader cache limits (Roland Scheidegger, 2017-09-07, 2 files, -4/+2)
| | | | | | | | | | | | | | | We're not particularly concerned with memory usage, if the tradeoff is shader recompiles. And it's common for apps to have a lot of shaders nowadays (and, since our shaders include a LOT of context state of course we may create quite a bit more shaders even). So quadruple the amount of shaders draw will cache (from 128 to 512). For llvmpipe (fs shaders) quadruple the number of instructions, keep the number of variants the same for now (only with very simple, non-texturing shaders the variant limit could really be reached), and simplify the definition, it's probably easier to just have one different definition per branch... Reviewed-by: Jose Fonseca <[email protected]>
* ac/surface: reduce gfx9_surface_layout size. (Dave Airlie, 2017-09-07, 1 file, -2/+3)
| | | | | | 152->144. Signed-off-by: Dave Airlie <[email protected]>
* radv: reduce radv_amdgpu_winsys struct size. (Dave Airlie, 2017-09-07, 1 file, -3/+3)
| | | | | | 1168->1160. Signed-off-by: Dave Airlie <[email protected]>
* radv: reduce radv_image struct size. (Dave Airlie, 2017-09-07, 1 file, -3/+2)
| | | | | | 1480->1472. Signed-off-by: Dave Airlie <[email protected]>
* radv: reduce radv_shader_variant struct size. (Dave Airlie, 2017-09-07, 1 file, -1/+1)
| | | | | | 544->536 Signed-off-by: Dave Airlie <[email protected]>
* radv: reduce radv_cmd_state struct size. (Dave Airlie, 2017-09-07, 1 file, -2/+2)
| | | | | | 1632->1624. Signed-off-by: Dave Airlie <[email protected]>
* radv: reduce meta_saved_state struct size. (Dave Airlie, 2017-09-07, 1 file, -4/+3)
| | | | | | 904->896. Signed-off-by: Dave Airlie <[email protected]>
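An 8-byte saving like those in the struct-size commits above is often obtained by reordering members so that less alignment padding is needed. The log does not show the actual member changes, so the following is only a minimal sketch with hypothetical structs (`padded_state`, `compact_state`), not the real radv layouts:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout: two 1-byte members separated by an 8-byte member.
// On common 64-bit ABIs each small member is followed by 7 bytes of padding.
struct padded_state {
    uint8_t  flag_a;
    uint64_t address;
    uint8_t  flag_b;
};

// Same members, largest first: the two small members share one padded tail.
struct compact_state {
    uint64_t address;
    uint8_t  flag_a;
    uint8_t  flag_b;
};

// Exact sizes are ABI-dependent; only the ordering relation is checked here.
static_assert(sizeof(compact_state) <= sizeof(padded_state),
              "reordering should never make the struct larger");

// Bytes saved by the reordering on the current platform.
constexpr std::size_t bytes_saved()
{
    return sizeof(padded_state) - sizeof(compact_state);
}
```

A tool such as pahole (mentioned in the nir commit below) reports the holes that make this kind of reordering worthwhile.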
* nir: put compact into bitfields in nir_variable_data (Dave Airlie, 2017-09-07, 1 file, -1/+1)
| | | | | | | | | | This being declared bool means it won't get merged with the previous bitfields, this seems like an oversight rather than deliberate. Noticed when running pahole. Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
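The effect described in this commit can be illustrated with a small sketch. The structs and field widths below are hypothetical and are not the real `nir_variable_data` layout; they only show that a plain `bool` is a separate object and cannot share the storage unit of the preceding bit-fields, while a 1-bit field can:

```cpp
#include <cstddef>

// Hypothetical: 31 bits of bit-fields followed by a plain bool.
// The bool needs its own byte, which on common ABIs grows the struct.
struct with_bool {
    unsigned mode     : 16;
    unsigned location : 15;
    bool     compact;
};

// Same data, but the flag is a 1-bit field that fits in the remaining bit.
struct with_bitfield {
    unsigned mode     : 16;
    unsigned location : 15;
    unsigned compact  : 1;
};

// Bit-field layout is implementation-defined, so only the relation is checked.
static_assert(sizeof(with_bitfield) <= sizeof(with_bool),
              "the bit-field variant should not be larger");
```

On typical GCC/Clang x86-64 builds one would expect 4 bytes versus 8 bytes here, but that is an assumption about the ABI rather than a guarantee.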
* anv: Annotate entrypoint table with index and func name (Chad Versace, 2017-09-06, 1 file, -2/+2)
| | | | | | This helps when debugging a broken entrypoint table. Reviewed-by: Jason Ekstrand <[email protected]>
* radeon/uvd: fix the assertion check for YUYV format (Leo Liu, 2017-09-06, 1 file, -3/+5)
| | | | | | | Fixes:7319ff87("radeon/uvd: add YUYV format support for target buffer") Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Christian König <[email protected]>
* intel: Add brand string for KBL-R (Anuj Phogat, 2017-09-06, 1 file, -1/+1)
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel: Remove unused device info for KBL GT1.5 (Anuj Phogat, 2017-09-06, 1 file, -11/+0)
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel: Change a KBL pci id to GT2 from GT1.5 (Anuj Phogat, 2017-09-06, 1 file, -1/+1)
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel: Fix few KBL brand strings (Anuj Phogat, 2017-09-06, 1 file, -2/+2)
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel: Remove unused Kabylake pci ids (Anuj Phogat, 2017-09-06, 1 file, -7/+0)
| | | | | | | These PCI IDs are not used in any Kabylake SKUs. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* Revert "Android: add -Wno-date-time flag for clang" (Emil Velikov, 2017-09-06, 1 file, -1/+0)
| | | | | | | | This reverts commit 6dae9176d60d12de61aa03906c44f81e20ef7622. No longer needed as of last commit. Cc: Rob Herring <[email protected]>
* mesa: replace date/time macros with MESA_GIT_SHA1 (Emil Velikov, 2017-09-06, 1 file, -3/+7)
| | | | | | | | | | Former is non-deterministic, results in non-reproducible builds and compilers throw a warning about it. Cc: Rob Herring <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
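The idea behind this change can be sketched as follows. `__DATE__` and `__TIME__` expand to the moment of compilation, so two builds of the same source differ, whereas a revision string stays the same for the same checkout. The fallback value and the helper `build_stamp_reproducible()` are hypothetical and only serve the illustration; the commit does not show where the macro is used:

```cpp
#include <string>

// Assumption: the build system normally provides MESA_GIT_SHA1.
// The fallback below is a made-up placeholder for this sketch.
#ifndef MESA_GIT_SHA1
#define MESA_GIT_SHA1 "git-0123abc"
#endif

// Hypothetical helper: identifies the build by source revision instead of
// by __DATE__/__TIME__, so rebuilding the same tree yields the same string.
inline std::string build_stamp_reproducible()
{
    return std::string("build ") + MESA_GIT_SHA1;
}
```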
* mesa: don't use %s for PACKAGE_VERSION macro (Emil Velikov, 2017-09-06, 2 files, -4/+4)
| | | | | | | | | | | | The macro itself is a well defined string, which cannot cause issues with printf or other printf-like functions. All other places through Mesa already use it directly, so let's update the final two instances. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
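Because the macro expands to a string literal, it can be joined to adjacent literals at compile time instead of being passed through a `%s` conversion. A minimal sketch, assuming a made-up version value and a hypothetical helper `version_banner()`:

```cpp
#include <string>

// Assumption: PACKAGE_VERSION is normally defined by the build system.
// The value below is a placeholder for this sketch only.
#ifndef PACKAGE_VERSION
#define PACKAGE_VERSION "17.3.0-devel"
#endif

// Adjacent string literals are concatenated by the compiler, so no
// printf-style "%s" argument is needed to embed the version.
inline const char *version_banner()
{
    return "Mesa " PACKAGE_VERSION;
}
```

This only works while the macro stays a plain literal, which is the premise stated in the commit message.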
* docs/release-calendar: update and extend (Emil Velikov, 2017-09-06, 1 file, -16/+16)
| | | | | | | | | | | v2: Correct 17.1.10 version, adjust some names. v3: Add missing <tr> (Andres) Cc: Juan A. Suárez <[email protected]> Cc: Andres Gomez <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Andres Gomez <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> (v1)
* docs/releasing: polish LLVM_CONFIG wording/handling (Emil Velikov, 2017-09-06, 1 file, -5/+8)
| | | | | | | | | | | | | Use consistent way to manage "non-default" llvm installations, clearly documenting it. AKA, use LLVM_CONFIG throughout and unset for the Windows/mingw builds. v2: unset the save_ variable (Andres) Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Andres Gomez <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> (v1)
* docs/releasing: remove -jX instances (Emil Velikov, 2017-09-06, 1 file, -2/+3)
| | | | | | | | | One can control the number of jobs via MAKEFLAGS. As such there's little reason to set the number of jobs for each make invocation. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Andres Gomez <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* .gitignore: list *.orig and *.rej (Emil Velikov, 2017-09-06, 1 file, -0/+2)
| | | | | | | | Should prevent accidental check-in of patch artefacts. Suggested-by: Mike Lothian <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* egl/x11: advertise __DRI_USE_INVALIDATE for DRI2 (Emil Velikov, 2017-09-06, 1 file, -0/+1)
| | | | | | | | | | | | | | | | | | | Back in 2012 (commit 1e7776ca2bc - egl: Remove bogus invalidate code.) the loader use of invalidate() was purged as "bogus". One of the factors defining that statement was the lack of the loader-side invalidate extension - __DRI_USE_INVALIDATE. Since then the commit was reverted (commit eed0a80137d - egl: Restore "bogus" DRI2 invalidate event code.), always performing the driver invalidate call, although the loader was never updated to expose the extension. Do so, allowing the driver to do fine-grained tuning. Cc: Eric Anholt <[email protected]> Cc: Kenneth Graunke <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* egl/x11/dri3: adding missing __DRI_BACKGROUND_CALLABLE extension (Emil Velikov, 2017-09-06, 1 file, -0/+1)
| | | | | | | Fixes: 3b7b6adf3ac ("egl: Implement __DRI_BACKGROUND_CALLABLE") Cc: Timothy Arceri <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* i965: expose RGBA visuals only on Android (Emil Velikov, 2017-09-06, 1 file, -1/+22)
| | | | | | | | | | | | | | | | | | | | | As Marek pointed out in earlier commit - exposing RGBA on other platforms introduces ~500 Visuals, which are not tested. Note that this does not quite happen, yet. Reason being that the GLX code does not check the masks - see scalarEqual(). Thus as we fix that, we'll run into the issue described. v2: Rebase, while keeping loaderPrivate v3: Beef-up commit message, getCapability() returns unsigned (Tapani) Fixes: 1bf703e4ea5 ("dri_interface,egl,gallium: only expose RGBA visuals on Android") Cc: Tomasz Figa <[email protected]> Cc: Chad Versace <[email protected]> Cc: Marek Olšák <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* swr/rast: FE/Clipper - unify SIMD8/16 functions using simdlib types (Tim Rowley, 2017-09-06, 3 files, -1189/+446)
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Remove use of C++14 template variable (Tim Rowley, 2017-09-06, 2 files, -6/+14)
| | | | | | SWR rasterizer must remain C++11 compliant. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: SIMD16 FE remove templated immediates workaround (Tim Rowley, 2017-09-06, 1 file, -90/+20)
| | | | | | Fixed properly in gcc-compatible fashion. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: SIMD16 PA - rename Assemble_simd16 to Assemble (Tim Rowley, 2017-09-06, 3 files, -31/+15)
| | | | | | For consistency and to support overloading. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: FE/Binner - unify SIMD8/16 functions using simdlib types (Tim Rowley, 2017-09-06, 5 files, -1739/+696)
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Removed some trailing whitespace caught during review (Tim Rowley, 2017-09-06, 3 files, -10/+10)
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: set caps for VB 4-byte alignment (Tim Rowley, 2017-09-06, 1 file, -3/+6)
| | | | | | | | | | Needed to compensate for change to fetch jit requiring alignment. Fixes regressions in piglit: vertex-buffer-offsets and about another hundred of the vs-input*byte* tests. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Allow gather of floats from fetch shader with 2-4GB offsets (Tim Rowley, 2017-09-06, 2 files, -1/+7)
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* radv: fix error code when resizing the upload BO (Samuel Pitoiset, 2017-09-06, 1 file, -1/+1)
| | | | | | | malloc() failures are unrelated to the device memory. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* mesa/st/st_glsl_to_tgsi_temprename.cpp: Fix compilation with MSVC (Gert Wollny, 2017-09-06, 1 file, -1/+9)
| | | | | | | | | | If <windows.h> is included then max is a macro that clashes with std::numeric_limits::max, hence undefine it. For some reason the struct access_record is not recognized outside the anonymous namespace, so make it a class. The patch was tested successfully on AppVeyor. Reviewed-by: Nicolai Hähnle <[email protected]>
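The macro clash can be reproduced without `<windows.h>` by defining a stand-in `max` macro. The sketch below shows two common ways around it: parenthesizing the name so the function-like macro is not expanded, and undefining the macro, which is the approach this commit describes. The helper names are hypothetical:

```cpp
#include <climits>
#include <limits>

// Stand-in for the function-like macro that <windows.h> defines
// (assumption: NOMINMAX is not set). Defined after the standard headers
// so that it cannot interfere with them.
#define max(a, b) (((a) > (b)) ? (a) : (b))

// Option 1: parentheses around the name stop the macro from being invoked,
// because the token after "max" is ")" rather than "(".
inline int int_max_parenthesized()
{
    return (std::numeric_limits<int>::max)();
}

// Option 2 (the approach named in the commit message): drop the macro.
#ifdef max
#undef max
#endif

inline int int_max_after_undef()
{
    return std::numeric_limits<int>::max();
}
```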
* mesa/st: glsl_to_tgsi: tie in new temporary register merge approach (Gert Wollny, 2017-09-06, 1 file, -50/+16)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch replaces the old register lifetime estimation and rename mapping evaluation with the new one. To compare the current and the new implementation, performance was measured by running the shader-db in one thread.

-----------------------------------------------------------
                        old            new(std::sort)
---------------- time ./run -j1 shaders --------------------
real                    5.80s          5.75s
user                    5.75s          5.70s
sys                     0.05s          0.05s
---- valgrind --tool=callgrind --dump-instr=yes ------------
merge                   0.08%          0.18%
estimate lifetime       0.02%          0.11%
evaluate mapping        (incl=0.3%)    0.04%
apply mapping           0.03%          0.02%
--- perf (approximate because of statistic sampling) ----
merge (total)           0.09%          0.16%
estimate lifetime       0.03%          0.10%
evaluate mapping        (incl=0.02%)   0.04%
apply mapping           0.04%          0.04%

Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/st: glsl_to_tgsi: Add test set for evaluation of rename mapping (Gert Wollny, 2017-09-06, 1 file, -0/+169)
| | | | | | | The patch adds tests for the register rename mapping evaluation and combined life time estimation and renaming. Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/st: glsl_to_tgsi: add register rename mapping evaluator (Gert Wollny, 2017-09-06, 3 files, -5/+137)
| | | | | | | | | | | | | | | The remapping evaluator first sorts the temporary registers ascending based on their first life time instruction, and then uses a binary search to find merge candidates. For the initial sorting it uses std::sort because qsort is quite slow in comparison. By removing the define USE_STL_SORT in src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp one can enable the alternative code path that uses qsort. Registers that are not written to are not considered for renaming since in glsl_to_tgsi_visitor::renumber_registers they are eliminated anyway. Reviewed-by: Nicolai Hähnle <[email protected]>
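The sort-then-binary-search idea from this commit message can be sketched roughly as below. This is not the code from st_glsl_to_tgsi_temprename.cpp: the `Lifetime` struct, the function name, and the merge rule (a register may reuse another one when its lifetime begins at or after the other's end) are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record: register index plus first/last live instruction.
struct Lifetime {
    int reg;
    int begin;
    int end;
};

// Returns mapping[r] = register that r is renamed to (r itself if unchanged).
std::vector<int> evaluate_rename_mapping(std::vector<Lifetime> regs, int num_regs)
{
    std::vector<int> mapping(num_regs);
    for (int i = 0; i < num_regs; ++i)
        mapping[i] = i;

    // Step 1: sort ascending by the first instruction of each lifetime.
    std::sort(regs.begin(), regs.end(),
              [](const Lifetime &a, const Lifetime &b) { return a.begin < b.begin; });

    std::vector<bool> merged(regs.size(), false);
    for (std::size_t t = 0; t < regs.size(); ++t) {
        if (merged[t])
            continue;
        std::size_t start = t + 1;
        for (;;) {
            // Step 2: binary search for the first register whose lifetime
            // begins at or after the end of the current target's lifetime.
            auto it = std::lower_bound(
                regs.begin() + start, regs.end(), regs[t].end,
                [](const Lifetime &r, int bound) { return r.begin < bound; });
            while (it != regs.end() && merged[it - regs.begin()])
                ++it;                       // skip registers already merged
            if (it == regs.end())
                break;
            assert(it->reg >= 0 && it->reg < num_regs);
            mapping[it->reg] = regs[t].reg; // rename candidate to the target
            regs[t].end = it->end;          // target now lives until candidate's end
            merged[it - regs.begin()] = true;
            start = static_cast<std::size_t>(it - regs.begin()) + 1;
        }
    }
    return mapping;
}
```

For example, with lifetimes r0=[0,5], r1=[2,8], r2=[5,9], r3=[9,12], this sketch folds r2 and r3 into r0 and leaves r1 alone, because r1 overlaps r0 and nothing free remains after it.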
* mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker (Gert Wollny, 2017-09-06, 6 files, -4/+1483)
| | | | | | This patch adds a set of unit tests for the new lifetime tracker. Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/st: glsl_to_tgsi: implement new temporary register lifetime tracker (Gert Wollny, 2017-09-06, 3 files, -0/+943)
| | | | | | | | | | | | | | | | | | This patch adds a class for tracking the life times of temporary registers in the glsl to tgsi translation. The algorithm runs in three steps: First, in order to minimize the number of needed memory allocations, the program is scanned to evaluate the number of scopes. Then, the program is scanned a second time to record the important register access time points: first and last reads and writes and their link to the execution scope (loop, if/else branch, switch case). In the third step, the actual minimal life time is evaluated for each register. In addition, when compiled in debug mode (i.e. NDEBUG is not defined) the shaders and estimated temporary life times can be logged to stderr by setting the environment variable GLSL_TO_TGSI_RENAME_DEBUG. Reviewed-by: Nicolai Hähnle <[email protected]>
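The debug switch mentioned at the end of this message follows a common pattern: an environment variable that is only consulted in builds without NDEBUG. A minimal sketch of that pattern is shown below. The helper name is hypothetical, and treating any set value as "enabled" is an assumption, since the message does not say how the variable's value is interpreted:

```cpp
#include <cstdlib>

// Hypothetical helper: true when lifetime logging should be written to stderr.
inline bool rename_debug_enabled()
{
#ifndef NDEBUG
    // Debug build: logging is opt-in through the environment.
    return std::getenv("GLSL_TO_TGSI_RENAME_DEBUG") != nullptr;
#else
    // Release build: the logging path is compiled out entirely.
    return false;
#endif
}
```

With such a helper, a debug build would be run as `GLSL_TO_TGSI_RENAME_DEBUG=1 ./app` to get the extra output.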
* mesa/st: glsl_to_tgsi move some helper classes to extra files (Gert Wollny, 2017-09-06, 4 files, -287/+368)
| | | | | | | | | | | | | | | | | | | | | To prepare the implementation of a temp register lifetime tracker some of the classes are moved into separate header/implementation files to make them accessible from other files. Specifically these are:
    class st_src_reg;
    class st_dst_reg;
    class glsl_to_tgsi_instruction;
    struct rename_reg_pair;
    int swizzle_for_type(const glsl_type *type, int component);
as inline:
    bool is_resource_instruction(unsigned opcode);
    unsigned num_inst_dst_regs(const glsl_to_tgsi_instruction *op);
    unsigned num_inst_src_regs(const glsl_to_tgsi_instruction *op);
Reviewed-by: Nicolai Hähnle <[email protected]>
* st_glsl_to_tgsi: rewrite rename registers to use array fully. (Dave Airlie, 2017-09-06, 1 file, -29/+26)
| | | | | | | | | | Instead of having to search the whole array, just use the whole thing and store a valid bit in there with the rename. Removes this from the profile on some of the fp64 tests Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
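The change described here replaces a search through a list of renames with a table indexed directly by the old register number, where each slot carries a valid bit. A rough sketch of that data structure follows; `RenameEntry` and `apply_renames` are hypothetical names and do not reproduce the actual Mesa code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical table slot: one per temporary register, indexed by old index.
struct RenameEntry {
    bool valid = false;  // set only when this register is actually renamed
    int  new_reg = 0;
};

// Rewrites register operands in place. Each lookup is a single array access
// instead of a scan over all recorded renames.
void apply_renames(std::vector<int> &operands, const std::vector<RenameEntry> &renames)
{
    for (int &reg : operands) {
        assert(reg >= 0);
        const std::size_t idx = static_cast<std::size_t>(reg);
        if (idx < renames.size() && renames[idx].valid)
            reg = renames[idx].new_reg;
    }
}
```

The table costs one slot per register whether or not it is renamed, which is the trade the commit message describes as "just use the whole thing".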
* radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug (Nicolai Hähnle, 2017-09-06, 5 files, -24/+85)
| | | | | | | | | | | | | | | | | | | When the HS wave is empty, the hardware writes the LS VGPRs starting at v0 instead of v2. Workaround by shifting them back into place when necessary. For simplicity, this is always done in the LS prolog. According to the hardware team, this will be fixed in future chips, so take that into account already. Note that this is not a bug fix, as the bug was already worked around by commit 166823bfd26 ("radeonsi/gfx9: add a temporary workaround for a tessellation driver bug"). This change merely replaces the workaround by one that should be better. v2: add workaround code to shader only when necessary v3: clarify the prefer_mono comment Reviewed-by: Marek Olšák <[email protected]>
* ac/debug: take ASIC generation into account when printing registers (Nicolai Hähnle, 2017-09-06, 2 files, -107/+177)
| | | | | | | | | | | There were some overlapping changes in gfx9 especially in the CB/DB blocks which made register dumps rather misleading. The split is along the lines of the header files, so we'll print VI-only fields on SI and CI, for example, but we won't print GFX9 fields on SI/CI/VI, and we won't print SI/CI/VI fields on GFX9. Acked-by: Marek Olšák <[email protected]>
* amd/common: pass chip_class to ac_dump_reg (Nicolai Hähnle, 2017-09-06, 3 files, -60/+75)
| | | | Acked-by: Marek Olšák <[email protected]>
* ac/sid_tables: add FieldTable object (Nicolai Hähnle, 2017-09-06, 1 file, -30/+85)
| | | | | | | | Automatically re-use table entries like StringTable and IntTable do. This allows us to get rid of the "fields_owner" logic, and simplifies the next change. Acked-by: Marek Olšák <[email protected]>
* ac/sid_tables: remove unused variable varname_values (Nicolai Hähnle, 2017-09-06, 1 file, -1/+0)
| | | | Acked-by: Marek Olšák <[email protected]>
* radeonsi/gfx9: always flush DB metadata on framebuffer changes (Nicolai Hähnle, 2017-09-06, 3 files, -4/+14)
| | | | | | | This fixes GL45-CTS.shader_image_load_store.basic-glsl-earlyFragTests. Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>
* util/ralloc: set prev-pointers correctly in ralloc_adopt (Nicolai Hähnle, 2017-09-06, 1 file, -1/+3)
| | | | | | | | | | | | Found by inspection. I'm not aware of any actual failures caused by this, but a precise sequence of ralloc_adopt and ralloc_free should be able to cause problems. v2: make the code slightly clearer (Eric) Reviewed-by: Eric Engestrom <[email protected]>
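The class of bug described here arises when two doubly linked sibling lists are spliced together and only the `next` pointer at the join is updated. The sketch below uses a hypothetical `Node` type and is not the ralloc implementation; it only shows where the `prev` pointer has to be set when one parent adopts another parent's children:

```cpp
#include <cassert>

// Hypothetical tree node with a doubly linked list of siblings.
struct Node {
    Node *parent = nullptr;
    Node *child  = nullptr;  // head of this node's child list
    Node *prev   = nullptr;
    Node *next   = nullptr;
};

// Pushes n at the front of parent's child list.
void add_child(Node *parent, Node *n)
{
    assert(parent && n);
    n->parent = parent;
    n->prev = nullptr;
    n->next = parent->child;
    if (n->next)
        n->next->prev = n;
    parent->child = n;
}

// Moves all children of old_parent to the front of new_parent's child list.
void adopt(Node *new_parent, Node *old_parent)
{
    assert(new_parent && old_parent);
    Node *last = nullptr;
    for (Node *n = old_parent->child; n; n = n->next) {
        n->parent = new_parent;
        last = n;
    }
    if (!last)
        return;  // nothing to adopt

    // Join the lists. The prev pointer of the old head must be updated too,
    // otherwise a later unlink of that node walks a stale pointer.
    last->next = new_parent->child;
    if (last->next)
        last->next->prev = last;

    new_parent->child = old_parent->child;
    old_parent->child = nullptr;
}
```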