summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* winsys/radeon: drop support for DRM 2.12.0 (kernel < 3.2)Marek Olšák2016-03-011-15/+1
| | | | | | | | | | | | | | in order to make some winsys interface changes easier This distros should use new DRM if they want to use new Mesa: Distro kernel mesa eol SLES 10 2.6.16 6.4.2 2016-07 SLED 11 3.0 9.0.3 2022-03 RHEL 5 2.6.18 6.5.1 2017-03 RHEL 6 2.6.32 10.4.3 2020-11 Debian 6 2.6.32 7.7.1 2016-02 Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: also dump shaders on a VM faultMarek Olšák2016-03-011-2/+1
| | | | Reviewed-by: Christian König <[email protected]>
* radeonsi: dump full shader disassemblies into ddebug logsMarek Olšák2016-03-011-9/+9
| | | | | | including prolog and epilog disassemblies Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: allow dumping shader disassemblies to a fileMarek Olšák2016-03-013-22/+29
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: use re-ZMarek Olšák2016-03-013-6/+21
| | | | | | | | | This can increase perf for shaders that kill pixels (kill, alpha-test, alpha-to-coverage). v2: add comments Reviewed-by: Michel Dänzer <[email protected]>
* r600: Make enum alu_op_flags unsignedRob Herring2016-02-291-8/+8
| | | | | | | | | | | | | | | | | | In builds with clang, there are several errors related to the enum alu_op_flags like this: src/gallium/drivers/r600/sb/sb_expr.cpp:887:8: error: case value evaluates to -1610612736, which cannot be narrowed to type 'unsigned int' [-Wc++11-narrowing] These are due to the MSB being set in the enum. Fix these errors by making the enum values unsigned as needed. The flags field that stores this enum also needs to be unsigned. Cc: "11.1 11.2" <[email protected]> Cc: Marek Olšák <[email protected]> Signed-off-by: Rob Herring <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: Add space between string literal and identifierRob Herring2016-02-291-1/+1
| | | | | | | | | | | | | Fix compiles with clang that have this C++11 error: src/gallium/drivers/radeon/r600_pipe_common.h:662:34: error: invalid suffix on literal; C++11 requires a space between literal and identifier [-Wreserved-user-defined-literal] Cc: "11.1 11.2" <[email protected]> Cc: Marek Olšák <[email protected]> Signed-off-by: Rob Herring <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* freedreno: drop unnecessary -Wno-packed-bitfield-compatRob Herring2016-02-291-2/+2
| | | | | | | | | | | | Enabling this warning doesn't generate any warnings with gcc, but is an unknown option for clang, so drop it. Signed-off-by: Rob Herring <[email protected]> Acked-by: Rob Clark <[email protected]> (v1) Cc: "11.1 11.2" <[email protected]> v2: keep the warning around, commented out Signed-off-by: Emil Velikov <[email protected]>
* Android: fix build break from nir/glsl move to compiler/Rob Herring2016-02-293-4/+7
| | | | | | | | | | | | | | | | Commits a39a8fbbaa12 ("nir: move to compiler/") and eb63640c1d38 ("glsl: move to compiler/") broke Android builds. Fix them. There is also a missing dependency between generated NIR headers and several libraries. This isn't a new issue, but seems to have been exposed by the NIR move. Built with i915, i965, freedreno, r300g, r600g, vc4, and virgl enabled. Cc: "11.2" <[email protected]> Cc: Mauro Rossi <[email protected]> Signed-off-by: Rob Herring <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium/radeon: disable evergreen_do_fast_color_clear for BEOded Gabbay2016-02-291-0/+5
| | | | | | | | | | | | | | | | This function is currently broken for BE. I assume it's because of util_pack_color(). Until I fix this path, I prefer to disable it so users would be able to see correct colors on their desktop and applications. Together with the two following patches: - gallium/r600: Don't let h/w do endian swap for colorformat - gallium/radeon: remove separate BE path in r600_translate_colorswap it fixes BZ#72877 and BZ#92039 Signed-off-by: Oded Gabbay <[email protected]> Cc: "11.1 11.2" <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/r600: Don't let h/w do endian swap for colorformatOded Gabbay2016-02-291-0/+7
| | | | | | | | | | | | | | Since the rework on gallium pipe formats, there is no more need to do endian swap of the colorformat in the h/w, because the conversion between mesa format and gallium (pipe) format takes endianess into account (see the big #if in p_format.h). v2: return ENDIAN_NONE only for four 8-bits components (V_0280A0_COLOR_8_8_8_8) Signed-off-by: Oded Gabbay <[email protected]> Cc: "11.1 11.2" <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: remove separate BE path in r600_translate_colorswapOded Gabbay2016-02-291-12/+1
| | | | | | | | | | | | | | | | | | | | After further testing, it appears there is no need for separate BE path in r600_translate_colorswap() The only fix remaining is the change of the last if statement, in the 4 channels case. Originally, it contained an invalid swizzle configuration that never got hit, in LE or BE. So the fix is relevant for both systems. This patch adds an additional 120 available visuals for LE and BE, as seen in glxinfo v2: Tested for regressions by running piglit gpu.py with CAICOS (r600g) on x86-64 machine. No regressions found. Signed-off-by: Oded Gabbay <[email protected]> Cc: "11.1 11.2" <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nv50/ir: emit VOTE instructionSamuel Pitoiset2016-02-286-0/+83
| | | | | | | | Changes from v2: - add missing NOT modifier for GK110/GM107 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gk110/ir: add ld lock/st unlock emissionSamuel Pitoiset2016-02-281-2/+28
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: bump minimum texture buffer offset alignmentIlia Mirkin2016-02-272-2/+2
| | | | | | | | It appears that it actually needs to be aligned to the datum size, so it was 1 when testing with R8, but it can be as high as 16 with RGBA32. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nvc0: rework nvc0_compute_validate_program()Samuel Pitoiset2016-02-266-44/+20
| | | | | | | | | | Reduce the amount of duplicated code by re-using nvc0_program_validate(). While we are at it, change the prototype to return void and remove nvc0_compute.h which is now useless. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Pierre Moreau <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nvc0: make sure to validate compute global buffers on FermiSamuel Pitoiset2016-02-261-1/+3
| | | | | | | | | No reason to not validate those global buffers and this might avoid fails if someone try to use the global memory from compute programs. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Pierre Moreau <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nvc0: move nvc0_validate_global_residents() to nvc0_compute.cSamuel Pitoiset2016-02-264-19/+17
| | | | | | | | | While we are at it, rename it to nvc0_compute_validate_globals() and update its prototype. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Pierre Moreau <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* virgl: add missing CAP turned off.Dave Airlie2016-02-261-0/+3
|
* gallium/radeon: return correct values for BE in r600_translate_colorswapOded Gabbay2016-02-251-4/+4
| | | | | | | | | | | | | | | Because I changed the swizzle check, I also need to adapt the return values for each check. It's basically almost the same as before, we just cross between STD and STD_REV, and cross between ALT and ALT_REV This fixes the rgba test in gl-1.0-readpixsanity (piglit) and also fixes tri-flat (mesa demos). Signed-off-by: Oded Gabbay <[email protected]> Cc: "11.1 11.2" <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: Correctly translate colorswaps for big endianOded Gabbay2016-02-231-0/+11
| | | | | | | | | | | | | | | | | | | | | | The current code in r600_translate_colorswap uses the swizzle information to determine which colorswap to use. This works for BE & LE when the nr_channels is <4, but when nr_channels==4 (e.g. PIPE_FORMAT_A8R8G8B8_UNORM), this method can not be used for both BE and LE, because the swizzle info is the same for both of them. As a result, r600g doesn't support 24bit color formats, only 16bit, which forces the user to choose 16bit color in X server. This patch fixes this bug by separating the checks for LE and BE and adapting the swizzle conditions in the BE part of the checks. Tested on an Evergreen GPU (Cedar GL FirePro 2270) running inside POWER7 Big-Endian Machine. Signed-off-by: Oded Gabbay <[email protected]> CC: "11.2" "11.1" <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nvc0: rename 3d binding points to NVC0_BIND_3D_XXXSamuel Pitoiset2016-02-229-63/+64
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: rename 3d dirty flags to NVC0_NEW_3D_XXXSamuel Pitoiset2016-02-228-133/+133
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: prefix compute macros with _CP_ instead of _COMPUTE_Samuel Pitoiset2016-02-224-4/+4
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: rename NVXX_COMPUTE to NVXX_CPSamuel Pitoiset2016-02-225-117/+117
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: rename nvc0_context::dirty to nvc0_context::dirty_3dSamuel Pitoiset2016-02-228-64/+64
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add missing emission of locked load predicateSamuel Pitoiset2016-02-221-0/+7
| | | | | | | | | Like unlocked store on shared memory, locked store can fail and the second dest which is a predicate must be emitted. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* nvc0/ir: add ld lock/st unlock emission on GK104Samuel Pitoiset2016-02-221-10/+25
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: restore OP_SELP to be a regular instructionSamuel Pitoiset2016-02-224-14/+14
| | | | | | | | | | | Actually OP_SELP doesn't need to be a compare instruction. Instead we just need to set the NOT modifier when building the instruction. While we are at it, fix the dst register type and use a GPR. Suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* svga: unbind index buffer when drawing non-indexed primitivesBrian Paul2016-02-221-0/+10
| | | | | | | | | Silences a warning reported by the svga3d device. v2: also null-out the index buffer pointer Reviewed-by: Sinclair Yeh <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* nouveau: update the Makefile.sources list11.2-branchpointEmil Velikov2016-02-221-2/+3
| | | | | | Reflect the nv50->g80 change and the new gm107_texture header. Signed-off-by: Emil Velikov <[email protected]>
* radeonsi: implement binary shaders & shader cache in memory (v2)Marek Olšák2016-02-215-7/+259
| | | | | | | v2: handle _mesa_hash_table_insert failure other cosmetic changes Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove unused radeon_shader_binary_free_* functionsMarek Olšák2016-02-212-33/+0
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: make radeon_shader_reloc name string fixed-sizedMarek Olšák2016-02-212-6/+3
| | | | | | This will simplify implementations of binary shaders. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move some struct si_shader members to new struct si_shader_infoMarek Olšák2016-02-213-68/+71
| | | | | | This will be part of shader binaries. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use smaller types for some si_shader membersMarek Olšák2016-02-212-3/+8
| | | | | | | | in order to decrease the shader size for a shader cache. v2: add & use SI_MAX_VS_OUTPUTS Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: enable compiling one variant per shaderMarek Olšák2016-02-213-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Shader stats from VERDE: Default scheduler: Totals: SGPRS: 491272 -> 488672 (-0.53 %) VGPRS: 289980 -> 311093 (7.28 %) Code Size: 11091656 -> 11219948 (1.16 %) bytes LDS: 97 -> 97 (0.00 %) blocks Scratch: 1732608 -> 2246656 (29.67 %) bytes per wave Max Waves: 78063 -> 77352 (-0.91 %) Wait states: 0 -> 0 (0.00 %) Looking at some of the worst regressions, I get: - The VGPR increase seems to be caused by the fact that if PS has used less than 16 VGPRs, now it will always use 16 VGPRs and sometimes even 20. However, the wave count remains at 10 if VGPRs <= 24, so no harm there. - The scratch increase seems to be caused by SGPR spilling. The unnecessary SGPR spilling has been an ongoing issue with the compiler and it's completely fixable by rematerializing s_loads or reordering instructions. SI scheduler: Totals: SGPRS: 374848 -> 374576 (-0.07 %) VGPRS: 284456 -> 307515 (8.11 %) Code Size: 11433068 -> 11535452 (0.90 %) bytes LDS: 97 -> 97 (0.00 %) blocks Scratch: 509952 -> 522240 (2.41 %) bytes per wave Max Waves: 79456 -> 78217 (-1.56 %) Wait states: 0 -> 0 (0.00 %) VGPRs - same story as before. The SI scheduler doesn't spill SGPRs so much and generally spills way less than the default scheduler. (522240 spills vs 2246656 spills) Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: print full shader name before disassemblyMarek Olšák2016-02-211-1/+33
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: compile non-GS middle parts of shaders immediately if enabledMarek Olšák2016-02-213-18/+87
| | | | | | | | | | | | | | Still disabled. Only prologs & epilogs are compiled in draw calls, but each variant of those is compiled only once per process. VS is always compiled as hw VS. TES is always compiled as hw VS. LS and ES stages are always compiled on demand. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: rework polygon stippling for PS prologMarek Olšák2016-02-211-39/+110
| | | | | | Don't use the pstipple module. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add PS prologMarek Olšák2016-02-215-2/+345
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add PS epilogMarek Olšák2016-02-214-2/+297
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add TCS epilogMarek Olšák2016-02-214-13/+155
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add VS epilogMarek Olšák2016-02-214-11/+171
| | | | | | | | | It only exports the primitive ID. Also used by TES when it's compiled as VS. The VS input location of the primitive ID input is v2. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add VS prologMarek Olšák2016-02-214-1/+267
| | | | | | This is disabled with use_monolithic_shaders = true. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: first bits for non-monolithic shadersMarek Olšák2016-02-214-14/+45
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add code for dumping all shader parts together (v2)Marek Olšák2016-02-211-12/+34
| | | | | | v2: unify some code into si_get_shader_binary_size Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add code for combining and uploading shaders from 3 shader partsMarek Olšák2016-02-212-8/+36
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fail compilation if non-GS non-CS shaders have rodataMarek Olšák2016-02-211-0/+13
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: separate 2 pieces of code from create_functionMarek Olšák2016-02-211-31/+51
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>