path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* radeonsi: get rid of no_{prolog,epilog} | Nicolai Hähnle | 2016-11-03 | 2 | -153/+80
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: get rid of si_llvm_emit_fs_epilogue | Nicolai Hähnle | 2016-11-03 | 1 | -96/+1
    It is no longer used.
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: get rid of get_interp_param | Nicolai Hähnle | 2016-11-03 | 1 | -52/+2
    Replace by a simple LLVMGetParam, since ctx->no_prolog is always false.
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: get rid of select_interp_param | Nicolai Hähnle | 2016-11-03 | 1 | -41/+0
    The condition !ctx->no_prolog is now always true.
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use TCS epilog for monolithic shaders | Nicolai Hähnle | 2016-11-03 | 1 | -1/+21
    For fixed function TCS, we keep the copying of VS outputs to TES inputs inside the main function; the call to si_copy_tcs_inputs is moved accordingly.
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: extract si_build_tcs_epilog_function | Nicolai Hähnle | 2016-11-03 | 1 | -33/+46
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use VS epilog for monolithic TES | Nicolai Hähnle | 2016-11-03 | 1 | -0/+13
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use VS prolog and epilog for monolithic shaders | Nicolai Hähnle | 2016-11-03 | 1 | -2/+33
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: extract si_build_vs_{prolog,epilog}_function | Nicolai Hähnle | 2016-11-03 | 1 | -67/+115
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use PS prolog for monolithic shaders | Nicolai Hähnle | 2016-11-03 | 1 | -10/+32
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: set num_input_vgprs for fragment shaders in create_function | Nicolai Hähnle | 2016-11-03 | 1 | -6/+11
    So that the prolog generated for monolithic fragment shaders will have the right signature.
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: extract si_build_ps_prolog_function | Nicolai Hähnle | 2016-11-03 | 1 | -139/+171
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use PS epilog for monolithic shaders | Nicolai Hähnle | 2016-11-03 | 1 | -0/+207
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: extract si_build_ps_epilog_function | Nicolai Hähnle | 2016-11-03 | 1 | -35/+60
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: pass the function name to si_llvm_create_func | Nicolai Hähnle | 2016-11-03 | 3 | -8/+11
    We will use multiple functions in one module, so they should have different names.
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: split is_monolithic into no_prolog and no_epilog | Nicolai Hähnle | 2016-11-03 | 2 | -17/+33
    This helps to achieve a gradual transition towards building monolithic shaders via inlining.
    no_prolog and no_epilog will be removed by the end of the series; separate_prolog remains in use to control the PS input mapping.
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: free data structures when shader compiles fail | Nicolai Hähnle | 2016-11-03 | 1 | -11/+11
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move main TGSI translation into its own function | Nicolai Hähnle | 2016-11-03 | 1 | -45/+58
    The idea is that adding prolog and epilog code will be pulled out into the caller.
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add always-inline pass to si_llvm_finalize_module | Nicolai Hähnle | 2016-11-03 | 1 | -5/+5
    Change the pass manager as well, since this is a module-level pass. No noticeable run-time difference on shader-db.
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix signature of export intrinsic in VS epilog | Nicolai Hähnle | 2016-11-03 | 1 | -3/+3
    The incompatible signature becomes an issue when the VS epilog gets merged with the main vertex shader at the IR level.
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: link against amd_common | Nicolai Hähnle | 2016-11-03 | 1 | -0/+1
    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* nv50,nvc0: stop limiting the number of active queries to 1 | Samuel Pitoiset | 2016-11-02 | 2 | -16/+12
    This limitation was initially here because AMD_performance_monitor does not allow exposing the real number of hardware counters. But this is actually really annoying when profiling with qapitrace.
    Anyways, performance counters are mostly for developers and failures are expected if you try to monitor more queries than supported.
    This breaks amd_performance_monitor_measure but it's expected.
    Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: add new warp_nonpred_execution_efficiency metric on SM35 | Samuel Pitoiset | 2016-11-02 | 2 | -1/+37
    Event not_predicated_off_thread_inst_executed is SM35+.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add missing metric-issue_slot on SM35 | Samuel Pitoiset | 2016-11-02 | 1 | -0/+1
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: do not expose metric-inst_issued twice on SM35 | Samuel Pitoiset | 2016-11-02 | 1 | -1/+0
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add new warp_execution_efficiency metric on SM30+ | Samuel Pitoiset | 2016-11-02 | 2 | -0/+24
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: respect 80-chars for perf metrics descriptions | Samuel Pitoiset | 2016-11-02 | 1 | -4/+4
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: sort performance metrics alphabetically | Samuel Pitoiset | 2016-11-02 | 1 | -4/+4
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* nv50: add missing draw_calls_indexed driver stat | Samuel Pitoiset | 2016-11-02 | 1 | -0/+1
    Spotted when glancing at the VBO push code.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* radeonsi: fix BFE/BFI lowering for GLSL semantics | Nicolai Hähnle | 2016-11-02 | 1 | -3/+34
    Fixes spec/arb_gpu_shader5/execution/built-in-functions/*-bitfield{Extract,Insert}
    Cc: 13.0 <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: add enum radeon_micro_mode | Marek Olšák | 2016-11-01 | 3 | -7/+14
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: make it clear that DRM 2.x.x fast clear constraint is CIK-only | Marek Olšák | 2016-11-01 | 1 | -2/+2
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove r600_surface::level_info | Marek Olšák | 2016-11-01 | 3 | -7/+6
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add radeon_surf::is_linear | Marek Olšák | 2016-11-01 | 6 | -13/+13
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove radeon_surf_level::pitch_bytes | Marek Olšák | 2016-11-01 | 11 | -35/+37
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: don't call u_format helpers if we have that info already | Marek Olšák | 2016-11-01 | 2 | -10/+8
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: replace radeon_surf_info::dcc_enabled with num_dcc_levels | Marek Olšák | 2016-11-01 | 5 | -13/+17
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a driver query for counting CP DMA calls | Marek Olšák | 2016-11-01 | 4 | -0/+13
    CP DMA calls are synchronous with regard to shaders, but can be made asynchronous if needed.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a driver query for shader cache hits | Marek Olšák | 2016-11-01 | 4 | -1/+16
    This is an 8-month old patch.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* nvc0: do not duplicate similar performance metrics | Samuel Pitoiset | 2016-11-01 | 1 | -43/+7
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Pierre Moreau <[email protected]>
* swr: [rasterizer] added EventHandlerFile constructor | George Kyriazis | 2016-10-31 | 1 | -1/+6
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Frontend dependency work | George Kyriazis | 2016-10-31 | 3 | -2/+18
    Add frontend dependency concept in the DRAW_CONTEXT, which allows serialization of frontend work if necessary.
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Refactor/cleanup backends | George Kyriazis | 2016-10-31 | 2 | -360/+351
    Used for common code reuse and simplification
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Remove deprecated simd intrinsics | George Kyriazis | 2016-10-31 | 4 | -990/+1
    Used in abandoned all-or-nothing approach to converting to AVX512
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer archrast] Add thread tags to event files. | George Kyriazis | 2016-10-31 | 5 | -4/+24
    This allows the post-processor to easily detect the API thread and to process frame information. The frame information is needed to optimize how data is processed from worker threads.
    Reviewed-by: Bruce Cherniak <[email protected]>
* ralloc: use rzalloc where it's necessary | Marek Olšák | 2016-10-31 | 3 | -3/+3
    No change in behavior. ralloc_size is equivalent to rzalloc_size. That will change though.
    Calls not switched to rzalloc_size:
    - ralloc_vasprintf
    - glsl_type::name allocation (it's filled with snprintf)
    - C++ classes where valgrind didn't show uninitialized values
    I switched most of non-glsl stuff to rzalloc without checking whether it's really needed.
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Tested-by: Edmondo Tommasina <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix behavior of GLSL findLSB(0) | Marek Olšák | 2016-10-29 | 1 | -4/+13
    12.0 and older need the same fix but elsewhere.
    Cc: 13.0 <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set VGT_GS_ONCHIP_CNTL on CIK and later | Marek Olšák | 2016-10-29 | 1 | -0/+8
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Cc: 11.2 12.0 13.0 <[email protected]>
* nvc0/ir: fix emission of IMAD with NEG modifiers | Samuel Pitoiset | 2016-10-27 | 2 | -2/+2
    The emitter tried to emit sub instead of subr when src0 actually has a NEG modifier.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Cc: "11.0 12.0 13.0" <[email protected]>
* nvc0/ir: fix emission of SHLADD with NEG modifiers | Samuel Pitoiset | 2016-10-26 | 2 | -2/+2
    This affects GF100:GK110 chipsets, but not GM107+ where the logic is a bit different. The emitters tried to emit sub instead of subr when src0 has a NEG modifier.
    This fixes the following piglit tests: glsl-fs-loop-nested and glsl-vs-loop-nested.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Acked-by: Ilia Mirkin <[email protected]>
    Cc: "13.0" <[email protected]>