path: root/src/gallium/drivers
Commit message (Author, Age, Files, Lines)
* radeonsi: make LLVM IR dumping less messy (Marek Olšák, 2016-02-09, 3 files, -9/+15)
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move a few r600_can_dump_shader calls to where they're needed (Marek Olšák, 2016-02-09, 1 file, -5/+5)
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove useless code that handles dx10_clamp_mode (Marek Olšák, 2016-02-09, 3 files, -14/+6)
    "enable-no-nans-fp-math" is the wrong string, and there was a
    disagreement about fixing it.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: dump SPI_PS_INPUT values along with shader stats (Marek Olšák, 2016-02-09, 1 file, -0/+7)
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: read SPI_PS_INPUT_ADDR from LLVM if it returns it (Marek Olšák, 2016-02-09, 3 files, -2/+7)
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't force gl_SampleMaskIn to 1 for smoothing (Marek Olšák, 2016-02-09, 1 file, -7/+4)
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: split PS input interpolation code into its own function (Marek Olšák, 2016-02-09, 1 file, -56/+71)
    This will be used by the fragment shader prolog.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: implement forcing per-sample_interpolation using the shader key only (Marek Olšák, 2016-02-09, 6 files, -152/+55)
    It was partly a state and partly emulated by shader code, but since
    we want to do this in a fragment shader prolog, we need to put it
    into the shader key, which will be used to generate the prolog.

    This also removes the spi_ps_input states and moves the registers
    to the PS state.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove si_shader::ps_input_interpolate (Marek Olšák, 2016-02-09, 2 files, -6/+3)
    tgsi_shader_info has this too.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move BCOLOR PS input locations after all other inputs (Marek Olšák, 2016-02-09, 3 files, -29/+50)
    BCOLOR inputs were immediately after COLOR inputs. Thus, all
    following inputs were offset by 1 if color_two_side was enabled,
    and not offset if it was not enabled, which is a variation that's
    problematic if we want to have 1 variant per shader and the variant
    doesn't care about color_two_side (that should be handled by other
    bytecode attached at the beginning).

    Instead, move BCOLOR inputs after all other inputs, so BCOLOR0 is
    at location "num_inputs" if it's present. BCOLOR1 is next.

    This also allows removing si_shader::nparam and
    si_shader::ps_input_param_offset, which are useless now.

    Reviewed-by: Nicolai Hähnle <[email protected]>
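
The layout described in the commit above can be sketched in a few lines of C. This is only an illustration of the numbering rule, not code from radeonsi: the helper name `bcolor_location` and its parameters are hypothetical, and placing BCOLOR1 at `num_inputs` when BCOLOR0 is absent is an assumption read into "BCOLOR1 is next".

```c
#include <stdbool.h>

/* Hypothetical helper, not taken from the radeonsi source: returns the
 * PS input location of back color `index` (0 or 1) under the layout the
 * commit message describes. */
static unsigned bcolor_location(unsigned num_inputs, bool has_bcolor0,
                                unsigned index)
{
    /* Regular inputs occupy locations 0 .. num_inputs - 1, so their
     * positions no longer depend on color_two_side. */
    if (index == 0)
        return num_inputs;

    /* Assumption: BCOLOR1 directly follows BCOLOR0 when BCOLOR0 is
     * present, and takes its place otherwise. */
    return num_inputs + (has_bcolor0 ? 1u : 0u);
}
```

With five regular inputs, BCOLOR0 would land at location 5 and BCOLOR1 at location 6, while locations 0 to 4 stay the same whether or not two-sided color is enabled.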
* radeonsi: move SPI_PS_INPUT_CNTL value computation to a separate function (Marek Olšák, 2016-02-09, 1 file, -34/+40)
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: generate a color_two_side variant only if the shader reads colors (Marek Olšák, 2016-02-09, 1 file, -1/+1)
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move si_shader_context initialization into a separate function (Marek Olšák, 2016-02-09, 1 file, -43/+60)
    This will be re-used later.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* nv50: add PIPE_QUERY_OCCLUSION_PREDICATE support (Ilia Mirkin, 2016-02-09, 1 file, -0/+6)
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* nv30: add PIPE_QUERY_OCCLUSION_PREDICATE support (Ilia Mirkin, 2016-02-09, 1 file, -2/+5)
    Signed-off-by: Ilia Mirkin <[email protected]>
* ilo: add PIPE_QUERY_OCCLUSION_PREDICATE support (Ilia Mirkin, 2016-02-09, 3 files, -1/+12)
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Chia-I Wu <[email protected]>
* gallium/util: switch over to new u_debug_image.[ch] code (Brian Paul, 2016-02-08, 3 files, -0/+3)
    Reviewed-by: Marek Olšák <[email protected]>
* trace: add missing pipe_context::clear_texture() (Samuel Pitoiset, 2016-02-08, 1 file, -0/+28)
    This fixes a crash with bin/arb_clear_texture-base-formats and
    probably some other tests which use clear_texture().

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* trace: remove useless MALLOC() in trace_context_draw_vbo() (Samuel Pitoiset, 2016-02-08, 1 file, -11/+6)
    There is no need to allocate memory when unwrapping the indirect buf.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
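
As a rough illustration of the idea in the commit above, a wrapper can copy the draw parameters onto the stack, swap in the unwrapped indirect buffer, and forward the copy, which avoids any heap allocation. The types and functions below (`struct resource`, `struct draw_info`, `unwrap`, `driver_draw`, `trace_draw`) are hypothetical stand-ins and not the real Gallium or trace driver definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real Gallium types. */
struct resource { struct resource *wrapped; };
struct draw_info { unsigned count; struct resource *indirect; };

/* Returns the driver's resource hidden behind a wrapper object. */
static struct resource *unwrap(struct resource *r)
{
    return r ? r->wrapped : NULL;
}

/* Records what the "real" driver received, for demonstration only. */
static struct resource *last_indirect;

static void driver_draw(const struct draw_info *info)
{
    last_indirect = info->indirect;
}

/* Forwards a draw call, unwrapping the indirect buffer if there is one. */
static void trace_draw(const struct draw_info *info)
{
    if (info->indirect) {
        /* Stack copy: the caller's struct is left untouched and
         * nothing has to be allocated or freed. */
        struct draw_info local = *info;
        local.indirect = unwrap(info->indirect);
        driver_draw(&local);
    } else {
        driver_draw(info);
    }
}
```

The copy lives only for the duration of the call, which is sufficient as long as the callee does not keep the pointer after returning.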
* r600, compute: Do not overwrite pipe_resource.screen (Jan Vesely, 2016-02-05, 1 file, -1/+1)
    Found by inspection.

    Signed-off-by: Jan Vesely <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* r600g: Ignore format for PIPE_BUFFER targets (Jan Vesely, 2016-02-05, 1 file, -1/+1)
    Fixes compute since 7dd31b81fee7fe40bd09cf3fbc324fcc32782479
    ("gallium/radeon: support PIPE_CAP_SURFACE_REINTERPRET_BLOCKS").

    Signed-off-by: Jan Vesely <[email protected]>
    Signed-off-by: Marek Olšák <[email protected]>
* gallium/radeon: implement query_memory_info (v2) (Marek Olšák, 2016-02-05, 3 files, -2/+39)
    v2: don't use DIV_ROUND_UP (not so useful)
        also return eviction stats

    Reviewed-by: Alex Deucher <[email protected]>
* gallium: add interface for querying memory usage and sizes (v2) (Marek Olšák, 2016-02-05, 13 files, -0/+13)
    If you're worried about the duplication of some CAPs, we can remove
    them later.

    v2: add fields for memory eviction stats

    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* gallium/radeon: remove radeon_info::r600_tiling_config (Marek Olšák, 2016-02-05, 2 files, -2/+0)
    Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: get pipe_interleave_bytes AKA group_bytes from the winsys (Marek Olšák, 2016-02-05, 5 files, -65/+6)
    Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: set num_banks in the winsys (Marek Olšák, 2016-02-05, 6 files, -32/+8)
    amdgpu doesn't have to set this, because radeonsi gets it from tile
    mode arrays by default.

    Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: just get num_tile_pipes from the winsys (Marek Olšák, 2016-02-05, 5 files, -91/+4)
    Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: rename & reorder members of radeon_info (Marek Olšák, 2016-02-05, 10 files, -63/+71)
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add placeholder MC and SRBM performance counter groups (Nicolai Hähnle, 2016-02-05, 1 file, -16/+54)
    Yet another change motivated by AMD GPUPerfStudio compatibility.

    These groups are not directly accessible from userspace, and AMD
    GPUPerfStudio does not actually query them - it just requires them
    to be there. Hence, adding a placeholder for now.

    Reviewed-by: Edward O'Callaghan <[email protected]>
    Acked-by: Marek Olšák <[email protected]>
* radeonsi: re-order the SQ_xx performance counter blocks (Nicolai Hähnle, 2016-02-05, 3 files, -39/+42)
    This is yet another change motivated by appeasing AMD
    GPUPerfStudio's hardcoding of performance counter group numbers.

    Reviewed-by: Edward O'Callaghan <[email protected]>
    Acked-by: Marek Olšák <[email protected]>
* radeonsi: re-order the perfcounter hardware blocks (Nicolai Hähnle, 2016-02-05, 1 file, -12/+18)
    As documented in the comment, AMD GPUPerfStudio unfortunately
    hardcodes the order of performance counter groups. Let's do the
    pragmatic thing and present the same order as Catalyst/Crimson.

    Reviewed-by: Edward O'Callaghan <[email protected]>
    Acked-by: Marek Olšák <[email protected]>
* gallium/radeon: add GPIN driver query group (Nicolai Hähnle, 2016-02-05, 2 files, -3/+87)
    This group was used by older versions of AMD GPUPerfStudio (via
    AMD_performance_monitor) to identify the GPU family, and
    GPUPerfStudio still complains when it isn't available.

    Reviewed-by: Edward O'Callaghan <[email protected]>
    Acked-by: Marek Olšák <[email protected]>
* radeonsi: Allow dumping LLVM IR before optimization passes (Nicolai Hähnle, 2016-02-05, 3 files, -2/+16)
    Set R600_DEBUG=preoptir to dump the LLVM IR before optimization
    passes, to allow diagnosing problems caused by optimization passes.

    Note that in order to compile the resulting IR with llc, you will
    first have to run at least the mem2reg pass, e.g.

        opt -mem2reg -S < shader.ll | llc -march=amdgcn -mcpu=bonaire

    Signed-off-by: Michel Dänzer <[email protected]> (original patch)
    Signed-off-by: Nicolai Hähnle <[email protected]> (w/ debug flag)
    Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: emit LLVM `ret void` before radeon_llvm_finalize_module (Nicolai Hähnle, 2016-02-05, 3 files, -3/+4)
    This allows dumping a consumable LLVM module before the initial
    optimization passes are run.

    Reviewed-by: Marek Olšák <[email protected]>
* nvc0: avoid negatives in PUSH_SPACE argument (Ilia Mirkin, 2016-02-05, 1 file, -2/+1)
    Fixup to commit 03b3eb90d - the number of buffers could be larger
    than the number of elements, in which case we'd pass a negative
    argument to PUSH_SPACE, which would be bad.

    While we're at it, merge it with the other PUSH_SPACE at the top of
    the function.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: [email protected]
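
The hazard described in the commit above is a size computed from a difference of two counts that can go below zero. The C sketch below shows one generic way to guard such a computation. It is not the actual nvc0 fix, which according to the message folds the reservation into an earlier PUSH_SPACE call; the function `push_words_needed` and its parameters are hypothetical.

```c
/* Hypothetical sizing helper: words to reserve for a fixed header plus
 * one word per element that has no buffer of its own. */
static unsigned push_words_needed(unsigned header_words,
                                  unsigned num_elements,
                                  unsigned num_buffers)
{
    /* A plain num_elements - num_buffers would wrap around as unsigned
     * (or turn negative as int) when num_buffers > num_elements, so
     * only count the surplus when there is one. */
    unsigned extra = num_elements > num_buffers
                         ? num_elements - num_buffers
                         : 0;

    return header_words + extra;
}
```

Reserving a fixed upper bound up front, as the commit message suggests, sidesteps the subtraction entirely and is usually the simpler choice.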
* nvc0: add some missing PUSH_SPACE's (Ilia Mirkin, 2016-02-05, 1 file, -1/+9)
    nvc0_vbo has explicit push space checking enabled, so we must run
    PUSH_SPACE by hand. A few spots missed that.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: [email protected]
* nvc0/ir: fix converting between predicate and gpr (Ilia Mirkin, 2016-02-05, 3 files, -11/+41)
    The spill logic will insert convert ops when moving between files.
    It seems like the emission logic wasn't quite ready for these
    converts.

    Tested on fermi, and visually looked at nvdisasm output for maxwell.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: [email protected]
* nvc0: add support for ARB_query_buffer_object (Ilia Mirkin, 2016-02-04, 11 files, -20/+239)
    Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: add PIPE_CAP_QUERY_BUFFER_OBJECT (Ilia Mirkin, 2016-02-04, 13 files, -0/+13)
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: implement PK2H and UP2H opcodes (Marek Olšák, 2016-02-04, 2 files, -1/+75)
    Based on a gallivm patch by Ilia Mirkin.

    +8 piglit regressions due to precision issues (I blame the tests)

    The benefit is that we'll get v_cvt_f32_f16 and v_cvt_f16_f32
    instead of emulation with integer instructions. They are GLSL 4.00
    intrinsics.

    Reviewed-by: Michel Dänzer <[email protected]>
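
For readers unfamiliar with these opcodes, the following C sketch shows the kind of integer-based emulation that the hardware conversion instructions replace, for the unpacking direction only. It assumes, as an interpretation of the TGSI opcode descriptions rather than something stated in this log, that UP2H takes the first half-float from the low 16 bits and the second from the high 16 bits. The function names are illustrative.

```c
#include <stdint.h>
#include <string.h>

/* Converts IEEE 754 binary16 bits to a float using integer operations. */
static float half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h >> 15) << 31;
    uint32_t exp  = (h >> 10) & 0x1f;
    uint32_t mant = h & 0x3ff;
    uint32_t bits;
    float f;

    if (exp == 0) {
        /* Zero or subnormal: the value is mant * 2^-24. */
        f = (float)mant * (1.0f / 16777216.0f);
        return sign ? -f : f;
    }

    if (exp == 31)
        bits = sign | 0x7f800000u | (mant << 13);          /* Inf or NaN */
    else
        bits = sign | ((exp + 112) << 23) | (mant << 13);  /* rebias 15 -> 127 */

    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Assumed UP2H-style unpack: low half first, high half second. */
static void unpack_2x16_half(uint32_t packed, float *lo, float *hi)
{
    *lo = half_to_float((uint16_t)(packed & 0xffff));
    *hi = half_to_float((uint16_t)(packed >> 16));
}
```

Every half-precision value is exactly representable as a float, so this direction is lossless. The packing direction (PK2H) has to round, which is where precision differences such as the piglit failures mentioned above can come from.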
* radeonsi: fix Hyper-Z on Stoney (Marek Olšák, 2016-02-04, 1 file, -0/+4)
    Cc: 11.0 11.1 <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* nv50/ir: make sure to fetch all sources before creating instruction (Ilia Mirkin, 2016-02-03, 1 file, -5/+8)
    We must fetch all sources into the instruction stream before
    generating the instruction that uses them. Otherwise we'll define
    values after using them, which won't work so well.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Tested-by: Samuel Pitoiset <[email protected]>
* nv50: avoid freeing the symbols if they're about to be stored (Ilia Mirkin, 2016-02-03, 1 file, -2/+7)
    Spotted by Coverity

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* gallium/radeon: support PIPE_CAP_SURFACE_REINTERPRET_BLOCKS (Nicolai Hähnle, 2016-02-03, 3 files, -5/+25)
    This is already used internally in si_resource_copy_region for
    compressed textures, so the only real change here is the adjusted
    surface size computation.

    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium: Add PIPE_CAP_SURFACE_REINTERPRET_BLOCKS (Nicolai Hähnle, 2016-02-03, 14 files, -0/+14)
    This cap indicates whether pipe->create_surface can reinterpret a
    texture as a surface with a format of different block width/height
    (but equal block size).

    v2: fix whitespace

    Reviewed-by: Edward O'Callaghan <[email protected]>
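
To make the cap above more concrete, here is a small C sketch of how a surface size could be derived when a texture is viewed through a format with a different block footprint. It is an illustration under stated assumptions, not the radeonsi implementation: `struct block_format`, `div_round_up` and `reinterpret_size` are hypothetical names, and the rule used (count the texture's blocks, then express that count in the surface format's blocks) is one plausible reading of "different block width/height but equal block size".

```c
/* Hypothetical description of a format's block layout. */
struct block_format {
    unsigned block_w;     /* block width in pixels */
    unsigned block_h;     /* block height in pixels */
    unsigned block_bytes; /* bytes per block */
};

static unsigned div_round_up(unsigned a, unsigned b)
{
    return (a + b - 1) / b;
}

/* Computes the size in pixels of a surface that views a tex_w x tex_h
 * texture through a different format. Returns 1 on success, or 0 if
 * the block sizes differ and reinterpretation is not possible. */
static int reinterpret_size(struct block_format tex,
                            struct block_format surf,
                            unsigned tex_w, unsigned tex_h,
                            unsigned *surf_w, unsigned *surf_h)
{
    if (tex.block_bytes != surf.block_bytes)
        return 0;

    /* Number of blocks per row/column stays fixed; only the pixel
     * footprint of each block changes. */
    *surf_w = div_round_up(tex_w, tex.block_w) * surf.block_w;
    *surf_h = div_round_up(tex_h, tex.block_h) * surf.block_h;
    return 1;
}
```

For example, a 64x64 texture with 4x4 blocks of 8 bytes, viewed through a format with 1x1 blocks of 8 bytes, would become a 16x16 surface under this rule.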
* gallium: Add PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY (Nicolai Hähnle, 2016-02-03, 14 files, -0/+22)
    This cap indicates that the driver only supports R, RG, RGB and
    RGBA formats for PIPE_BUFFER sampler views.

    v2: move into "unsupported features" section for nouveau (Ilia Mirkin)

    Reviewed-by: Edward O'Callaghan <[email protected]>
* llvmpipe: use scissor_planes_needed helper function (Roland Scheidegger, 2016-02-03, 3 files, -18/+33)
    So it doesn't get out of sync in multiple places.
* radeonsi: rework RB+ for Stoney (Marek Olšák, 2016-02-02, 4 files, -109/+228)
    This fixes it.

    States which also need to be taken into account:
    - SPI color formats
      - each down-conversion format supports only a limited set of SPI
        formats
    - whether MSAA resolving and logic op are enabled

    These need special handling:
    - blending
    - disabled channels

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: rename cb_target_mask state to cb_render_state (Marek Olšák, 2016-02-02, 5 files, -14/+15)
    and rename a variable in the function.

    SX_PS_DOWNCONVERT will be emitted here.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: treat intensity render targets exactly like red (Marek Olšák, 2016-02-02, 1 file, -1/+3)
    The motivation is to simplify the Stoney RB+ code. Intensity is
    already treated as red except here.

    No piglit regressions.

    Reviewed-by: Nicolai Hähnle <[email protected]>