aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
...
* radeonsi: use an indirect buffer for init_configMarek Olšák2015-09-012-0/+3
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: add IB2 indirect buffer support for pm4 statesMarek Olšák2015-09-013-2/+54
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* winsys/radeon: add a flag telling how gfx IBs should be paddedMarek Olšák2015-09-013-6/+10
| | | | | | | This is always false on amdgpu (set by calloc). Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* winsys/amdgpu: remove IB padding for SIMarek Olšák2015-09-011-17/+5
| | | | | | | SI is unsupported by amdgpu Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: remove unused macro si_pm4_set_stateMarek Olšák2015-09-011-10/+0
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: remove si_pm4_cleanupMarek Olšák2015-09-013-10/+0
| | | | | | | All remaining pm4 state are created and destroyed by state trackers. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: rework uploading border colorsMarek Olšák2015-09-015-92/+75
| | | | | | | | | | | | | | The border colors are uploaded only once when the state is created. This brings truly immutable sampler descriptors, because they don't have to be updated every time a sampler state is re-bound. It also moves the TA_BC_BASE_ADDR registers to init_config, removing one more state. The catch is there is now a limit: only 4096 border colors can be used by one context. I don't think that will be a problem. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: use all built-in border colorsMarek Olšák2015-09-011-3/+18
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: inline si_cmd_context_controlMarek Olšák2015-09-014-41/+4
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: remove unused si_pm4_state codeMarek Olšák2015-09-012-28/+2
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: reorder si_context variablesMarek Olšák2015-09-011-40/+45
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: don't send IB dword usage to si_need_cs_spaceMarek Olšák2015-09-015-8/+6
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: don't set number of IB dwords for statesMarek Olšák2015-09-014-29/+18
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: don't count IB space for states, just use an upper boundMarek Olšák2015-09-012-55/+5
| | | | | | | | Since we don't put any resource descriptors in IBs, the space used by draw calls is quite small. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert SPI state to an atomMarek Olšák2015-09-014-10/+19
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* gallium/radeon: rename r600_context_bo_reloc -> radeon_add_to_buffer_listMarek Olšák2015-09-0114-87/+97
| | | | | | | this name should be easy to understand without other knowledge Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* gallium/radeon: rename write_*_reg functionsMarek Olšák2015-09-0111-220/+220
| | | | | | | | e.g. radeon_set_context_reg is nicer and looks consistent next to radeon_emit(). Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: rename and precalculate polygon offset statesMarek Olšák2015-09-012-40/+45
| | | | | | | one less calloc and state construction while drawing Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert CB_TARGET_MASK setup to an atomMarek Olšák2015-09-015-17/+13
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: don't set VGT_VTX_CNT_EN twice in init_configMarek Olšák2015-09-011-1/+0
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert stencil ref state into an atomMarek Olšák2015-09-015-39/+50
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert blend color state into an atomMarek Olšák2015-09-014-9/+20
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert sample mask state into an atomMarek Olšák2015-09-016-19/+25
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert clip state into an atomMarek Olšák2015-09-014-14/+19
| | | | | | | Reducing calloc overhead. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: avoid redundant CB and DB register updatesMarek Olšák2015-09-017-12/+36
| | | | | | | | | | The main idea is to avoid setting CB_COLORi_INFO = 0 for i>0 repeatedly when those colorbuffers aren't used. This is mainly for glamor. Same for DB. Z_INFO and STENCIL_INFO need to be cleared only once. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: don't rebind GSVS ring buffers every draw call using GSMarek Olšák2015-09-013-3/+10
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: don't clear the tessellation factor ring bufferMarek Olšák2015-09-011-2/+0
| | | | | | | Leftover from the bring-up. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: remove the tf_ring state, add the registers to init_configMarek Olšák2015-09-014-15/+13
| | | | | | | One less state to worry about. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: remove the gs_rings state, add the registers to init_configMarek Olšák2015-09-014-16/+14
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: use a bitmask for tracking dirty atomsMarek Olšák2015-09-013-13/+18
| | | | | | | | This mainly removes the cache misses when checking the dirty flags. Not much else though. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: initialize atom IDs for external atomsMarek Olšák2015-09-012-4/+13
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: call si_init_atom for remaining radeonsi atomsMarek Olšák2015-09-018-41/+29
| | | | | | | | | | I need to initialize more atom IDs. This adds 4 more si_init_atom calls, which simplifies the code. (si_init_atom needs a different context type of the emit functions though) Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: initialize atom IDsMarek Olšák2015-09-011-6/+8
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: define the state atom array separatelyMarek Olšák2015-09-014-21/+23
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: optimize viewport statesMarek Olšák2015-09-016-26/+54
| | | | | | | same as scissors Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: optimize scissor statesMarek Olšák2015-09-018-27/+79
| | | | | | | | | - convert 16 states to 1 atom - only emit 1 scissor if VIEWPORT_INDEX isn't written - use only one packet when emitting consecutive scissors Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: add SI_MAX_ATTRIBSMarek Olšák2015-09-012-5/+6
| | | | | | | PIPE_MAX_ATTRIBS is 32, but we currently only support 16. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: fix memory usage checking for big IBsMarek Olšák2015-09-011-8/+9
| | | | | | Cc: 11.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: set all 16 viewport Z bounds for GL 4.1Marek Olšák2015-09-011-2/+6
| | | | | | Cc: 11.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: fix a Unigine Heaven hang when drirc is missingMarek Olšák2015-09-014-1/+28
| | | | | | Cc: 10.6 11.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* winsys/amdgpu: use small IBs for better performance on VIMarek Olšák2015-09-011-7/+9
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* gallium/util: add u_bit_scan_consecutive_rangeMarek Olšák2015-09-011-0/+20
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* i965: Prevent coordinate overflow in intel_emit_linear_blitChris Wilson2015-09-011-38/+34
| | | | | | | | | | | | | | | | | | | | | | | | | Fixes regression from commit 8c17d53823c77ac1c56b0548e4e54f69a33285f1 Author: Kenneth Graunke <[email protected]> Date: Wed Apr 15 03:04:33 2015 -0700 i965: Make intel_emit_linear_blit handle Gen8+ alignment restrictions. which adjusted the coordinates to be relative to the nearest cacheline. However, this then offsets the coordinates by up to 63 and this may then cause them to overflow the BLT limits. For the well aligned large transfer case, we can use 32bpp pixels and so reduce the coordinates by 4 (versus the current 8bpp pixels). We also have to be more careful doing the last line just in case it may exceed the coordinate limit. Reported-and-tested-by: [email protected] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90734 Signed-off-by: Chris Wilson <[email protected]> Cc: Kenneth Graunke <[email protected]> Cc: Ian Romanick <[email protected]> Cc: Anuj Phogat <[email protected]> Cc: [email protected] Reviewed-by: Anuj Phogat <[email protected]>
* i965/nir: enable the dead control flow optimizationConnor Abbott2015-09-011-0/+2
| | | | | | | | total instructions in shared programs: 7541551 -> 7541381 (-0.00%) instructions in affected programs: 3054 -> 2884 (-5.57%) helped: 29 Reviewed-by: Kenneth Graunke <[email protected]>
* nir/dead_cf: add support for removing useless loopsConnor Abbott2015-09-011-12/+109
| | | | | | | | | | | | v2: fix detecting if the loop has any phi nodes after it. v2: use nir_foreach_ssa_def() instead of nir_foreach_dest() when checking for values live after the loop to catch const_load instructions. v2: fix handling return instructions v2: add some documentation to loop_is_dead() Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: add a helper for iterating over blocks in a cf nodeConnor Abbott2015-09-012-0/+9
| | | | | | | We were already doing this internally for iterating over a function implementation, so just expose it directly. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: add nir_block_get_following_loop() helperConnor Abbott2015-09-012-0/+18
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* nir/dead_cf: delete code that's unreachable due to jumpsConnor Abbott2015-09-011-8/+115
| | | | | | | | v2: use nir_cf_node_remove_after(). v2: use foreach_list_typed() instead of hardcoding a list walk. v3: update to new control flow modification helpers. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: add an optimization for removing dead control flowConnor Abbott2015-09-013-0/+158
| | | | | | | v2: use nir_cf_node_remove_after() instead of our own broken thing. v3: use the new control flow modification helpers. Reviewed-by: Kenneth Graunke <[email protected]>
* r600g: fix calculation for gpr allocationDave Airlie2015-09-011-1/+1
| | | | | | | | | | | | | | | I've been chasing a geom shader hang on rv635 since I wrote r600 geom code, and finally I hacked some values from fglrx in and I could run texelfetch without failures. This is totally my fault as well, maths fail 101. This makes geom shaders on r600 not fail heavily. Cc: "10.6" "11.0" <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Signed-off-by: Dave Airlie <[email protected]>