summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* gallium/radeon: rename r600_context_bo_reloc -> radeon_add_to_buffer_listMarek Olšák2015-09-0114-87/+97
| | | | | | | this name should be easy to understand without other knowledge Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* gallium/radeon: rename write_*_reg functionsMarek Olšák2015-09-0111-220/+220
| | | | | | | | e.g. radeon_set_context_reg is nicer and looks consistent next to radeon_emit(). Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: rename and precalculate polygon offset statesMarek Olšák2015-09-012-40/+45
| | | | | | | one less calloc and state construction while drawing Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert CB_TARGET_MASK setup to an atomMarek Olšák2015-09-015-17/+13
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: don't set VGT_VTX_CNT_EN twice in init_configMarek Olšák2015-09-011-1/+0
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert stencil ref state into an atomMarek Olšák2015-09-015-39/+50
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert blend color state into an atomMarek Olšák2015-09-014-9/+20
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert sample mask state into an atomMarek Olšák2015-09-016-19/+25
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert clip state into an atomMarek Olšák2015-09-014-14/+19
| | | | | | | Reducing calloc overhead. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: avoid redundant CB and DB register updatesMarek Olšák2015-09-017-12/+36
| | | | | | | | | | The main idea is to avoid setting CB_COLORi_INFO = 0 for i>0 repeatedly when those colorbuffers aren't used. This is mainly for glamor. Same for DB. Z_INFO and STENCIL_INFO need to be cleared only once. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: don't rebind GSVS ring buffers every draw call using GSMarek Olšák2015-09-013-3/+10
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: don't clear the tessellation factor ring bufferMarek Olšák2015-09-011-2/+0
| | | | | | | Leftover from the bring-up. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: remove the tf_ring state, add the registers to init_configMarek Olšák2015-09-014-15/+13
| | | | | | | One less state to worry about. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: remove the gs_rings state, add the registers to init_configMarek Olšák2015-09-014-16/+14
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: use a bitmask for tracking dirty atomsMarek Olšák2015-09-013-13/+18
| | | | | | | | This mainly removes the cache misses when checking the dirty flags. Not much else though. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: initialize atom IDs for external atomsMarek Olšák2015-09-012-4/+13
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: call si_init_atom for remaining radeonsi atomsMarek Olšák2015-09-018-41/+29
| | | | | | | | | | I need to initialize more atom IDs. This adds 4 more si_init_atom calls, which simplifies the code. (si_init_atom needs a different context type of the emit functions though) Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: initialize atom IDsMarek Olšák2015-09-011-6/+8
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: define the state atom array separatelyMarek Olšák2015-09-014-21/+23
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: optimize viewport statesMarek Olšák2015-09-016-26/+54
| | | | | | | same as scissors Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: optimize scissor statesMarek Olšák2015-09-018-27/+79
| | | | | | | | | - convert 16 states to 1 atom - only emit 1 scissor if VIEWPORT_INDEX isn't written - use only one packet when emitting consecutive scissors Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: add SI_MAX_ATTRIBSMarek Olšák2015-09-012-5/+6
| | | | | | | PIPE_MAX_ATTRIBS is 32, but we currently only support 16. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: fix memory usage checking for big IBsMarek Olšák2015-09-011-8/+9
| | | | | | Cc: 11.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: set all 16 viewport Z bounds for GL 4.1Marek Olšák2015-09-011-2/+6
| | | | | | Cc: 11.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: fix a Unigine Heaven hang when drirc is missingMarek Olšák2015-09-014-1/+28
| | | | | | Cc: 10.6 11.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* winsys/amdgpu: use small IBs for better performance on VIMarek Olšák2015-09-011-7/+9
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* gallium/util: add u_bit_scan_consecutive_rangeMarek Olšák2015-09-011-0/+20
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* i965: Prevent coordinate overflow in intel_emit_linear_blitChris Wilson2015-09-011-38/+34
| | | | | | | | | | | | | | | | | | | | | | | | | Fixes regression from commit 8c17d53823c77ac1c56b0548e4e54f69a33285f1 Author: Kenneth Graunke <[email protected]> Date: Wed Apr 15 03:04:33 2015 -0700 i965: Make intel_emit_linear_blit handle Gen8+ alignment restrictions. which adjusted the coordinates to be relative to the nearest cacheline. However, this then offsets the coordinates by up to 63 and this may then cause them to overflow the BLT limits. For the well aligned large transfer case, we can use 32bpp pixels and so reduce the coordinates by 4 (versus the current 8bpp pixels). We also have to be more careful doing the last line just in case it may exceed the coordinate limit. Reported-and-tested-by: [email protected] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90734 Signed-off-by: Chris Wilson <[email protected]> Cc: Kenneth Graunke <[email protected]> Cc: Ian Romanick <[email protected]> Cc: Anuj Phogat <[email protected]> Cc: [email protected] Reviewed-by: Anuj Phogat <[email protected]>
* i965/nir: enable the dead control flow optimizationConnor Abbott2015-09-011-0/+2
| | | | | | | | total instructions in shared programs: 7541551 -> 7541381 (-0.00%) instructions in affected programs: 3054 -> 2884 (-5.57%) helped: 29 Reviewed-by: Kenneth Graunke <[email protected]>
* nir/dead_cf: add support for removing useless loopsConnor Abbott2015-09-011-12/+109
| | | | | | | | | | | | v2: fix detecting if the loop has any phi nodes after it. v2: use nir_foreach_ssa_def() instead of nir_foreach_dest() when checking for values live after the loop to catch const_load instructions. v2: fix handling return instructions v2: add some documentation to loop_is_dead() Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: add a helper for iterating over blocks in a cf nodeConnor Abbott2015-09-012-0/+9
| | | | | | | We were already doing this internally for iterating over a function implementation, so just expose it directly. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: add nir_block_get_following_loop() helperConnor Abbott2015-09-012-0/+18
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* nir/dead_cf: delete code that's unreachable due to jumpsConnor Abbott2015-09-011-8/+115
| | | | | | | | v2: use nir_cf_node_remove_after(). v2: use foreach_list_typed() instead of hardcoding a list walk. v3: update to new control flow modification helpers. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: add an optimization for removing dead control flowConnor Abbott2015-09-013-0/+158
| | | | | | | v2: use nir_cf_node_remove_after() instead of our own broken thing. v3: use the new control flow modification helpers. Reviewed-by: Kenneth Graunke <[email protected]>
* r600g: fix calculation for gpr allocationDave Airlie2015-09-011-1/+1
| | | | | | | | | | | | | | | I've been chasing a geom shader hang on rv635 since I wrote r600 geom code, and finally I hacked some values from fglrx in and I could run texelfetch without failures. This is totally my fault as well, maths fail 101. This makes geom shaders on r600 not fail heavily. Cc: "10.6" "11.0" <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa: Limit Framebuffer Parameter OpenGL ES 3.1 usageMarta Lofstedt2015-09-011-1/+17
| | | | | | | | | | | | | | | According to OpenGL ES 3.1 specification, section 9.2.1 for glFramebufferParameter and section 9.2.3 for glGetFramebufferParameteriv: "An INVALID_ENUM error is generated if pname is not FRAMEBUFFER_DEFAULT_WIDTH, FRAMEBUFFER_DEFAULT_HEIGHT, FRAMEBUFFER_DEFAULT_SAMPLES, or FRAMEBUFFER_DEFAULT_FIXED_SAMPLE_LOCATIONS." Therefore exclude OpenGL ES 3.1 from using the GL_FRAMEBUFFER_DEFAULT_LAYERS parameter. Signed-off-by: Marta Lofstedt <[email protected]> Reviewed-by: Kevin Rogovin <kevin.rogovin at intel.com>
* mesa: Expose GL_ARB_framebuffer_no_attachments to GLES 3.1Marta Lofstedt2015-09-015-12/+12
| | | | | | | V2: Conform to new standard for exposing enums for OpenGL ES 3.1. Signed-off-by: Marta Lofstedt <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nir/builder: Use nir_after_instr to advance the cursorJason Ekstrand2015-08-311-2/+1
| | | | | | | | | | | | This *should* ensure that the cursor gets properly advanced in all cases. We had a problem before where, if the cursor was created using nir_after_cf_node on a non-block cf_node, that would call nir_before_block on the block following the cf node. Instructions would then get inserted in backwards order at the top of the block which is not at all what you would expect from nir_after_cf_node. By just resetting to after_instr, we avoid all these problems. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: advertise ASTC support for SkylakeNanley Chery2015-08-311-0/+5
| | | | | | | v2: remove OES ASTC extension reference. Reviewed-by: Anuj Phogat <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* mesa/glformats: recognize ASTC formats as color formatsNanley Chery2015-08-311-0/+28
| | | | | | | ASTC formats contain RGBA components. Reviewed-by: Chad Versace <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* mesa/texformat: use format conversion function in _mesa_choose_tex_formatvulkan-protex-2015.09.24.r01-baseNanley Chery2015-08-311-81/+13
| | | | | | | | | | | This function's cases for non-generic compressed formats duplicate the GL to MESA translation in _mesa_glenum_to_compressed_format(). This patch replaces the switch cases with a call to the translation function. This change teaches this function about ASTC, thus enabling ASTC for glTex*Storage*() calls. Reviewed-by: Chad Versace <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* mesa/texcompress: correct mapping of S3TC formats in conversion functionNanley Chery2015-08-311-2/+2
| | | | | | | | | MESA_FORMAT_RGBA_DXT5 should actually be reserved for GL_RGBA[4]_DXT5_S3TC. Also, Gallium and other dri drivers (radeon and nouveau) follow this mapping scheme. Reviewed-by: Chad Versace <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* r600/sb: update last_cf for finalize if.Dave Airlie2015-09-011-0/+3
| | | | | | | | | | | | As Glenn did for finalize_loop we need to update_cf when we add a POP at the end of a shader. I think this fixes one of the earlier shader going off end of memory problems we've stopped. Reviewed-by: Glenn Kennard <[email protected]> Cc: "10.6" "11.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* i965/fs: Use greater-equal cmod to implement maximum.Matt Turner2015-08-312-4/+6
| | | | | | | | | | The docs specifically call out SEL with .l and .ge as the implementations of MIN and MAX respectively. Among other things, SEL with these conditional mods are commutative. See commit 3b7f683f. Reviewed-by: Jordan Justen <[email protected]>
* i965/chv|skl: Apply sampler bypass w/aBen Widawsky2015-08-312-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | Certain compressed formats require this setting. The docs don't go into much detail as to why it's needed exactly. This patch introduces no piglit regressions on gen9 (bsw is untested). Note that the SKL "regressions" are fixed tests, and the egl_khr_gl_colorspace tests are WTF. The patch also fixes nothing I can find. http://otc-mesa-ci.jf.intel.com/job/Leeroy/127820/ v2: Reworded commit message (Matt); Added piglit results link. Restructured condition (Matt) Moved check out to function (Nanley). I left the setting of the bit in the surface state open coded because it seems to go better with the existing code. v3: Use and inline function only in gen8_emit_texture_surface_state() (Matt). Cc: Matt Turner <[email protected]> Cc: Nanley Chery <[email protected]> Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* st/mesa: move to renumbering registers in a groupDave Airlie2015-08-311-19/+38
| | | | | | | | This can be done with a single pass for the instruction base, and takes renumber_registers out of its spot on the profile. Acked-by: Marek Olšák <[email protected] Signed-off-by: Dave Airlie <[email protected]>
* st/mesa: reduce time spent in calculating temp read/writesDave Airlie2015-08-311-74/+79
| | | | | | | | | | | | | The glsl->tgsi convertor does some temporary register reduction however in profiling shader-db this shows up quite highly, so optimise things to reduce the number of loops through all the instructions we do. This drops merge_registers from 4-5% on the profile to 1%. I think this can be reduced further by possibly optimising the renumber pass. Acked-by: Marek Olšák <[email protected] Signed-off-by: Dave Airlie <[email protected]>
* st/mesa: cache tgsi opcode info in the instructionDave Airlie2015-08-311-23/+16
| | | | | | | | | Instead of looking this up lots, lets just cache it in the instruction translation up front. I just noticed this function what high in a profile of shader-db on radeonsi. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: move prim convert from geom shader to function.Dave Airlie2015-08-312-25/+26
| | | | | | | This should avoid C++ fail including this header. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* glsl: remove specical case subroutine type countingTimothy Arceri2015-08-311-3/+2
| | | | | | | Unlike samplers we can get the correct value for subroutines from component_slots() Reviewed-by: Dave Airlie <[email protected]>