mesa.git

	Commit message	Author	Age	Files	Lines
*	mesa/cs: Add a MESA_SHADER_COMPUTE stage and update switch statements.	Paul Berry	2014-02-05	9	-1/+52
\| \| \| \| \| \| \| \| \|	This patch adds MESA_SHADER_COMPUTE to the gl_shader_stage enum. Also, where it is trivial to do so, it adds a compute shader case to switch statements that switch based on the type of shader. This avoids "unhandled switch case" compiler warnings. Reviewed-by: Matt Turner <[email protected]>
*	glsl/cs: Change some linker loops to use MESA_SHADER_FRAGMENT as a bound.	Paul Berry	2014-02-05	1	-4/+4
\| \| \| \| \| \| \| \| \|	Linker loops that iterate through all the stages in the pipeline need to use MESA_SHADER_FRAGMENT as a bound, so that we can add an additional MESA_SHADER_COMPUTE stage, without it being erroneously included in the pipeline. Reviewed-by: Matt Turner <[email protected]>
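The loop-bound change described above can be sketched as follows. This is a minimal illustration, not the linker's actual code: the enum is a simplified stand-in for gl_shader_stage, the numeric order of the stages and the MESA_SHADER_STAGES sentinel are assumptions, and count_pipeline_stages() is a hypothetical helper.

```c
#include <stdbool.h>

/* Simplified stand-in for gl_shader_stage; the numeric order is an assumption. */
enum shader_stage {
    MESA_SHADER_VERTEX,
    MESA_SHADER_GEOMETRY,
    MESA_SHADER_FRAGMENT,
    MESA_SHADER_COMPUTE,
    MESA_SHADER_STAGES   /* count sentinel, assumed for this sketch */
};

/* Hypothetical helper: count the stages that belong to the rendering
 * pipeline. Bounding the loop by MESA_SHADER_FRAGMENT instead of the total
 * stage count keeps the compute stage out of the walk. */
int count_pipeline_stages(const bool present[MESA_SHADER_STAGES])
{
    int count = 0;
    for (int i = 0; i <= MESA_SHADER_FRAGMENT; i++) {
        if (present[i])
            count++;
    }
    return count;
}
```

With vertex, fragment and compute shaders present, the helper reports two pipeline stages, because the compute entry is never visited.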
*	mesa/cs: Add dispatch API stubs for ARB_compute_shader.	Paul Berry	2014-02-05	9	-3/+141
\| \| \| \|	Reviewed-by: Matt Turner <[email protected]>
*	mesa/cs: Add extension enable flags for ARB_compute_shader.	Paul Berry	2014-02-05	6	-0/+9
\| \| \| \|	Reviewed-by: Matt Turner <[email protected]>
*	gallivm: fix F2U opcode	Roland Scheidegger	2014-02-05	1	-20/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, we were really doing F2I. Also move it to the generic section. (Note that for llvmpipe the code generated is definitely bad, due to the lack of unsigned conversions with sse. I think though that what llvm does (using scalar conversions to 64bit signed either with x87 fpu (32bit) or sse (64bit), including lots of domain changes) is quite suboptimal; one could do something like: is_large = arg >= 2^31; half_arg = 0.5 * arg; small_c = fptoint(arg); large_c = fptoint(half_arg) << 1; res = select(is_large, large_c, small_c), which should be far fewer instructions, but that's something llvm should do itself.) This fixes piglit fs/vs-float-uint-conversion.shader_test (maybe more, needs GL 3.0 version override to run.) Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Zack Rusin <[email protected]>
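The conversion trick suggested in the note above can be written out in scalar C as a rough sketch. The function name f2u_via_signed is hypothetical, and the sketch assumes an input in the range 0 <= arg < 2^32. A vector implementation would compute both candidates and select between them; scalar C has to branch instead, because converting an out-of-range float to a signed integer is undefined behaviour.

```c
#include <stdint.h>

/* Hypothetical helper: float -> uint32 using only signed conversions.
 * Assumes 0 <= arg < 2^32; anything else is outside this sketch. */
uint32_t f2u_via_signed(float arg)
{
    int is_large = arg >= 2147483648.0f;   /* 2^31 */

    if (is_large) {
        /* Floats >= 2^31 are multiples of 256, so halving is exact and the
         * result fits in a signed 32-bit integer. */
        float half_arg = 0.5f * arg;
        uint32_t large_c = (uint32_t)(int32_t)half_arg << 1;
        return large_c;
    }

    /* Small values convert directly through the signed path. */
    return (uint32_t)(int32_t)arg;
}
```

For example, 3000000000.0f is halved to 1500000000.0f, converted as a signed value, and shifted back up to 3000000000.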
*	tools/trace: Handle index buffer overflow gracefully.	José Fonseca	2014-02-05	1	-1/+4
\| \| \| \|	Trivial.
*	docs/GL3.txt: update r600 status	Dave Airlie	2014-02-05	1	-18/+18
\| \| \| \| \| \|	This updates the r600 driver status to 3.3 being fully supported. Signed-off-by: Dave Airlie <[email protected]>
*	r600g: add support for geom shaders to r600/r700 chipsets (v2)	Dave Airlie	2014-02-05	7	-49/+313
\| \| \| \| \| \| \| \| \| \| \| \| \|	This is my first attempt at enabling r600/r700 geometry shaders; the basic tests pass on both my rv770 and my rv635. It requires this kernel patch: http://www.spinics.net/lists/dri-devel/msg52745.html v2: address Alex comments. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: enable GLSL 3.30 on evergreen GPUs	Dave Airlie	2014-02-05	1	-1/+1
\| \| \| \| \| \| \|	This throws the switch to enable GL 3.3 and GLSL 330. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: properly propogate clip dist write value	Dave Airlie	2014-02-05	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	This moves the value from the GS shader to the copy shader so the registers are setup correctly. fixes tests/spec/glsl-1.50/execution/geometry/clip-distance-out-values.shader_test Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: calculate a better value for array_size (v2)	Dave Airlie	2014-02-05	1	-1/+1
\| \| \| \| \| \| \| \| \|	attempt to calculate a better value for array size to avoid breaking apps. v2: use 0xfff like streamout, suggested by Grigori Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: fix CAYMAN geometry shader support	Dave Airlie	2014-02-05	1	-2/+6
\| \| \| \| \| \| \| \| \|	cayman has a different end of program bit, so do that properly. fixes hangs with geom shader tests on cayman. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: fix up shader out misc stuff for copy shader	Dave Airlie	2014-02-05	2	-1/+16
\| \| \| \| \| \| \| \| \| \| \|	set the correct values so the misc out register is setup correctly for the copy shader. This also updates the state for the gs copy shader so the hw gets programmed correctly. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: port the layered surface rendering patch from radeonsi	Dave Airlie	2014-02-05	3	-21/+19
\| \| \| \| \| \| \| \| \|	This just makes r600 and evergreen do what the radeonsi codepaths do for layered rendering. This makes the 2d amd_vertex_shader_layer test pass on evergreen. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: initial VS output layer support	Dave Airlie	2014-02-05	4	-14/+50
\| \| \| \| \| \| \|	This just adds support for emitting the proper value in the VS out misc. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: setup const texture buffers for geom shaders	Dave Airlie	2014-02-05	1	-0/+6
\| \| \| \| \| \| \| \|	This just enables the workarounds we have for vertex/pixel shaders for geom shaders as well. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: calculate correct cut value	Dave Airlie	2014-02-05	1	-1/+11
\| \| \| \| \| \| \|	This selects the cut value depending on the shader selected. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: fix dynamic_input_array_index.shader_test	Dave Airlie	2014-02-05	1	-4/+44
\| \| \| \| \| \| \| \| \|	This follows what fglrx does, it unpacks the input we are going to indirect into a bunch of registers and indirects inside them. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: add support for indirect geom ring writes	Dave Airlie	2014-02-05	1	-7/+58
\| \| \| \| \| \| \| \| \| \| \|	We need to be able to write to the ring using a base register for when we emit vertices in a loop, in theory the SB compiler could collapse these indirect writes to direct writes if the register value is constant and known, but that is outside my pay grade. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: write proper output prim type	Dave Airlie	2014-02-05	2	-27/+26
\| \| \| \| \| \| \| \|	Vadim's code derived it from the info.mode, but it needs to be taken from the geometry shader output primitive. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: enable instance cnt register with new enough kernel	Dave Airlie	2014-02-05	1	-6/+6
\| \| \| \| \| \| \| \|	The instance cnt register was missing for a few kernels, with a new enough kernel we can output it. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: add primitive input support for gs	Dave Airlie	2014-02-05	4	-1/+19
\| \| \| \| \| \| \|	only enable prim id if gs uses it Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: emit streamout from dma copy shader	Dave Airlie	2014-02-05	2	-2/+8
\| \| \| \| \| \| \| \|	This enables streamout with GS in the mix, from the VS dma shader. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g/gs: fix cases where number of gs inputs != number of gs outputs	Dave Airlie	2014-02-05	1	-1/+6
\| \| \| \| \| \| \|	this fixes a bunch of the geom shader built-in tests Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: increase array base for exported parameters	Dave Airlie	2014-02-05	1	-0/+3
\| \| \| \| \| \| \|	Trivial fix to Vadim's code. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: initialise the geom shader loop registers.	Dave Airlie	2014-02-05	1	-0/+2
\| \| \| \| \| \| \|	As we do for vertex and pixel shaders. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: emit NOPs at end of shaders in more cases	Dave Airlie	2014-02-05	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \|	If the shader has no CF clauses at all, emit a NOP. If the last instruction is an ENDLOOP, add a NOP for the LOOP to go to. If the last instruction is CALL_FS, add a NOP. These fix a bunch of hangs in the geometry shader tests. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
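The three cases listed above amount to a small predicate over the control-flow stream. The sketch below is only an illustration of that rule: the cf_op enum and needs_trailing_nop() are hypothetical names, not the r600g bytecode structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of control-flow opcodes for illustration. */
enum cf_op {
    CF_OP_ALU,
    CF_OP_LOOP_END,
    CF_OP_CALL_FS,
    CF_OP_NOP
};

/* Returns true when a NOP should be appended to the CF stream:
 * - the shader has no CF clauses at all,
 * - the last instruction is a loop end (the loop needs a target to exit to),
 * - the last instruction is CALL_FS. */
bool needs_trailing_nop(const enum cf_op *cf, size_t count)
{
    if (count == 0)
        return true;

    enum cf_op last = cf[count - 1];
    return last == CF_OP_LOOP_END || last == CF_OP_CALL_FS;
}
```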
*	r600g: don't enable SB for geom shaders	Dave Airlie	2014-02-05	1	-0/+3
\| \| \| \| \| \| \| \|	SB needs fixes for three GS instructions; it seems to raise them outside loops, etc., despite my best efforts. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g/sb: add MEM_RING support	Dave Airlie	2014-02-05	4	-5/+8
\| \| \| \| \| \| \| \|	Although we don't use SB on geom shaders, the VS copy shader will use it so we might as well implement MEM_RING support in sb. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: don't fail if we can't map VS->GS ring entries	Dave Airlie	2014-02-05	1	-4/+3
\| \| \| \| \| \| \| \|	This can happen in normal operation, so don't report an error on it, just continue. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: initial support for geometry shaders on evergreen (v2)	Vadim Girlin	2014-02-05	15	-206/+909
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is Vadim's initial work with a few regression fixes squashed in. v2: (airlied) fix regression in glsl-max-varyings - need to use vs and ps_dirty fix regression in shader exports from rebasing. whitespace fixing. v2.1: squash fix assert Signed-off-by: Vadim Girlin <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: add hw register definitions for GS block setup	Vadim Girlin	2014-02-05	2	-6/+75
\| \| \| \| \|	Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: defer shader variant selection and depending state updates	Vadim Girlin	2014-02-05	3	-69/+57
\| \| \| \| \| \| \| \|	[airlied: fix dropped streamout line - fix for master] Signed-off-by: Vadim Girlin <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g/bc: add support for indexed memory writes.	Dave Airlie	2014-02-05	3	-4/+12
\| \| \| \| \| \| \|	It looks like we need these for geom shaders in the future. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: move barrier and end_of_program bits from output to cf struct (v2)	Vadim Girlin	2014-02-05	4	-30/+34
\| \| \| \| \| \| \| \|	v2: fix regression on r600 NOP instructions. Signed-off-by: Vadim Girlin <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	r600g: split streamout emit code into a separate function	Dave Airlie	2014-02-05	1	-103/+110
\| \| \| \| \| \| \| \| \|	For geometry shaders we need to call this code from a second place. Just move it out for now to keep future patches cleaner. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	r600g,radeonsi: skip unnecessary buffer_is_busy call, add a comment	Marek Olšák	2014-02-04	1	-1/+5
\|
*	r600g,radeonsi: skip busy-checking for DISCARD_RANGE if it has been done already	Marek Olšák	2014-02-04	1	-0/+4
\|
*	r600g,radeonsi: treat DYNAMIC and STREAM usage as STAGING	Marek Olšák	2014-02-04	1	-7/+3
\|
*	gallium: remove PIPE_CAP_MAX_COMBINED_SAMPLERS	Marek Olšák	2014-02-04	15	-38/+6
\| \| \| \| \| \| \|	This can be derived from the shader caps. All GPUs from ATI/AMD, NVIDIA, and INTEL have separate texture slots for each shader stage.
*	mesa: remove stray bits of GL_EXT_cull_vertex	Brian Paul	2014-02-04	2	-15/+1
\| \| \| \| \| \| \|	GL_EXT_cull_vertex was removed back in 2010 in commit 02984e3536 but these bits still lingered. Reviewed-by: Eric Anholt <[email protected]>
*	glsl: Fix continue statements in do-while loops.	Paul Berry	2014-02-04	1	-9/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	From the GLSL 4.40 spec, section 6.4 (Jumps): The continue jump is used only in loops. It skips the remainder of the body of the inner most loop of which it is inside. For while and do-while loops, this jump is to the next evaluation of the loop condition-expression from which the loop continues as previously defined. Previously, we incorrectly treated a "continue" statement as jumping to the top of a do-while loop. This patch fixes the problem by replicating the loop condition when converting the "continue" statement to IR. (We already do a similar thing in "for" loops, to ensure that "continue" causes the loop expression to be executed). Fixes piglit tests: - glsl-fs-continue-inside-do-while.shader_test - glsl-vs-continue-inside-do-while.shader_test - glsl-fs-continue-in-switch-in-do-while.shader_test - glsl-vs-continue-in-switch-in-do-while.shader_test Cc: [email protected] Acked-by: Carl Worth <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
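C gives "continue" in a do-while loop the same meaning as the GLSL text quoted above, so the difference between the old and the fixed behaviour can be shown with a small C sketch. Both function names are hypothetical, and the second function is only an approximation of the idea of replicating the loop condition at the "continue", not the IR the compiler actually produces.

```c
/* Reference semantics: "continue" jumps to the condition check. The
 * condition is false after the first pass, so the body runs exactly once.
 * Jumping to the top of the loop instead would run the body ten times. */
int do_while_with_continue(void)
{
    int i = 0, body_runs = 0;
    do {
        i++;
        body_runs++;
        if (i < 10)
            continue;
    } while (i < 1);
    return body_runs;
}

/* Lowered form: an unconditional loop in which the condition is replicated
 * in front of the "continue" as well as at the end of the body. */
int lowered_with_replicated_condition(void)
{
    int i = 0, body_runs = 0;
    for (;;) {
        i++;
        body_runs++;
        if (i < 10) {
            if (!(i < 1))   /* replicated loop condition */
                break;
            continue;
        }
        if (!(i < 1))       /* original loop condition */
            break;
    }
    return body_runs;
}
```

Both functions return 1, which is the behaviour the four piglit tests listed above check for.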
*	glsl: Make condition_to_hir() callable from outside ast_iteration_statement.	Paul Berry	2014-02-04	2	-7/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In addition to making it public, we also need to change its first argument from an ir_loop * to an exec_list *, so that it can be used to insert the condition anywhere in the IR (rather than just in the body of the loop). This will be necessary in order to make continue statements work properly in do-while loops. Cc: [email protected] Acked-by: Carl Worth <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965/blorp: do not use unnecessary hw-blending support	Topi Pohjolainen	2014-02-04	1	-20/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is really not needed as blorp blit programs already sample XRGB normally and get alpha channel set to 1.0 automatically by the sampler engine. This is simply copied directly to the payload of the render target write message and hence there is no need for any additional blending support from the pixel processing pipeline. The blending formula is anyway broken for color components, it multiplies the color component with itself (blend factor is the component itself). Alpha blending in turn would not fix the alpha to one independent of the source but simply used the source alpha as is instead (1.0 * src_alpha + 0.0 * dst_alpha). Quoting Eric: "If we want to actually make the no-alpha-bits-present thing work, we need to override the bits in the surface state or in the generated code. In the normal draw path, it's done for sampling by the swizzling code in brw_wm_surface_state.c, and the blending overrides is just to fix up the alpha blending stage which doesn't pay attention to that for the destination surface." If one modifies piglit test gl-3.2-layered-rendering-blit to use color component values other than zero or one, this change will kick in on IVB. No regressions on IVB. This is effectively revert of c0554141a9b831b4e614747104dcbbe0fe489b9d: i965/blorp: Support overriding destination alpha to 1.0. Currently, Blorp requires the source and destination formats to be equal. However, we'd really like to be able to blit between XRGB and ARGB formats; our BLT engine paths have supported this for a long time. For ARGB -> XRGB, nothing needs to occur: the missing alpha is already interpreted as 1.0. For XRGB -> ARGB, we need to smash the alpha channel to 1.0 when writing the destination colors. This is fairly straightforward with blending. 
For now, this code is never used, as the source and destination formats still must be equal. The next patch will relax that restriction. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
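The blending argument in this message can be checked with a little arithmetic. The sketch below assumes the usual form of the blend equation (source times source factor plus destination times destination factor); the function names are hypothetical, and the zero destination factor for colour is an assumption, since the message only states that the source factor is the component itself.

```c
/* Generic blend equation: result = src * src_factor + dst * dst_factor. */
float blend(float src, float src_factor, float dst, float dst_factor)
{
    return src * src_factor + dst * dst_factor;
}

/* Colour as described in the message: the blend factor is the component
 * itself, so the result is src squared (destination factor assumed 0). */
float blended_color(float src, float dst)
{
    return blend(src, src, dst, 0.0f);
}

/* Alpha as described in the message: 1.0 * src_alpha + 0.0 * dst_alpha,
 * which passes the source alpha through instead of forcing it to 1.0. */
float blended_alpha(float src_alpha, float dst_alpha)
{
    return blend(src_alpha, 1.0f, dst_alpha, 0.0f);
}
```

A colour value of 0.5 comes out as 0.25, while values of 0.0 and 1.0 are unchanged, which is consistent with the remark that the piglit test only exposes the problem once it uses component values other than zero or one.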
*	radeon/uvd: fix feedback buffer handling v2	Christian König	2014-02-04	1	-12/+28
\| \| \| \| \| \| \| \| \| \| \|	Without the correct feedback buffer size UVD runs into an error on each frame, reducing the maximum FPS. v2: fixing Michel's comments Signed-off-by: Christian König <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Cc: "10.1" "10.0" "9.2" <[email protected]>
*	i965: Use brw_bo_map[_gtt]() in intel_miptree_map_raw().	Kenneth Graunke	2014-02-03	1	-8/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	This moves the intel_batchbuffer_flush before the drm_intel_bo_busy call, which is a change in behavior. However, the old behavior was broken. In the future, we may want to only flush if the batchbuffer references the BO being mapped. That's certainly more typical. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Carl Worth <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965: Use brw_bo_map() in intel_texsubimage_tiled_memcpy().	Kenneth Graunke	2014-02-03	1	-7/+1
\| \| \| \| \| \| \| \| \|	This additionally measures the time stalled, while also simplifying the code. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Carl Worth <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965: Create drm_intel_bo_map wrappers with performance warnings.	Kenneth Graunke	2014-02-03	2	-0/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Mapping a buffer is a common place where we could stall the CPU. In a few places, we've added special code to check whether a buffer is busy and log the stall as a performance warning. Most of these give no indication of the severity of the stall, though, since measuring the time is a small hassle. This patch introduces a new brw_bo_map() function which wraps drm_intel_bo_map, but additionally measures the time stalled and reports a performance warning. If performance debugging is not enabled, it simply maps the buffer with negligible overhead. We also add a similar wrapper for drm_intel_gem_bo_map_gtt(). This should make it easy to add performance warnings in lots of places. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Carl Worth <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
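The wrapper pattern described above might look roughly like the sketch below. It is not the brw_bo_map() implementation: timed_map(), the map_fn callback type, fake_map() and the 0.01 ms reporting threshold are all hypothetical, and the real code works on drm_intel_bo objects rather than an opaque pointer.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical callback type standing in for a buffer-mapping call. */
typedef int (*map_fn)(void *bo);

/* Stand-in mapper used only for illustration: marks the "buffer" mapped. */
int fake_map(void *bo)
{
    *(int *)bo = 1;
    return 0;
}

static double now_ms(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);   /* C11 wall-clock time */
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/* Map a buffer; when perf_debug is set, also measure how long the call
 * took and report it as a performance warning. */
int timed_map(map_fn map, void *bo, int perf_debug, double *stall_ms)
{
    *stall_ms = 0.0;
    if (!perf_debug)
        return map(bo);            /* fast path: no timing overhead */

    double start = now_ms();
    int ret = map(bo);
    double elapsed = now_ms() - start;
    if (elapsed < 0.0)
        elapsed = 0.0;             /* TIME_UTC is not guaranteed monotonic */

    *stall_ms = elapsed;
    if (elapsed > 0.01)            /* assumed reporting threshold */
        fprintf(stderr, "perf warning: stalled %.3f ms mapping buffer\n", elapsed);
    return ret;
}
```

Keeping the timing behind the perf_debug check mirrors the point made in the message that the wrapper should cost next to nothing when performance debugging is disabled.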
*	freedreno: enabling binning and opt by default	Rob Clark	2014-02-03	3	-16/+11
\| \| \| \| \| \| \| \| \|	Hw binning pass doesn't seem to have broken anything. And optimizing compiler fixes a lot of shaders and doesn't seem to break anything. So re-org slightly FD_MESA_DEBUG params and make both hw binning and optimizer enabled by default. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx/compiler: new compiler	Rob Clark	2014-02-03	17	-209/+2777
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The new compiler generates a dependency graph of instructions, including a few meta-instructions to handle PHI and preserve some extra information needed for register assignment, etc. The depth pass assigns a weight/depth to each node (based on sum of instruction cycles of a given node and all its dependent nodes), which is used to schedule instructions. The scheduling takes into account the minimum number of cycles/slots between dependent instructions, etc., which was something that could not be handled properly with the original compiler (which was more of a naive TGSI translator than an actual compiler). The register assignment is currently split out as a standalone pass. I expect that it will be replaced at some point, once I figure out what to do about relative addressing (which is currently the only thing that should cause fallback to old compiler). There are a couple new debug options for FD_MESA_DEBUG env var: optmsgs - enable debug prints in optimizer optdump - dump instruction graph in .dot format, for example: http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot.png http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot At this point, thanks to proper handling of instruction scheduling, the new compiler fixes a lot of things that were broken before, and does not appear to break anything that was working before[1]. So even though it is not finished, it seems useful to merge it in its current state. [1] Not merged in this commit, because I'm not sure if it really belongs in mesa tree, but the following commit implements a simple shader emulator, which I've used to compare the output of the new compiler to the original compiler (i.e. run it on all the TGSI shaders dumped out via ST_DEBUG=tgsi with various games/apps): https://github.com/freedreno/mesa/commit/163b6306b1660e05ece2f00d264a8393d99b6f12 Signed-off-by: Rob Clark <[email protected]>
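One way to picture the depth pass is the sketch below. It is an interpretation rather than the freedreno code: struct instr and compute_depth() are hypothetical, and the sketch reads the "sum of instruction cycles" as being accumulated along the longest chain of source instructions, which is an assumption about what the message means.

```c
#define MAX_SRCS 4

/* Hypothetical dependency-graph node. */
struct instr {
    int cycles;                     /* cost of this instruction, assumed > 0 */
    int nsrcs;                      /* number of source instructions */
    struct instr *srcs[MAX_SRCS];   /* instructions this one depends on */
    int depth;                      /* 0 means "not computed yet" */
};

/* Depth of a node: its own cycles plus the deepest of its sources.
 * Results are memoised in the node so shared sources are visited once. */
int compute_depth(struct instr *n)
{
    if (n->depth != 0)
        return n->depth;

    int deepest = 0;
    for (int i = 0; i < n->nsrcs; i++) {
        int d = compute_depth(n->srcs[i]);
        if (d > deepest)
            deepest = d;
    }

    n->depth = n->cycles + deepest;
    return n->depth;
}
```

A scheduler could then prefer the ready instruction with the largest depth, since it heads the longest remaining chain of dependent work.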