Commit message | Author | Age | Files | Lines
...
* i965: Split brw_nir_lower_inputs/outputs into per-stage functions. | Kenneth Graunke | 2016-02-26 | 1 | -130/+174
| These functions are both giant switch statements where most cases don't overlap at all. Let's put the bulk of the work in per-stage helpers.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Remove catch-all nir_lower_io call with specific cases. | Kenneth Graunke | 2016-02-26 | 1 | -1/+4
| Most cases already call nir_lower_io explicitly for input and output lowering. This catch all isn't very useful anymore - we can just add it to the remaining cases.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Move optimizations from brw_nir_lower_io to brw_postprocess_nir. | Kenneth Graunke | 2016-02-26 | 1 | -1/+3
| This simplifies things. Every caller of brw_nir_lower_io() immediately calls brw_postprocess_nir().
|
| The only real change this will have is that we get an extra brw_nir_optimize() call when compiling compute shaders, but that seems fine.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Always do NIR IO lowering at specialization time. | Kenneth Graunke | 2016-02-26 | 2 | -8/+1
| We've now hit literally every case other than geometry shaders (and compute shaders, but those are a no-op). So, let's just move geometry shaders over too and be done with it.
|
| The only advantage to doing this at link time was to save the expense of running the pass on recompiles. But we're already running a lot of passes, and the extra code complexity isn't worth it.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Make an is_scalar boolean in brw_compile_gs(). | Kenneth Graunke | 2016-02-26 | 1 | -4/+4
| Shorter than compiler->scalar_stage[MESA_SHADER_GEOMETRY], which can help with line-wrapping.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/nir: Do lower_io late for fragment shaders | Jason Ekstrand | 2016-02-26 | 2 | -1/+3
| The Vulkan driver wants to be able to delete fragment outputs that are beyond key.nr_color_regions; this is a lot easier if we lower outputs at specialization time rather than link time.
|
| (Rationale added to commit message by Ken)
|
| Signed-off-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Set dest type to UW for several send messages | Jordan Justen | 2016-02-26 | 2 | -2/+5
| Without this, on SIMD 16 the send instruction destination will appear to write more than one destination register, causing the simulator to report an error.
|
| Of course, the send instruction can actually write more than one destination register regardless of the type set for the destination, so this is a bit strange.
|
| Suggested-by: Kenneth Graunke <[email protected]>
| Signed-off-by: Jordan Justen <[email protected]>
| Reviewed-by: Francisco Jerez <[email protected]>
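The register arithmetic behind the "set dest type to UW" change can be sketched as below. This is only an illustration under stated assumptions: a register is taken to be 32 bytes wide, and the destination type before the change is taken to be 4 bytes per channel (the commit message states neither). `regs_written` is a hypothetical helper, not a function from the driver.

```c
#include <assert.h>

/* Assumption: one general register holds 32 bytes. */
enum { REG_SIZE_BYTES = 32 };

/* Hypothetical helper: how many registers a destination region appears
 * to cover for a given SIMD width and per-channel type size, rounded up. */
static unsigned
regs_written(unsigned simd_width, unsigned type_size_bytes)
{
   unsigned bytes = simd_width * type_size_bytes;
   return (bytes + REG_SIZE_BYTES - 1) / REG_SIZE_BYTES;
}
```

With these assumptions, a 4-byte type at SIMD16 spans two registers, while a 2-byte type such as UW spans exactly one, which matches the symptom described in the message.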
* nvc0: rework nvc0_compute_validate_program() | Samuel Pitoiset | 2016-02-26 | 6 | -44/+20
| Reduce the amount of duplicated code by re-using nvc0_program_validate(). While we are at it, change the prototype to return void and remove nvc0_compute.h which is now useless.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Acked-by: Pierre Moreau <[email protected]>
| Acked-by: Ilia Mirkin <[email protected]>
* nvc0: make sure to validate compute global buffers on Fermi | Samuel Pitoiset | 2016-02-26 | 1 | -1/+3
| There is no reason not to validate those global buffers, and this might avoid failures if someone tries to use the global memory from compute programs.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Acked-by: Pierre Moreau <[email protected]>
| Acked-by: Ilia Mirkin <[email protected]>
* nvc0: move nvc0_validate_global_residents() to nvc0_compute.c | Samuel Pitoiset | 2016-02-26 | 4 | -19/+17
| While we are at it, rename it to nvc0_compute_validate_globals() and update its prototype.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Acked-by: Pierre Moreau <[email protected]>
| Acked-by: Ilia Mirkin <[email protected]>
* egl/wayland: Try to use wl_surface.damage_buffer for SwapBuffersWithDamage | Derek Foreman | 2016-02-26 | 1 | -3/+36
| Since commit d1314de293e9e4a63c35f094c3893aaaed8580b4 we ignore damage passed to SwapBuffersWithDamage.
|
| Wayland 1.10 now has functionality that allows us to properly process those damage rectangles, and a way to query if it's available.
|
| Now we can use wl_surface.damage_buffer and interpret the incoming damage as being in buffer co-ordinates.
|
| Cc: "11.1 11.2" <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Pekka Paalanen <[email protected]>
| Signed-off-by: Derek Foreman <[email protected]>
* virgl: add missing CAP turned off. | Dave Airlie | 2016-02-26 | 1 | -0/+3
|
* program: Remove extra reference_program() | Miklós Máté | 2016-02-25 | 1 | -2/+0
| It was already done in get_mesa_program()
|
| Signed-off-by: Marek Olšák <[email protected]>
* automake: add nine to make distcheck | Emil Velikov | 2016-02-25 | 1 | -0/+1
| Will allow us to catch/prevent issues, like the one in mesa 11.2.0-rc1.
|
| Cc: "11.1 11.2" <[email protected]>
| Signed-off-by: Emil Velikov <[email protected]>
* st/nine: don't forget to bundle the nine_limits.h file | Emil Velikov | 2016-02-25 | 1 | -0/+1
| Without this mesa 11.2.0-rc1 ended up busted :-(
|
| Cc: "11.2" <[email protected]>
| Reported-by: Ondřej Súkup <[email protected]>
| Signed-off-by: Emil Velikov <[email protected]>
* i965/fs: Allow saturate propagation to propagate negations into MADs. | Matt Turner | 2016-02-25 | 1 | -0/+4
| Allows us to transform
|
|    mad res src0 src1 src2
|    mov.sat dst -res
|
| into
|
|    mad.sat dst -src0 -src1 src2
|
| instructions in affected programs: 3712 -> 3688 (-0.65%)
| helped: 24
|
| Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Allow saturate propagation to propagate negations into ADDs. | Matt Turner | 2016-02-25 | 2 | -4/+52
| Allows us to transform
|
|    add res src0 src1
|    mov.sat dst -res
|
| into
|
|    add.sat dst -src0 -src1
|
| No shader-db changes.
|
| Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Allow saturate propagation to propagate negations into MULs. | Matt Turner | 2016-02-25 | 2 | -3/+137
| Allows us to transform
|
|    mul res src0 src1
|    mov.sat dst -res
|
| into
|
|    mul.sat dst src0 -src1
|
| instructions in affected programs: 45246 -> 45054 (-0.42%)
| helped: 162
|
| Reviewed-by: Ian Romanick <[email protected]>
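The three saturate-propagation commits for MAD, ADD and MUL rest on the same observation: negating the result of the arithmetic gives the same value as pushing the negation into the sources, so the separate `mov.sat` can be folded away. A minimal sketch in plain C follows. The function names are hypothetical, `sat` models the saturate modifier as a clamp to [0, 1], and the MAD operand order `src0 + src1 * src2` is an assumption inferred from which operands the commit negates.

```c
#include <assert.h>

/* Models the .sat modifier as a clamp to [0, 1]. */
static float
sat(float x)
{
   if (x < 0.0f) return 0.0f;
   if (x > 1.0f) return 1.0f;
   return x;
}

/* mul res a b ; mov.sat dst -res */
static float mul_before(float a, float b) { float res = a * b; return sat(-res); }
/* mul.sat dst a -b */
static float mul_after(float a, float b) { return sat(a * -b); }

/* add res a b ; mov.sat dst -res */
static float add_before(float a, float b) { float res = a + b; return sat(-res); }
/* add.sat dst -a -b */
static float add_after(float a, float b) { return sat(-a + -b); }

/* mad res a b c ; mov.sat dst -res  (assumed: res = a + b * c) */
static float mad_before(float a, float b, float c) { float res = a + b * c; return sat(-res); }
/* mad.sat dst -a -b c */
static float mad_after(float a, float b, float c) { return sat(-a + -b * c); }
```

For a multiply only one source is negated, while for an add both sources are; the MAD case negates the addend and one factor of the product.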
* i965/fs: Don't CSE negated multiplies with saturation. | Matt Turner | 2016-02-25 | 1 | -0/+2
| It's not correct to CSE these multiplies
|
|    mul.sat dst1, -a, b
|    mul.sat dst2, a, b
|
| by emitting a negated MOV from dst1 to dst2:
|
|    mul.sat dst1, -a, b
|    mov     dst2, -dst1
|
| Take 2.0*2.0 for example. The first multiply would produce 0.0 and the second would produce 1.0.
|
| Fixes bad generated code in 18 to 22 shaders:
|
| instructions in affected programs: 432 -> 464 (7.41%)
| helped: 4
| HURT: 18
|
| Cc: [email protected]
| Reviewed-by: Ian Romanick <[email protected]>
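The 2.0*2.0 counterexample from this message can be checked with a few lines of C. This is a sketch with hypothetical function names, modelling `.sat` as a clamp to [0, 1]; it is not code from the compiler.

```c
#include <assert.h>

/* Models the .sat modifier as a clamp to [0, 1]. */
static float
sat(float x)
{
   if (x < 0.0f) return 0.0f;
   if (x > 1.0f) return 1.0f;
   return x;
}

/* mul.sat dst1, -a, b */
static float
mul_sat_negated(float a, float b)
{
   return sat(-a * b);
}

/* mul.sat dst2, a, b */
static float
mul_sat_plain(float a, float b)
{
   return sat(a * b);
}

/* The incorrect CSE result: mov dst2, -dst1 */
static float
cse_replacement(float a, float b)
{
   return -mul_sat_negated(a, b);
}
```

With a = b = 2.0 the negated multiply clamps -4.0 to 0.0, so negating it cannot recover the 1.0 that the plain saturated multiply produces. The clamp discards the magnitude before the negation is applied.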
* glsl: Consider ubo_load to be a horizontal operation. | Matt Turner | 2016-02-25 | 1 | -0/+1
| Unclear to me whether it actually is a horizontal operation that cannot be vectorized, but the fact that i965 generates the same code in either case makes me less interested in finding out.
|
| Cc: [email protected]
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94199
| Reviewed-by: Kenneth Graunke <[email protected]>
* glsl/ast: Implicit conversion from double to float is not allowed | Andres Gomez | 2016-02-25 | 1 | -4/+3
| Also, renamed get_conversion_operation to avoid future misunderstandings.
|
| Signed-off-by: Andres Gomez <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
* gallium/radeon: return correct values for BE in r600_translate_colorswap | Oded Gabbay | 2016-02-25 | 1 | -4/+4
| Because I changed the swizzle check, I also need to adapt the return values for each check. It's basically almost the same as before, we just cross between STD and STD_REV, and cross between ALT and ALT_REV.
|
| This fixes the rgba test in gl-1.0-readpixsanity (piglit) and also fixes tri-flat (mesa demos).
|
| Signed-off-by: Oded Gabbay <[email protected]>
| Cc: "11.1 11.2" <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* gallium: remove duplicate define from enum pipe_format | Oded Gabbay | 2016-02-25 | 1 | -1/+0
| Signed-off-by: Oded Gabbay <[email protected]>
| Reviewed-by: Thomas Helland <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* glsl: Detect do-while-false loops and unroll them | Ian Romanick | 2016-02-24 | 1 | -4/+26
| Previously loops like
|
|    do {
|       // ...
|    } while (false);
|
| that did not have any other loop-branch instructions would not be unrolled. This is commonly used to wrap multiline preprocessor macros.
|
| This produces IR like
|
|    (loop (
|       ...
|       break
|    ))
|
| Since limiting_terminator was NULL, the loop unroller would throw up its hands and say, "I don't know how many iterations. How can I unroll this?"
|
| We can detect this another way. If there is no limiting_terminator and the only loop-branch is a break as the last IR, there's only one iteration.
|
| On my very old checkout of shader-db, this removes a loop from Orbital Explorer, but it does not otherwise affect the shader. The loop removed is the one the compiler inserts surrounding the switch statement.
|
| This change does prevent some seriously bad code generation in some patches to meta shaders that I recently sent out for review.
|
| Signed-off-by: Ian Romanick <[email protected]>
| Reviewed-by: Timothy Arceri <[email protected]>
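The detection rule described in this message (no limiting terminator, and the only loop-branch is a `break` as the last instruction) can be illustrated on a toy representation. Everything below is hypothetical: the `toy_` types are invented for this sketch, the loop body is modelled as a flat list with no nesting, and none of it reflects the compiler's real IR classes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction kinds for a flat loop body. */
enum toy_ir_kind { TOY_IR_OTHER, TOY_IR_BREAK, TOY_IR_CONTINUE };

struct toy_loop {
   const enum toy_ir_kind *body;   /* instructions in program order */
   size_t length;
   bool has_limiting_terminator;   /* stand-in for limiting_terminator != NULL */
};

/* Returns true when the rule from the commit message applies:
 * no limiting terminator, no loop-branch before the last instruction,
 * and the last instruction is a break. */
static bool
toy_loop_runs_once(const struct toy_loop *loop)
{
   /* Loops with a limiting terminator are outside the scope of this check. */
   if (loop->has_limiting_terminator || loop->length == 0)
      return false;

   for (size_t i = 0; i + 1 < loop->length; i++) {
      if (loop->body[i] != TOY_IR_OTHER)
         return false;
   }

   return loop->body[loop->length - 1] == TOY_IR_BREAK;
}
```

A body of ordinary instructions followed by a single trailing `break` passes the check, while an earlier `break` or `continue` makes the iteration count unknown to this simple test.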
* i965: Enable tiled mem_copy with sRGB-formatted resources | Nanley Chery | 2016-02-24 | 1 | -2/+6
| RGBA8 and BGRA8 unorm formats are compatible with the various mem_copy functions. Their sRGB counterparts are also compatible because they're also color-renderable (of importance when the specified resource is a readbuffer) and they share the same physical layout.
|
| Signed-off-by: Nanley Chery <[email protected]>
| Reviewed-by: Anuj Phogat <[email protected]>
* mesa: replace for loop with bitshifting in supported_buffer_bitmask() | Brian Paul | 2016-02-24 | 1 | -4/+1
| Reviewed-by: Rob Clark <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* mesa: updates some comments in buffers.c | Brian Paul | 2016-02-24 | 1 | -3/+6
| Reviewed-by: Rob Clark <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* mesa: make _mesa_draw_buffers() static | Brian Paul | 2016-02-24 | 2 | -11/+7
| Reviewed-by: Rob Clark <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* mesa: make _mesa_draw_buffer() static | Brian Paul | 2016-02-24 | 2 | -9/+6
| Reviewed-by: Rob Clark <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* mesa: make _mesa_read_buffer() static | Brian Paul | 2016-02-24 | 2 | -10/+7
| Not called from any other file. Remove _mesa_ prefix and update comments.
|
| Reviewed-by: Rob Clark <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* mesa: move declaration of buffer var in handle_first_current() | Brian Paul | 2016-02-24 | 1 | -2/+4
| Declare the var in the scopes where it's used.
|
| Reviewed-by: Rob Clark <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* mesa: use gl_buffer_index in a few places | Brian Paul | 2016-02-24 | 3 | -5/+6
| Reviewed-by: Rob Clark <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* st/mesa: remove useless break statement | Brian Paul | 2016-02-24 | 1 | -1/+0
| Reviewed-by: Rob Clark <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* st/mesa: rename st_readpixels to st_ReadPixels | Brian Paul | 2016-02-24 | 1 | -2/+2
| To match the convention of other device driver functions.
|
| Reviewed-by: Rob Clark <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* st/mesa: fix frontbuffer glReadPixels regressions | Brian Paul | 2016-02-24 | 1 | -2/+11
| The change "mesa/readpix: Don't clip in _mesa_readpixels()" caused a few piglit regressions. The failing tests use glReadPixels to read from the front color buffer. The problem is we were trying to read from a nonexistent front color buffer. The front color buffer is created on demand in st/mesa. Since the missing buffer bounds were effectively 0 x 0 the glReadPixels was totally clipped and returned early.
|
| The fix involves creating the real front color buffer when we're about to try reading from it.
|
| Tested with llvmpipe and VMware driver on Linux, Windows.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94253
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94254
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94257
| Cc: [email protected]
| Reviewed-by: Roland Scheidegger <[email protected]>
* gallium/radeon: Correctly translate colorswaps for big endian | Oded Gabbay | 2016-02-23 | 1 | -0/+11
| The current code in r600_translate_colorswap uses the swizzle information to determine which colorswap to use.
|
| This works for BE & LE when the nr_channels is <4, but when nr_channels==4 (e.g. PIPE_FORMAT_A8R8G8B8_UNORM), this method can not be used for both BE and LE, because the swizzle info is the same for both of them.
|
| As a result, r600g doesn't support 24bit color formats, only 16bit, which forces the user to choose 16bit color in X server.
|
| This patch fixes this bug by separating the checks for LE and BE and adapting the swizzle conditions in the BE part of the checks.
|
| Tested on an Evergreen GPU (Cedar GL FirePro 2270) running inside POWER7 Big-Endian Machine.
|
| Signed-off-by: Oded Gabbay <[email protected]>
| CC: "11.2" "11.1" <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* mesa: use sizeof on the correct type | Thomas Hindoe Paaboel Andersen | 2016-02-23 | 1 | -1/+1
| Before the luminance stride was based on the size of GL_FLOAT which is just the type constant (0x1406). Change it to use the size of GLfloat.
|
| Reviewed-by: Brian Paul <[email protected]>
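The difference between taking `sizeof` of the constant and of the type can be shown in isolation. The sketch below defines local stand-ins for `GL_FLOAT` and `GLfloat` so that it compiles without the OpenGL headers; the two wrapper functions are hypothetical and exist only to make the values inspectable.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the OpenGL definitions, so the sketch is self-contained. */
#define GL_FLOAT 0x1406
typedef float GLfloat;

/* sizeof applied to the enum constant: 0x1406 is an int literal,
 * so this yields sizeof(int), whatever GLfloat happens to be. */
static size_t
size_from_constant(void)
{
   return sizeof(GL_FLOAT);
}

/* sizeof applied to the type actually stored in the buffer. */
static size_t
size_from_type(void)
{
   return sizeof(GLfloat);
}
```

On platforms where `int` and `float` have the same size the two expressions evaluate to the same number, so the old code would have computed a usable stride there by coincidence rather than by intent.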
* tgsi/scan: handle holes between VS inputs, assert-fail in other cases | Marek Olšák | 2016-02-23 | 1 | -1/+9
| "st/mesa: overhaul vertex setup for clearing, glDrawPixels, glBitmap" added a vertex shader declaring IN[0] and IN[2], but not IN[1]. Drivers relying on tgsi_shader_info can't handle holes in declarations, because tgsi_shader_info doesn't track that.
|
| This is just a quick workaround meant for stable that will work for vertex shaders.
|
| This fixes radeonsi DrawPixels and CopyPixels crashes.
|
| Cc: [email protected]
| Reviewed-by: Brian Paul <[email protected]>
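One way to tolerate a hole such as IN[0] and IN[2] without IN[1] is to size the input list by the highest declared index rather than by the number of declarations. The sketch below illustrates that idea only; `input_slots_needed` is a hypothetical helper, and the commit message does not say that the patch works this way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: given the indices of the declared inputs,
 * return how many slots are needed so that every declared index is
 * addressable, including any undeclared holes in between. */
static unsigned
input_slots_needed(const unsigned *declared, size_t count)
{
   unsigned slots = 0;

   for (size_t i = 0; i < count; i++) {
      if (declared[i] + 1 > slots)
         slots = declared[i] + 1;
   }

   return slots;
}
```

Counting declarations would report two inputs for the shader in question, whereas the highest-index rule reports three slots, leaving slot 1 present but unused.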
* docs: Mark off GL_OES_shader_image_atomic as done. | Francisco Jerez | 2016-02-22 | 2 | -1/+2
| Reviewed-by: Ilia Mirkin <[email protected]>
* i965/fs: Return result of image atomic in a register of the expected type. | Francisco Jerez | 2016-02-22 | 1 | -1/+1
| So the result is of float type if we're implementing the float overload of imageAtomicExchange. This is the only back-end change required to support OES_shader_image_atomic AFAICT.
|
| Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: Implement the required built-in functions when OES_shader_image_atomic is enabled. | Francisco Jerez | 2016-02-22 | 1 | -18/+43
| This is basically just the same atomic functions exposed by ARB_shader_image_load_store, with one exception:
|
|    "highp float imageAtomicExchange( coherent IMAGE_PARAMS, float data);"
|
| There's no float atomic exchange overload in the original ARB_shader_image_load_store or GL 4.2, so this seems like new functionality that requires specific back-end support and a separate availability condition in the built-in function generator.
|
| v2: Move image availability predicate logic into a separate static function for clarity. Had to pull out the image_function_flags enum from the builtin_builder class for that to be possible.
|
| Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: Add usual extension boilerplate for OES_shader_image_atomic. | Francisco Jerez | 2016-02-22 | 3 | -0/+6
| v2: No need for extension enable bits (Ilia).
|
| Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: Add extension table entry for OES_shader_image_atomic. | Francisco Jerez | 2016-02-22 | 1 | -0/+1
| v2: No need for extension enable bits (Ilia).
|
| Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: rename 3d binding points to NVC0_BIND_3D_XXX | Samuel Pitoiset | 2016-02-22 | 9 | -63/+64
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: rename 3d dirty flags to NVC0_NEW_3D_XXX | Samuel Pitoiset | 2016-02-22 | 8 | -133/+133
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: prefix compute macros with _CP_ instead of _COMPUTE_ | Samuel Pitoiset | 2016-02-22 | 4 | -4/+4
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: rename NVXX_COMPUTE to NVXX_CP | Samuel Pitoiset | 2016-02-22 | 5 | -117/+117
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: rename nvc0_context::dirty to nvc0_context::dirty_3d | Samuel Pitoiset | 2016-02-22 | 8 | -64/+64
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add missing emission of locked load predicate | Samuel Pitoiset | 2016-02-22 | 1 | -0/+7
| Like unlocked store on shared memory, locked store can fail and the second dest which is a predicate must be emitted.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
| Cc: [email protected]
* nvc0/ir: add ld lock/st unlock emission on GK104 | Samuel Pitoiset | 2016-02-22 | 1 | -10/+25
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>