path: root/src
Commit message | Author | Age | Files | Lines
* svga: return true for the PIPE_CAP_SM3 query | Brian Paul | 2013-11-07 | 1 | -1/+3
  This just tells the state tracker to turn on the GL_ARB_shader_texture_lod extension. This simply allows the GLSL compiler to emit TXL and TXD instructions for both vertex and fragment shaders. We already support these opcodes in the svga driver. Though, the shadow2DGrad() Piglit tests are failing.
  Reviewed-by: José Fonseca <[email protected]>
* i965: Add an implementation of intel_miptree_map using streaming loads. | Matt Turner | 2013-11-07 | 1 | -0/+85
  Improves performance of RoboHornet's 2D Canvas toDataURL benchmark [http://www.robohornet.org/#e=canvastodataurl] by approximately 5x on Baytrail on ChromiumOS.
  Elapsed time drops by -81.4861% +/- 1.22619% (n=3 s=14.9105, confidence=95%).
  Reviewed-by: Chad Versace <[email protected]>
* mesa: Add a streaming load memcpy implementation. | Matt Turner | 2013-11-07 | 3 | -1/+127
  Uses SSE 4.1's MOVNTDQA instruction (streaming load) to read from uncached memory without polluting the cache.
  Reviewed-by: Chad Versace <[email protected]>
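The streaming-load idea in this commit can be sketched as follows. This is a minimal illustration, not the code the commit added: the function names `copy_streaming_sse41` and `sketch_streaming_load_memcpy` are hypothetical, and the sketch assumes 16-byte-aligned pointers and a length that is a multiple of 16. The only real API used is the SSE4.1 intrinsic `_mm_stream_load_si128`, which compiles to MOVNTDQA. On ordinary write-back memory the instruction behaves like a normal load, so any benefit is expected only when the source is write-combining or uncached memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <smmintrin.h>   /* SSE4.1: _mm_stream_load_si128 (MOVNTDQA) */

/* Hypothetical helper: copies len bytes in 16-byte chunks using
 * streaming loads. Built for SSE4.1 regardless of global flags. */
__attribute__((target("sse4.1")))
static void copy_streaming_sse41(void *dst, const void *src, size_t len)
{
   __m128i *d = (__m128i *)dst;
   __m128i *s = (__m128i *)src;   /* some headers declare a non-const parameter */

   for (size_t i = 0; i < len / 16; i++) {
      __m128i v = _mm_stream_load_si128(s + i);   /* streaming load */
      _mm_store_si128(d + i, v);                  /* ordinary aligned store */
   }
}
#endif

/* Hypothetical entry point: uses streaming loads when the CPU reports
 * SSE4.1 and falls back to memcpy() otherwise. */
static void sketch_streaming_load_memcpy(void *dst, const void *src, size_t len)
{
   assert((uintptr_t)dst % 16 == 0);
   assert((uintptr_t)src % 16 == 0);
   assert(len % 16 == 0);

#if defined(__x86_64__) || defined(__i386__)
   if (__builtin_cpu_supports("sse4.1")) {
      copy_streaming_sse41(dst, src, len);
      return;
   }
#endif
   memcpy(dst, src, len);
}
```

The runtime check keeps the sketch usable on CPUs without SSE4.1; a driver could equally make that decision once at context creation.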
* i965: Fix 'SIMD16 only' dispatch of fragment shader in case of sample shading | Anuj Phogat | 2013-11-07 | 2 | -14/+25
  This patch makes changes to correctly set up the Dispatch GRF Start Register in case of 'SIMD16 only' FS dispatch. This fixes an issue of incorrect rendering on dolphin emulator with GL_SAMPLE_SHADING enabled.
  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Enable ARB_vertex_type_10f_11f_11f_rev on Gen6+. | Chris Forbes | 2013-11-08 | 1 | -0/+1
  This theoretically works on earlier hardware as well, but the extension requires at least GL3.0.
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: add support for UNSIGNED_INT_10F_11F_11F_REV vertex attribs | Chris Forbes | 2013-11-08 | 1 | -0/+2
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* vbo: add 10_11_11 support to vbo_attrib_tmp | Chris Forbes | 2013-11-08 | 1 | -6/+26
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* mesa: Add support to _mesa_bytes_per_vertex_attrib for 10_11_11 format. | Chris Forbes | 2013-11-08 | 1 | -0/+5
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* mesa: add varray support for UNSIGNED_INT_10F_11F_11F_REV type | Chris Forbes | 2013-11-08 | 1 | -3/+17
  V2: fix interaction with VertexAttribFormat, since that landed after this was originally written
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* mesa: Add extension scaffolding for ARB_vertex_type_10f_11f_11f_rev | Chris Forbes | 2013-11-08 | 2 | -0/+2
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* draw,llvmpipe,util: add depth bias calculation for arb_depth_buffer_float | Matthew McClure | 2013-11-07 | 14 | -56/+182
  With this patch, the llvmpipe and draw modules will calculate the depth bias according to floating point depth buffer semantics described in the arb_depth_buffer_float specification, when the driver has a z buffer bound with a format type of UTIL_FORMAT_TYPE_FLOAT.
  By default, the driver will use the existing UNORM calculation for depth bias.
  A new function, draw_set_zs_format, was added to calculate the Minimum Resolvable Depth value and floating point depth sense for the draw module.
  Reviewed-by: Jose Fonseca <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
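The difference between the two depth bias calculations can be sketched as below. This is an illustration under stated assumptions, not the code from the patch: all function names are hypothetical, the float case assumes a 32-bit float depth buffer (23 mantissa bits) and takes the minimum resolvable depth as 2^(e - 23), where e is the exponent of the largest depth value in the primitive, and the UNORM case assumes 1 / (2^n - 1) for an n-bit buffer. The bias itself follows the usual OpenGL polygon offset form, units * r + factor * max_slope.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: minimum resolvable depth for a 32-bit float depth
 * buffer, 2^(e - 23), with e the unbiased exponent of max_z. */
static float sketch_mrd_float(float max_z)
{
   uint32_t bits;
   memcpy(&bits, &max_z, sizeof bits);

   /* Require a positive value whose exponent leaves room for -23. */
   assert((bits >> 31) == 0 && ((bits >> 23) & 0xff) > 23);

   uint32_t exponent = bits & 0x7f800000u;          /* keep exponent field only */
   uint32_t mrd_bits = exponent - (23u << 23);      /* subtract 23 from exponent */

   float mrd;
   memcpy(&mrd, &mrd_bits, sizeof mrd);
   return mrd;
}

/* Hypothetical: constant minimum resolvable depth for an n-bit
 * UNORM depth buffer (assumed 1 / (2^n - 1)). */
static double sketch_mrd_unorm(unsigned depth_bits)
{
   assert(depth_bits >= 1 && depth_bits <= 32);
   return 1.0 / (double)((1ull << depth_bits) - 1);
}

/* Polygon offset: units scaled by the resolvable depth, plus the
 * slope-scaled term. */
static float sketch_depth_bias(float mrd, float units, float factor,
                               float max_slope)
{
   return units * mrd + factor * max_slope;
}
```

The float variant depends on the primitive's depth range, which is why it cannot be a per-format constant the way the UNORM value is.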
* i965: Avoid flushing the batch for every blorp op. | Eric Anholt | 2013-11-07 | 4 | -17/+50
  This brings over the batch-wrap-prevention and aperture space checking code from the normal brw_draw.c path, so that we don't need to flush the batch every time.
  There's a risk here if the intel_emit_post_sync_nonzero_flush() call isn't high enough up in the state emit sequences -- before, we implicitly had one at the batch flush before any state was emitted, so Mesa's workaround emits didn't really matter. Since the SNB fixes by Ken, I didn't see any regressions after 3 piglit runs.
  Improves cairo-gl performance by 13.7733% +/- 1.74876% (n=30/32)
  Improves minecraft apitrace performance by 1.03183% +/- 0.482297% (n=90).
  Reduces low-resolution GLB 2.7 performance by 1.17553% +/- 0.432263% (n=88)
  Reduces Lightsmark performance by 3.70246% +/- 0.322432% (n=126)
  No statistically significant performance difference on unigine tropics (n=10)
  No statistically significant performance difference on openarena (n=755)
  The two apps that are hurt happen to include stalls on busy buffer objects, so I think this is an effect of missing out on an opportune flush.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* build: Build gen_matypes and matypes.h from src/mesa. | Matt Turner | 2013-11-07 | 5 | -103/+15
  Reviewed-by: Eric Anholt <[email protected]>
* build: Change HAVE_X86_ASM to mean x86 or x86-64 asm. | Matt Turner | 2013-11-07 | 4 | -8/+12
  I want a conditional that says generally "we have x86 assembly" in the next patch.
  Reviewed-by: Eric Anholt <[email protected]>
* mesa: Enable ARB_vertex_attrib_binding | Fredrik Höglund | 2013-11-07 | 1 | -0/+1
  Reviewed-by: Eric Anholt <[email protected]>
* mesa: Optimize rebinding the same VBO | Fredrik Höglund | 2013-11-07 | 1 | -2/+5
  Check if the new buffer object has the same name as the current buffer object before looking it up.
  Reviewed-by: Eric Anholt <[email protected]>
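The early-out described in this commit can be sketched as follows. The types and functions here (`sketch_buffer`, `sketch_context`, `sketch_lookup`, `sketch_bind`) are hypothetical stand-ins rather than Mesa's structures, and a linear search stands in for whatever name lookup the real code performs. The `lookups` counter exists only to make the saved work observable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer object identified by a GL-style name. */
struct sketch_buffer {
   unsigned name;
};

/* Hypothetical context: the current binding plus a name table. */
struct sketch_context {
   struct sketch_buffer *bound;
   struct sketch_buffer *table;
   size_t table_len;
   unsigned lookups;   /* counts how often the table was searched */
};

/* Stand-in for the (comparatively expensive) name lookup. */
static struct sketch_buffer *
sketch_lookup(struct sketch_context *ctx, unsigned name)
{
   ctx->lookups++;
   for (size_t i = 0; i < ctx->table_len; i++) {
      if (ctx->table[i].name == name)
         return &ctx->table[i];
   }
   return NULL;
}

/* Bind by name, skipping the lookup when the name is already bound. */
static void sketch_bind(struct sketch_context *ctx, unsigned name)
{
   assert(ctx != NULL);

   if (ctx->bound != NULL && ctx->bound->name == name)
      return;   /* same buffer: nothing to look up or change */

   ctx->bound = sketch_lookup(ctx, name);
}
```

Binding the same name twice in a row performs a single lookup in this sketch; only a change of name triggers another search.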
* mesa: Handle zero-stride arrays in _mesa_update_array_max_element() | Fredrik Höglund | 2013-11-07 | 1 | -2/+4
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add Get* support for ARB_vertex_attrib_binding | Fredrik Höglund | 2013-11-07 | 3 | -0/+38
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add ARB_vertex_attrib_binding | Fredrik Höglund | 2013-11-07 | 16 | -125/+691
  update_array() and update_array_format() are changed to update the new attrib and binding states, and the client arrays become derived state.
  Reviewed-by: Eric Anholt <[email protected]>
* glapi: Add infrastructure for ARB_vertex_attrib_binding | Fredrik Höglund | 2013-11-07 | 6 | -7/+136
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* mesa: Make handle_bind_buffer_gen() non-static | Fredrik Höglund | 2013-11-07 | 2 | -11/+22
  ...and rename it to _mesa_bind_buffer_gen().
  This is so the function can be called from _mesa_BindVertexBuffer().
  This patch also adds a caller parameter so we can report the right entry point in error messages.
  Based on a patch by Eric Anholt.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* mesa: Rename gl_array_object::VertexAttrib to _VertexAttrib | Fredrik Höglund | 2013-11-07 | 13 | -134/+134
  This will become derived state as part of the ARB_vertex_attrib_binding support.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* mesa: Split out the format code from update_array() | Fredrik Höglund | 2013-11-07 | 1 | -57/+93
  Split out the code for updating the array format into a new function called update_array_format(). This function will be called by both update_array() and the new glVertexAttrib*Format() entry points in ARB_vertex_attrib_binding.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* mesa: Restore gl_array_object::NewArray | Fredrik Höglund | 2013-11-07 | 4 | -0/+10
  This will be used by the ARB_vertex_attrib_binding implementation.
  This reverts commit db38e9a0e179441f59274f6f2a751912c29872e2.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965: Use has_surface_tile_offset in depth/stencil alignment workaround. | Kenneth Graunke | 2013-11-07 | 1 | -2/+2
  Currently, has_surface_tile_offset is equivalent to gen == 4 && !is_g4x. We already use it for related checks in brw_wm_surface_state.c, so it makes sense to use it here too. It's simpler and more future-proof.
  Broadwell also lacks surface tile offsets. With this patch, I won't need to update any generation checking; I can simply not set the flag.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* gallium: fix build on GNU/kFreeBSD | Fabio Pedretti | 2013-11-06 | 2 | -2/+2
  Patch from Debian package
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Andreas Boll <[email protected]>
* mesa: add arm64 support | Fabio Pedretti | 2013-11-06 | 1 | -1/+1
  Patch from Ubuntu package
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Andreas Boll <[email protected]>
* r600/compute: silence unused var warning | Fabio Pedretti | 2013-11-06 | 1 | -1/+0
  Reviewed-by: Marek Olšák <[email protected]>
* i965/gen6: Don't allow SIMD16 dispatch in 4x PERPIXEL mode with computed depth. | Paul Berry | 2013-11-06 | 1 | -1/+33
  Hardware docs say we can only use SIMD8 dispatch in this condition.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* mesa: Build program as part of libmesa. | Matt Turner | 2013-11-06 | 2 | -53/+18
* mesa: Clean up use of top_srcdir/top_builddir. | Matt Turner | 2013-11-06 | 1 | -11/+4
* i965: Use unreachable() to silence a compiler warning. | Matt Turner | 2013-11-06 | 1 | -0/+1
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
* mesa: Add unreachable() macro. | Matt Turner | 2013-11-06 | 1 | -0/+15
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
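A macro of this kind, and the way it silences a "control reaches end of non-void function" warning, might look like the sketch below. It is not the macro the commit added: the name `SKETCH_UNREACHABLE` is hypothetical (chosen to avoid clashing with any `unreachable` a toolchain already defines), and the example enum and function are invented for illustration. The only compiler feature relied on is `__builtin_unreachable()`, which GCC and Clang provide.

```c
#include <assert.h>

/* Hypothetical macro: tell the compiler this point is never reached.
 * Compilers without the builtin get a no-op. */
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNREACHABLE() __builtin_unreachable()
#else
#define SKETCH_UNREACHABLE() ((void)0)
#endif

/* Invented example type. */
enum sketch_channel {
   SKETCH_X,
   SKETCH_Y,
   SKETCH_Z
};

/* Every enumerator returns from the switch, but the compiler cannot
 * prove the argument is always a valid enumerator, so without the
 * macro it may warn that control reaches the end of the function. */
static int sketch_channel_index(enum sketch_channel c)
{
   assert(c <= SKETCH_Z);

   switch (c) {
   case SKETCH_X: return 0;
   case SKETCH_Y: return 1;
   case SKETCH_Z: return 2;
   }

   SKETCH_UNREACHABLE();
}
```

Reaching the macro at run time is undefined behaviour with the builtin, so it is only appropriate where the surrounding logic genuinely rules the path out.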
* gallivm: fix indirect addressing of inputs | Roland Scheidegger | 2013-11-06 | 1 | -17/+28
  We weren't adding the soa offsets when constructing the indices for the gather functions. That meant that we were always returning the data in the first element. (Copied straight from the same fix for temps.)
  While here fix up a couple of broken comments in the fetch functions, plus don't name a straight float type float4 which is just confusing.
  Reviewed-by: Jose Fonseca <[email protected]>
  Reviewed-by: Zack Rusin <[email protected]>
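The role of the SoA offsets can be sketched with plain arrays. This is a scalar model under assumptions, not gallivm's code: it assumes inputs are laid out as `[register][channel][lane]` in one flat array, a SIMD width of 4, and hypothetical helper names. The point is the final `+ lane` term: without it, every lane would compute the index of lane 0 and read the first element's data, which matches the bug the commit describes.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_LANES 4   /* assumed SIMD width */
#define SKETCH_CHANS 4   /* x, y, z, w */

/* Build one gather index per lane for an indirectly addressed input.
 * reg_index[lane] is the register each lane wants to read. */
static void sketch_gather_indices(const int reg_index[SKETCH_LANES], int chan,
                                  int out[SKETCH_LANES])
{
   assert(chan >= 0 && chan < SKETCH_CHANS);

   for (int lane = 0; lane < SKETCH_LANES; lane++) {
      /* Start of the [register][channel] row in the flat array. */
      int row = (reg_index[lane] * SKETCH_CHANS + chan) * SKETCH_LANES;

      /* The SoA offset: lane i must read element i of that row. */
      out[lane] = row + lane;
   }
}

/* Gather one float per lane from the flat SoA array. */
static void sketch_gather(const float *base, const int idx[SKETCH_LANES],
                          float out[SKETCH_LANES])
{
   for (int lane = 0; lane < SKETCH_LANES; lane++)
      out[lane] = base[idx[lane]];
}
```

With `reg_index = {0, 1, 0, 1}` and channel 2, the rows start at 8 and 24, and the per-lane offsets spread the reads across 8, 25, 10 and 27 instead of collapsing them onto 8 and 24.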
* r600/llvm: Fix isampleBuffer on preEG | Vincent Lejeune | 2013-11-06 | 1 | -1/+14
* r600/llvm: Fix texbuf for pre EG gen | Vincent Lejeune | 2013-11-06 | 1 | -0/+29
* mesa: for GLSL_DUMP_ON_ERROR, also dump the info log | Brian Paul | 2013-11-06 | 1 | -0/+2
  Since it's helpful to know why the shader did not compile. Also, call fflush() for Windows.
  Reviewed-by: José Fonseca <[email protected]>
* st/vdpau: resolve delayed rendering for GL interop v2 | Grigori Goronzy | 2013-11-06 | 1 | -0/+4
  Otherwise OutputSurface interop has funny results sometimes. This fixes interop with the mpv media player.
  v2 (chk): add proper locking
  Signed-off-by: Christian König <[email protected]>
* i965/fs: Gen4-5: Implement alpha test in shader for MRT | Chris Forbes | 2013-11-06 | 3 | -0/+58
  V2: Add comment explaining what emit_alpha_test() is for; fix spurious temp and bogus whitespace.
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Gen4-5: Setup discard masks for MRT alpha test | Chris Forbes | 2013-11-06 | 2 | -2/+2
  The same setup is required here as when the user-provided shader explicitly uses KIL or discard.
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Gen4-5: Include alpha func/ref in program key | Chris Forbes | 2013-11-06 | 2 | -0/+18
  V2: Better explanation of the rationale for doing this.
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Gen4-5: Don't enable hardware alpha test with MRT | Chris Forbes | 2013-11-06 | 1 | -1/+2
  We have to do this in the shader instead, since these gens lack an independent RT0 alpha value in their render target write messages.
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Combine {brw,gen7}_update_texture_buffer_surface() functions. | Kenneth Graunke | 2013-11-05 | 3 | -40/+5
  Now that brw_update_texture_buffer_surface() uses the virtual emit_buffer_surface_state() function, it works for Gen7+ too.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965: Unvirtualize brw_create_constant_surface; delete Gen7+ variant. | Kenneth Graunke | 2013-11-05 | 4 | -45/+17
  Now that brw_create_constant_surface uses a virtual function internally, it doesn't need to be virtual itself. We can delete the Gen7+ variant and simplify things.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965: Use the new emit_buffer_surface_state() vtable entry. | Kenneth Graunke | 2013-11-05 | 1 | -10/+10
  This will allow us to combine the Gen4-6 and Gen7 variants of these functions.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965: Virtualize emit_buffer_surface_state(). | Kenneth Graunke | 2013-11-05 | 3 | -4/+20
  This entails adding "mocs" and "rw" parameters to the Gen4-5 version. I made it actually pay attention to the rw flag (even though it is always false), but mocs is always ignored.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965: Fix compiler warning. | Courtney Goeltzenleuchter | 2013-11-05 | 2 | -2/+2
  fix: intel_screen.c:1320:4: warning: initialization from incompatible pointer type [enabled by default]
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Tell the unit states how many binding table entries we have. | Eric Anholt | 2013-11-05 | 7 | -5/+22
  Before the series with 3c9dc2d31b80fc73bffa1f40a91443a53229c8e2 to dynamically assign our binding table indices, we didn't really track our binding table count per shader, so we never filled in these fields.
  Affects cairo-gl trace runtime by -2.47953% +/- 1.07281% (n=20)
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix context initialization after 2f896627175384fd5 | Eric Anholt | 2013-11-05 | 1 | -3/+6
  You can't return stack-initialized values and expect anything good to happen.
  Reviewed-by: Chad Versace <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* gallivm: optimize lp_build_minify for sse | Roland Scheidegger | 2013-11-05 | 3 | -13/+54
  SSE can't handle true vector shifts (with variable shift count), so llvm is turning them into a mess of extracts, scalar shifts and inserts. It is however possible to emulate them in lp_build_minify with float muls, which should be way faster (saves over 20 instructions per 8-wide lp_build_minify). This wouldn't work for "generic" 32bit shifts though since we've got only 24bits of mantissa (actually for left shifts it would work by using sse41 int mul instead of float mul but not for right shifts).
  Note that this has very limited scope for now, since this is only used with per-pixel lod (otherwise we're avoiding the non-constant shift count by doing per-quad shifts manually), and only 1d textures even then (though the latter should change).
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
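The float-multiply emulation of a right shift can be sketched in scalar C. This is a model of the idea rather than the gallivm code: the function names are hypothetical, and the sketch assumes minification means `max(size >> level, 1)` with sizes below 2^24 so that the integer is exactly representable in a float's 24-bit mantissa. The scale factor 2^-level is built directly from the IEEE 754 exponent field, `(127 - level) << 23`, so no variable shift of the size itself is needed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reference: minify with an ordinary variable shift. */
static int sketch_minify_shift(int size, int level)
{
   int r = size >> level;
   return r > 1 ? r : 1;
}

/* Same result via a float multiply by 2^-level. */
static int sketch_minify_fmul(int size, int level)
{
   assert(size >= 0 && size < (1 << 24));   /* must fit the mantissa */
   assert(level >= 0 && level < 32);

   /* Build the float 2^-level from its exponent bits alone. */
   uint32_t bits = (uint32_t)(127 - level) << 23;
   float scale;
   memcpy(&scale, &bits, sizeof scale);

   /* Multiplying by a power of two is exact here; the conversion
    * back to int truncates, which matches a right shift for
    * non-negative values. */
   int r = (int)((float)size * scale);
   return r > 1 ? r : 1;
}
```

In vector form the exponent trick needs only a constant shift by 23 plus a subtract, multiply and convert, all of which SSE provides per lane. The mantissa limit is the reason the commit message rules this out for generic 32-bit shifts.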