path: root/src
Commit message | Author | Age | Files | Lines
* gallivm: clear Altivec NJ bit | Adhemerval Zanella | 2012-11-29 | 1 | -0/+19
  This patch enforces clearing the NJ bit in the Altivec VSCR register so that denormal numbers are handled as expected by the IEEE standard.
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
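As a rough illustration of what IEEE-style denormal handling means, the following portable C sketch classifies a value as denormal (subnormal). It does not touch the VSCR register and is not the Mesa code; the helper names `is_denormal` and `half_of_smallest_normal` are hypothetical.

```c
#include <float.h>
#include <math.h>

/* Hypothetical helper: returns 1 if x is a denormal (subnormal) float.
 * With the NJ bit clear, AltiVec keeps such values as tiny non-zero
 * numbers; with the NJ bit set (non-Java mode) it flushes them to zero. */
static int is_denormal(float x)
{
    return fpclassify(x) == FP_SUBNORMAL;
}

/* Halving the smallest normal float yields a denormal under IEEE rules. */
static float half_of_smallest_normal(void)
{
    volatile float smallest = FLT_MIN;  /* volatile forces a run-time divide */
    return smallest / 2.0f;
}
```

On a target that flushes denormals to zero, `half_of_smallest_normal()` would return exactly 0.0f instead of a tiny non-zero value.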
* gallivm: Altivec floating-point rounding | Adhemerval Zanella | 2012-11-29 | 1 | -23/+70
  This patch adds Altivec intrinsics for float vector types. It changes the SSE-specific definitions to platform-neutral ones and adds the calls to the Altivec intrinsic builder.
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Altivec vector add/sub intrinsics | Adhemerval Zanella | 2012-11-29 | 2 | -15/+27
  This patch adds correct vector addition and subtraction intrinsics when using Altivec with PPC. The current code uses the default path, and the LLVM backend ends up issuing carry-out arithmetic instructions where saturated ones are expected. It also includes a fix for PowerPC, where char is unsigned by default, which resulted in bogus values for vector shifting.
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
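The two issues this message describes can be sketched in scalar C. This is only an illustration with hypothetical function names, not the gallivm code: it contrasts a wrapping (carry-out) add with a saturating add on one 8-bit lane, and shows why spelling out `signed char` avoids depending on the platform's default `char` signedness.

```c
#include <stdint.h>

/* Modular ("carry-out") add: the result wraps around on overflow. */
static uint8_t add_wrap_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Saturating add: the result clamps to the largest representable value. */
static uint8_t add_sat_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    return (uint8_t)(sum > UINT8_MAX ? UINT8_MAX : sum);
}

/* Plain char is signed on common x86 ABIs but unsigned on PowerPC Linux,
 * so a plain char holding -1 may widen to 255 there. An explicit
 * signed char widens to -1 on every platform. */
static int widen_signed(signed char c)
{
    return c;
}
```

For example, adding 200 and 100 gives 44 with the wrapping version and 255 with the saturating one.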
* gallivm: Altivec vector max/min intrinsics | Adhemerval Zanella | 2012-11-29 | 1 | -2/+54
  This patch adds the PPC Altivec max/min intrinsics for the supported Altivec vector types (16xi8, 8xi16, 4xi32, 4xf32).
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Altivec pack/unpack intrinsics | Adhemerval Zanella | 2012-11-29 | 1 | -14/+30
  This patch adds PPC Altivec support for pack/unpack operations using the Altivec-supported vector types (16xi8, 8xi16, 4xi32, 4xf32).
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
* radeonsi: Bitcast result of packf16 intrinsic to float for export intrinsic. | Michel Dänzer | 2012-11-29 | 1 | -1/+5
  Fixes 7 piglit tests, and prevents many more from crashing.
  Signed-off-by: Michel Dänzer <[email protected]>
  Reviewed-and-Tested-by: Christian König <[email protected]>
* i965/vs: Move struct brw_compile (p) entirely inside vec4_generator. | Kenneth Graunke | 2012-11-28 | 3 | -4/+3
  The brw_compile structure contains the brw_instruction store and the brw_eu_emit.c state tracking fields. These are only useful for the final assembly generation pass; the earlier compilation stages don't need them.
  This also means that the code generator for future hardware won't have access to the brw_compile structure, which is extremely desirable because it prevents accidental generation of Gen4-7 code.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* i965/vs: Split final assembly code generation out of vec4_visitor. | Kenneth Graunke | 2012-11-28 | 4 | -53/+106
  Compiling shaders requires several main steps:
    1. Generating VS IR from either GLSL IR or Mesa IR
    2. Optimizing the IR
    3. Register allocation
    4. Generating assembly code
  This patch splits out step 4 into a separate class named "vec4_generator." There are several reasons for doing so:
    1. Future hardware has a different instruction encoding. Splitting this out will allow us to replace vec4_generator (which relies heavily on the brw_eu_emit.c code and struct brw_instruction) with a new code generator that writes the new format.
    2. It reduces the size of the vec4_visitor monolith. (Arguably, a lot more should be split out, but that's left for "future work.")
    3. Separate namespaces allow us to make helper functions for generating instructions in both classes: ADD() can exist in vec4_visitor and create IR, while ADD() in vec4_generator() can create brw_instructions. (Patches for this upcoming.)
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* i965/vs: Abort on unsupported opcodes rather than failing. | Kenneth Graunke | 2012-11-28 | 1 | -3/+4
  Final code generation should never fail. This is a bug, and there should be no user-triggerable cases where this could occur.
  Also, we're not going to have a fail() method after the split.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* i965/vs: Move uses of brw_compile from do_vs_prog to brw_vs_emit. | Kenneth Graunke | 2012-11-28 | 3 | -14/+19
  The brw_compile structure is closely tied to the Gen4-7 hardware encoding. However, do_vs_prog is very generic: it just calls out to get a compiled program and then uploads it.
  This isn't ultimately where we want it, but it's a step in the right direction: it's now closer to the code generator.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* i965/vs: Rework memory contexts for shader compilation data. | Kenneth Graunke | 2012-11-28 | 5 | -8/+12
  During compilation, we allocate a bunch of things: the IR needs to last at least until code generation...and then the program store needs to last until after we upload the program.
  For simplicity's sake, just keep it all around until we upload the program. After that, it can all be freed.
  This will also save a lot of headaches during the upcoming refactoring.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* i965/vs: Pass the brw_context pointer into brw_compute_vue_map(). | Kenneth Graunke | 2012-11-28 | 1 | -3/+2
  We used to steal it out of the brw_compile struct, but that won't be initialized in time soon (and is eventually going away).
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* i965/vs: Pass the brw_context pointer into vec4_visitor and do_vs_prog. | Kenneth Graunke | 2012-11-28 | 5 | -9/+14
  We used to steal it out of the brw_compile struct...but vec4_visitor isn't going to have one of those in the future.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* i965/vs: Move some functions from brw_vec4_emit.cpp to brw_vec4.cpp. | Kenneth Graunke | 2012-11-28 | 2 | -263/+265
  This leaves only the final code generation stage in brw_vec4_emit.cpp, moving the payload setup, run(), and brw_vs_emit functions to brw_vec4.cpp.
  The fragment shader backend puts these functions in brw_fs.cpp, so this patch also helps with consistency.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* meta: Don't try to glOrtho when the draw buffer isn't initialized. | Kenneth Graunke | 2012-11-28 | 1 | -3/+5
  I ran across this while running a glGenerateMipmap() test. _meta_GenerateMipmap sets MESA_META_TRANSFORM, which causes _mesa_meta_begin to try to set a default orthographic projection.
  Unfortunately, if the drawbuffer isn't set up, ctx->DrawBuffer->Width and Height are 0, which just causes a GL_INVALID_VALUE error.
  Fixes oglconform's fbo/mipmap.automatic, mipmap.manual, and mipmap.manualIterateTexTargets.
  Reviewed-by: Brian Paul <[email protected]>
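The guard described here can be sketched as follows. This is a hedged illustration, not the actual patch: `struct draw_buffer` and `can_set_default_ortho` are hypothetical stand-ins, and the sketch assumes the default projection spans 0..width and 0..height, so that a zero dimension makes left equal right (or bottom equal top), which glOrtho rejects with GL_INVALID_VALUE.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the draw buffer; only the two fields the
 * commit message mentions are modelled. */
struct draw_buffer {
    unsigned width;
    unsigned height;
};

/* Returns true only when an orthographic projection over
 * [0, width] x [0, height] would have a non-empty extent. */
static bool can_set_default_ortho(const struct draw_buffer *fb)
{
    return fb != NULL && fb->width != 0 && fb->height != 0;
}
```

A caller would check `can_set_default_ortho()` first and simply skip the projection setup when it returns false.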
* st/mesa: allow forward-compatible contexts and set Const.ContextFlags | Marek Olšák | 2012-11-29 | 2 | -7/+8
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* st/mesa: add support for GL core profiles | Marek Olšák | 2012-11-29 | 2 | -1/+6
  The rest of the plumbing was in place already.
  I have tested this by turning on all GL 3.1 features. The drivers not supporting GL 3.1 will fail to create a core profile as they should.
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* util: add more memory debugging features | Brian Paul | 2012-11-28 | 2 | -1/+84
  Add a DEBUG_FREED_MEMORY option to help catch use-after-free errors.
  Add a debug_memory_check() function which can be called periodically to check that all known blocks are good.
  Reviewed-by: José Fonseca <[email protected]>
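The general technique behind such options can be sketched in a few lines of C. This is an assumption-laden illustration, not the Mesa implementation: the names `dbg_malloc`, `dbg_free`, and `dbg_check` and the magic values are hypothetical, and the check covers a single block rather than walking a list of all known blocks.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_MAGIC 0x1a2b3c4dU  /* marks a live block */
#define FREED_MAGIC 0xdeadbeefU  /* marks a freed block */
#define FREED_BYTE  0xdd         /* fill pattern for freed payloads */

/* Bookkeeping stored directly in front of each payload. */
struct block_header {
    uint32_t magic;
    size_t size;
};

static void *dbg_malloc(size_t size)
{
    struct block_header *hdr = malloc(sizeof *hdr + size);
    if (!hdr)
        return NULL;
    hdr->magic = BLOCK_MAGIC;
    hdr->size = size;
    return hdr + 1;  /* payload starts right after the header */
}

/* Returns 1 if ptr still looks like a live, uncorrupted block. */
static int dbg_check(const void *ptr)
{
    const struct block_header *hdr = (const struct block_header *)ptr - 1;
    return hdr->magic == BLOCK_MAGIC;
}

/* Poison the block instead of releasing it, so a later use-after-free
 * reads 0xdd bytes and a later dbg_check() fails. The memory is
 * deliberately leaked in this sketch. */
static void dbg_free(void *ptr)
{
    struct block_header *hdr = (struct block_header *)ptr - 1;
    memset(ptr, FREED_BYTE, hdr->size);
    hdr->magic = FREED_MAGIC;
}
```

Keeping freed blocks around trades memory for diagnosability, which is why this kind of option is normally limited to debug builds.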
* llvmpipe: Implement logic ops for the AoS path. | José Fonseca | 2012-11-28 | 1 | -1/+8
  It was forgotten in the previous patch series, but it is trivial to implement, based on the SoA path.
  This fixes glean logicOp failures.
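For readers unfamiliar with logic ops, a scalar sketch of the idea follows. It is not the llvmpipe code, which generates LLVM IR; the enum and function names are hypothetical, and only a few of the GL logic ops are modelled. Each pixel is assumed to be packed as four 8-bit channels in one 32-bit word, which is what makes the array-of-structures case a plain bitwise operation.

```c
#include <stdint.h>

/* Hypothetical subset of logic ops; names are illustrative only. */
enum logic_op { OP_COPY, OP_AND, OP_OR, OP_XOR, OP_INVERT };

/* Combine a source and a destination pixel bit by bit. */
static uint32_t apply_logic_op(enum logic_op op, uint32_t src, uint32_t dst)
{
    switch (op) {
    case OP_AND:    return src & dst;
    case OP_OR:     return src | dst;
    case OP_XOR:    return src ^ dst;
    case OP_INVERT: return ~dst;
    case OP_COPY:
    default:        return src;
    }
}
```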
* llvmpipe: Don't use dynamically sized arrays. | José Fonseca | 2012-11-28 | 1 | -4/+4
  Unfortunately, MSVC still considers arrays whose size is a constant variable to be dynamically sized.
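The underlying C rule can be shown with a small sketch. This is an assumption about the kind of change involved, not a copy of the patch; `NUM_CHANNELS` and `sum_channels` are hypothetical names.

```c
/* In C, a const-qualified variable is not a constant expression, so the
 * following would declare a variable-length array, which MSVC's C
 * compiler does not accept:
 *
 *     const unsigned n = 4;
 *     float v[n];
 */

/* An enum constant (or a #define) is a true constant expression. */
enum { NUM_CHANNELS = 4 };

static float sum_channels(const float *rgba)
{
    float tmp[NUM_CHANNELS];  /* fixed-size array */
    float sum = 0.0f;
    int i;

    for (i = 0; i < NUM_CHANNELS; i++) {
        tmp[i] = rgba[i];
        sum += tmp[i];
    }
    return sum;
}
```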
* i965/gen4-5: Fix segfaults with stencil-only depth/stencil setups. | Eric Anholt | 2012-11-28 | 1 | -1/+3
  Fixes a ton of piglit regressions since the depthstencil fixes for gen6+.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57309
  Reviewed-by: Chad Versace <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Don't generate saturates over existing variable values. | Eric Anholt | 2012-11-28 | 1 | -0/+1
  Fixes a crash in http://workshop.chromeexperiments.com/stars/ on i965, and the new piglit test glsl-fs-clamp-5.
  We were trying to emit a saturating move into a uniform, which the code generator appropriately choked on. This was broken by the change in 32ae8d3b321185a85b73ff703d8fc26bd5f48fa7.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57166
  NOTE: This is a candidate for the 9.0 branch.
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add some minimal backend-IR dumping. | Eric Anholt | 2012-11-28 | 2 | -0/+92
  Reviewed-by: Kenneth Graunke <[email protected]>
* llvmpipe: Update llvmpipe_is_format_unswizzled to reflect latest changes. | James Benton | 2012-11-28 | 1 | -9/+0
  Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: Enable vertex color clamping. | James Benton | 2012-11-28 | 1 | -1/+1
  Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: Unswizzled rendering. | James Benton | 2012-11-28 | 27 | -157/+1782
  Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Updated lp_build_const_mask_aos to input number of channels. | James Benton | 2012-11-28 | 7 | -20/+31
  Also updated lp_build_const_mask_aos_swizzled to reflect this.
  Reviewed-by: Jose Fonseca <[email protected]>
* util: Updated util_format_is_array to be more accurate. | James Benton | 2012-11-28 | 2 | -3/+16
  It now allows formats with padding, e.g. RGBX, and swizzled formats as long as alpha is channel 3.
  Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Added support for float to half-float conversion in lp_build_conv. | James Benton | 2012-11-28 | 2 | -7/+94
  Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Changed lp_build_pad_vector to correctly handle scalar argument. | James Benton | 2012-11-28 | 3 | -17/+22
  Removed the lp_type argument as it was unnecessary.
  Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Add a function to generate lp_type for a format. | James Benton | 2012-11-28 | 2 | -7/+31
  Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Add support for unorm16 in lp_build_mul. | James Benton | 2012-11-28 | 1 | -0/+45
  Reviewed-by: Jose Fonseca <[email protected]>
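As background on what a unorm16 multiply has to compute, here is a scalar C sketch. It is not the code lp_build_mul generates, and `mul_unorm16` is a hypothetical name; it uses a plain rounded division where a vectorized implementation would more likely use shifts and adds.

```c
#include <stdint.h>

/* Multiply two unorm16 values, where 0..65535 represents 0.0..1.0, and
 * return the product rounded to the nearest unorm16. */
static uint16_t mul_unorm16(uint16_t a, uint16_t b)
{
    /* At most 65535 * 65535, which still fits in 32 bits. */
    uint32_t product = (uint32_t)a * b;

    /* Adding just under half the divisor turns the truncating division
     * into round-to-nearest; 65535 is odd, so no exact ties occur. */
    return (uint16_t)((product + 32767u) / 65535u);
}
```

The key property is that multiplying by 65535 (that is, 1.0) returns the other operand unchanged, which a plain shift by 16 bits would not guarantee.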
* glcpp: Support #elif(expression) with no intervening space. | Matt Turner | 2012-11-28 | 7 | -1/+96
  And add test cases to ensure that this works:
    - 110 verifies that glcpp rejects #elif<digits> which glcpp previously accepted.
    - 111 verifies that glcpp accepts #if followed immediately by (, +, -, !, or ~.
    - 112 does the same as 111 but for #elif.
  See 17f9beb6 for #if change.
  Reviewed-by: Carl Worth <[email protected]>
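The directive form in question looks like this. The sketch uses the C preprocessor, which accepts the same spelling, as a stand-in for glcpp; the macro and function names are hypothetical.

```c
#define FEATURE_LEVEL 2

/* No space between the directive name and the parenthesised expression. */
#if(FEATURE_LEVEL == 1)
#define SELECTED_PATH 1
#elif(FEATURE_LEVEL == 2)
#define SELECTED_PATH 2
#else
#define SELECTED_PATH 0
#endif

/* Expose the branch the preprocessor picked. */
static int selected_path(void)
{
    return SELECTED_PATH;
}
```

With FEATURE_LEVEL defined as 2, the #elif branch is the one that gets selected.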
* glcpp: Reject #version and #line not followed by whitespace | Matt Turner | 2012-11-28 | 5 | -2/+8
  Fixes part of es3conform's preprocess16_frag test.
  Reviewed-by: Carl Worth <[email protected]>
* mesa: fix BlitFramebuffer between linear and sRGB formats | Marek Olšák | 2012-11-28 | 1 | -3/+39
  NOTE: This is a candidate for the stable branches.
  Reviewed-by: Brian Paul <[email protected]>
* gallivm: fix multiple lods with different min/mag filter and wide vectors | Roland Scheidegger | 2012-11-28 | 1 | -0/+3
  Broken since 529fe420ba6836479619ba42e53665724755fc1c: I forgot some code and only added the comment...
  Fixes bug 57644.
* radeonsi: Reinstate assertions against invalid colour/depth formats. | Michel Dänzer | 2012-11-28 | 1 | -0/+2
  radeonsi now supports Z16 and doesn't fail these assertions anymore.
  This partially reverts commit 7bba4879bb79719e22a18b52759b1d1d839c783c, but leaves the error messages in place to allow diagnosing such problems even with non-debugging builds.
  Signed-off-by: Michel Dänzer <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* radeonsi: Re-enable Z16 depth buffers. | Michel Dänzer | 2012-11-28 | 1 | -2/+2
  8 more piglits.
  Signed-off-by: Michel Dänzer <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* radeonsi: remove redundant parameter in r600_init_surface | Marek Olšák | 2012-11-28 | 1 | -5/+4
  [ Cherry-picked from r600g commit f5ac60152b10b04d38e77db6b904dd50d1a54d6c ]
* radeonsi: Use explicit stencil mipmap level offsets. | Michel Dänzer | 2012-11-28 | 2 | -7/+6
  Extracted from r600g commit 428e37c2da420f7dc14a2ea265f2387270f9bee1.
  Signed-off-by: Michel Dänzer <[email protected]>
* radeonsi: correct texture memory size for Z32F_S8X24 | Marek Olšák | 2012-11-28 | 1 | -7/+15
  [ Cherry-picked from r600g commit ea72351a919c594e7f40e901dca42aebb866f8a6 ]
* radeonsi: Depth/stencil fixes. | Michel Dänzer | 2012-11-28 | 2 | -8/+21
  Adapted from r600g commit 018e3f75d69490598d61059ece56d379867f3995.
  Signed-off-by: Michel Dänzer <[email protected]>
* radeonsi: Flesh out support for depth/stencil exports from the pixel shader. | Michel Dänzer | 2012-11-28 | 2 | -6/+68
  Signed-off-by: Michel Dänzer <[email protected]>
* radeonsi: Fix sampler views for depth textures. | Michel Dänzer | 2012-11-28 | 2 | -5/+6
  Consistently reference the flushed depth texture in the sampler view, not the original one.
  Signed-off-by: Michel Dänzer <[email protected]>
* radeonsi: Fix z/stencil texture creation. | Jerome Glisse | 2012-11-28 | 1 | -9/+5
  Signed-off-by: Jerome Glisse <[email protected]>
  [ Cherry-picked from r600g commit b4f0ab0b22625ac1bb3cf16342039557c086ebae ]
* scons: Build ws_xlib on Mac OS X. | Vinson Lee | 2012-11-27 | 1 | -1/+1
  Fixes this SCons build error on Mac OS X if X11 is found.
    NameError: name 'ws_xlib' is not defined:
      File "SConstruct", line 144:
        duplicate = 0 # http://www.scons.org/doc/0.97/HTML/scons-user/x2261.html
      File "scons-2.2.0/SCons/Script/SConscript.py", line 614:
        return method(*args, **kw)
      File "scons-2.2.0/SCons/Script/SConscript.py", line 551:
        return _SConscript(self.fs, *files, **subst_kw)
      File "scons-2.2.0/SCons/Script/SConscript.py", line 260:
        exec _file_ in call_stack[-1].globals
      File "src/SConscript", line 34:
        SConscript('gallium/SConscript')
      File "scons-2.2.0/SCons/Script/SConscript.py", line 614:
        return method(*args, **kw)
      File "scons-2.2.0/SCons/Script/SConscript.py", line 551:
        return _SConscript(self.fs, *files, **subst_kw)
      File "scons-2.2.0/SCons/Script/SConscript.py", line 260:
        exec _file_ in call_stack[-1].globals
      File "src/gallium/SConscript", line 135:
        'targets/libgl-xlib/SConscript',
      File "scons-2.2.0/SCons/Script/SConscript.py", line 614:
        return method(*args, **kw)
      File "scons-2.2.0/SCons/Script/SConscript.py", line 551:
        return _SConscript(self.fs, *files, **subst_kw)
      File "scons-2.2.0/SCons/Script/SConscript.py", line 260:
        exec _file_ in call_stack[-1].globals
      File "src/gallium/targets/graw-xlib/SConscript", line 9:
        ws_xlib,
  Signed-off-by: Vinson Lee <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* vbo: move another line of code after declarations | Brian Paul | 2012-11-27 | 1 | -1/+1
  Signed-off-by: Brian Paul <[email protected]>
* vbo: move code after declarations to fix MSVC errors | Brian Paul | 2012-11-27 | 1 | -7/+7
  Reviewed-by: Ian Romanick <[email protected]>
* vbo: minor whitespace fix | Brian Paul | 2012-11-27 | 1 | -1/+1
* mesa: remove '(void) k' lines | Brian Paul | 2012-11-27 | 1 | -4/+0
  They serve no purpose, as the k parameter is used later in the code.