mesa.git

	Commit message	Author	Age	Files	Lines
*	intel: Enable ETC2 support on intel hardware	Anuj Phogat	2012-12-07	3	-50/+98
	This patch enables support for ETC2 compressed textures on all Intel hardware. Intel hardware cannot currently decode ETC2 textures, so compressed ETC2 texture data is decoded in software and stored in a suitable uncompressed MESA_FORMAT at the time of glCompressedTexImage2D. Currently, ETC2 formats are only exposed in OpenGL ES 3.0. V2: Use single etc_wraps variable for both etc1 and etc2. V3: Remove redundant code and use just one intel_miptree_map_etc() and intel_miptree_unmap_etc() function. Choose MESA_FORMAT_SIGNED_{R16, GR1616} for ETC2 signed-{r11, rg11} formats. Signed-off-by: Anuj Phogat <[email protected]> Tested-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	meta: Use #version 300 es for _mesa_glsl_Clear's integer shaders on ES3.	Kenneth Graunke	2012-12-07	1	-17/+27
	Fixes es3conform's color_buffer_float_clamp_(fixed|on|off) tests. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	meta: Use #version 300 es in GenerateMipmap shaders on ES3.	Kenneth Graunke	2012-12-07	1	-11/+13
	Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	meta: Disable GL_FRAGMENT_SHADER_ATI in MESA_META_SHADER	Stefan Dösinger	2012-12-06	1	-0/+11
	Fixes clears in Wine on r200. NOTE: This is a candidate for stable release branches. Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
*	radeon: Initialize swrast before setting limits	Stefan Dösinger	2012-12-06	1	-9/+9
	NOTE: This is a candidate for stable release branches. Signed-off-by: Alex Deucher <[email protected]>
*	r200: Initialize swrast before setting limits	Stefan Dösinger	2012-12-06	1	-10/+9
	Otherwise the driver announces 4096 vertex shader constants and other limits that are far too high. NOTE: This is a candidate for stable release branches. Signed-off-by: Alex Deucher <[email protected]>
*	i965: Add a debug flag for counting cycles spent in each compiled shader.	Eric Anholt	2012-12-05	17	-9/+524
	This can be used for two purposes: Using hand-coded shaders to determine per-instruction timings, or figuring out which shader to optimize in a whole application. Note that this doesn't cover the instructions that set up the message to the URB/FB write -- we'd need to convert the MRF usage in these instructions to GRFs so that our offsets/times don't overwrite our shader outputs. Reviewed-by: Kenneth Graunke <[email protected]> (v1) v2: Check the timestamp reset flag in the VS, which is apparently getting set fairly regularly in the range we watch, resulting in negative numbers getting added to our 32-bit counter, and thus large values added to our uint64_t. v3: Rebase on reladdr changes, removing a new safety check that proved impossible to satisfy. Add a comment to the AOP defs from Ken's review, and put them in a slightly more sensible spot. v4: Check timestamp reset in the FS as well.
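A minimal sketch of the wraparound that the v2 note above describes. The names `wrapped_delta()` and `accumulate_cycles()` and the boolean reset flag are hypothetical and are not taken from the i965 driver. When the timestamp counter is reset between two reads, the unsigned 32-bit difference wraps to a value near 2^32, so the sample is dropped rather than added to the 64-bit total.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the i965 code: difference of two 32-bit timestamps.
// If the counter was reset between the reads, end < start and the unsigned
// subtraction wraps around to a value close to 2^32.
inline uint32_t wrapped_delta(uint32_t start, uint32_t end)
{
    return end - start;
}

// Hypothetical accumulator: ignore samples taken across a timestamp reset,
// otherwise add the elapsed cycles to the running 64-bit total.
inline void accumulate_cycles(uint64_t &total, uint32_t start, uint32_t end,
                              bool reset_seen)
{
    if (reset_seen)
        return;
    total += wrapped_delta(start, end);
}
```

Without the reset check, a single wrapped sample would dominate the total for the whole shader.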
*	i965: Add a flag for instructions with normal writemasking disabled.	Eric Anholt	2012-12-05	4	-0/+4
	For getting values from the new timestamp register, the channels we load have nothing to do with the pixels dispatched.
*	i965/fs: Add support for uniform array access with a variable index.	Eric Anholt	2012-12-04	4	-24/+216
	Serious Sam 3 had a shader hitting this path, but it's used rarely so it didn't show a significant performance difference (n=7). It does reduce compile time massively, though -- one shader goes from 14s compile time and 11723 instructions generated to .44s and 499 instructions. Note that some shaders lose 16-wide mode because we don't support 16-wide and pull constants at the moment (generally, things looping over a few-element array where the loop isn't getting unrolled). Given that those shaders are being generated with 15-20% fewer instructions, it probably outweighs the loss of 16-wide.
*	i965/fs: Conditionalize constant-index UBO load code and add comments.	Eric Anholt	2012-12-04	1	-28/+33
	I wanted to separate this step for easier reviewing when I add the variable-index case next.
*	i965/fs: Restrict optimization that would fail for gen7's SENDs from GRFs	Eric Anholt	2012-12-04	3	-8/+28
	v2: Fix SNB math bug in register_coalesce() where I was looking at the instruction to be removed, not the instruction to be copy propagated into.
*	i965/fs: Allow source mods on gen7+ math.	Eric Anholt	2012-12-04	1	-1/+1
	Gen7 removed this gen6 restriction: the hardware finished merging the mathbox so that math behaves more like a normal instruction.
*	i965/fs: Add instruction emit for varying-index reads of uniforms.	Eric Anholt	2012-12-04	4	-0/+105
	The gen7 send-from-GRF path is sufficiently different from the perspective of IR generation and optimization that I just made it a separate opcode. v2: fix whitespace, rebase on Ken's recent refactor.
*	i965/fs: Rename the existing pull constant load opcode.	Eric Anholt	2012-12-04	6	-14/+16
	We're going to use another send message for handling loads with a varying per-fragment array index.
*	i965: Add a header_present flag for setting up dp read messages.	Eric Anholt	2012-12-04	3	-1/+7
	As of gen7, we can skip the header on some messages, and this can make optimization on those messages much nicer when you've got GRFs instead of MRFs as the source.
*	i965/gen7: Add some safety checks for send messages from GRFs.	Eric Anholt	2012-12-04	1	-0/+15
*	intel: Always enable GL_ARB_framebuffer_object	Ian Romanick	2012-12-03	1	-2/+1
	Now that _mesa_BindFramebuffer does the right thing in ES contexts when the gl_extensions::ARB_framebuffer_object bit is set, the Intel driver doesn't need this hack. No piglit or GLES2 conformance regressions observed on IVB, and this patch (and the previous) fix es3conform's framebuffer_srgb_draw and transform_feedback_misc tests. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Allow INTEL_DEBUG=fs as a synonym for INTEL_DEBUG=wm.	Kenneth Graunke	2012-12-03	1	-0/+1
	I keep accidentally trying to use it. "fs" is a sensible name for fragment shader debugging, and "wm" is...not. It's also more symmetric with "vs". Leave INTEL_DEBUG=wm because old habits die hard. Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965: Include codegen time in the INTEL_DEBUG=perf stall detection.	Eric Anholt	2012-12-03	2	-12/+18
	In the VS case, we were missing the entire compile time in the stall detection! Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Don't leak the IR annotation into later instructions.	Eric Anholt	2012-12-03	2	-0/+2
	After walking our IR instructions (Mesa or GLSL), we don't want to also mark the start of the FB/URB writes or whatever as being that IR. This can end up being misleading when the end of the IR visit got copy propagated out to a later instruction in the URB writes. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/vp: Fix crashes with INTEL_DEBUG=vs.	Eric Anholt	2012-12-03	1	-0/+1
	The VP generation doesn't set up the output reg strings, so if you didn't happen to get these values as 0 on the stack, you'd lose. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/vs: Fix uninitialized shader pointer used in debug output.	Eric Anholt	2012-12-03	1	-0/+2
	Reviewed-by: Kenneth Graunke <[email protected]>
*	glx/dri2: add and use new driver hook flush_with_flags	Marek Olšák	2012-12-02	3	-3/+3
*	radeon: Fix memory leak in radeonCreateScreen2.	Vinson Lee	2012-11-30	1	-1/+3
	Fixes a memory leak defect reported by Coverity. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	nouveau: Fix build.	Brian Paul	2012-11-30	1	-1/+1
	Fixes nouveau build failure introduced at c73245882c7ff1277b190b97f093f7b423a22f10. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57746 Signed-off-by: Vinson Lee <[email protected]>
*	i965/fs: Add fs_reg::is_zero() and is_one(); use for opt_algebraic().	Kenneth Graunke	2012-11-30	2	-7/+24
	These helper functions save you from writing nasty expressions like: if ((inst->src[1].type == BRW_REGISTER_TYPE_F && inst->src[1].imm.f == 1.0) || ((inst->src[1].type == BRW_REGISTER_TYPE_D || inst->src[1].type == BRW_REGISTER_TYPE_UD) && inst->src[1].imm.u == 1)) { Instead, you simply get to write inst->src[1].is_one(). Simple. Also, this makes the FS backend match the VS backend (which has these). This patch also converts opt_algebraic to use the new helper functions. As a consequence, it will now also optimize integer-typed expressions. Reviewed-by: Eric Anholt <[email protected]>
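A rough sketch of what such helpers could look like, so the comparison with the long expression above is concrete. The `sketch_reg` struct, its enums, and the `imm_f()`/`imm_d()` constructors are hypothetical stand-ins and not the actual fs_reg definition; only the idea (hide the per-type immediate comparison behind `is_zero()` and `is_one()`) comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the driver's register file and type enums.
enum sketch_file { SKETCH_GRF, SKETCH_IMM };
enum sketch_type { SKETCH_F, SKETCH_D, SKETCH_UD };

struct sketch_reg {
    sketch_file file;
    sketch_type type;
    union { float f; int32_t i; uint32_t u; } imm;

    // True only for an immediate whose value equals `v` in its own type.
    bool is_imm_value(int v) const
    {
        if (file != SKETCH_IMM)
            return false;
        switch (type) {
        case SKETCH_F:  return imm.f == static_cast<float>(v);
        case SKETCH_D:  return imm.i == v;
        case SKETCH_UD: return imm.u == static_cast<uint32_t>(v);
        }
        return false;
    }

    bool is_zero() const { return is_imm_value(0); }
    bool is_one() const { return is_imm_value(1); }
};

// Hypothetical constructors for float and signed-integer immediates.
inline sketch_reg imm_f(float v)
{
    sketch_reg r;
    r.file = SKETCH_IMM;
    r.type = SKETCH_F;
    r.imm.f = v;
    return r;
}

inline sketch_reg imm_d(int32_t v)
{
    sketch_reg r;
    r.file = SKETCH_IMM;
    r.type = SKETCH_D;
    r.imm.i = v;
    return r;
}
```

Because the type check lives inside the helper, an algebraic pass that asks `src.is_one()` covers integer immediates as well as floats, which matches the note above about integer-typed expressions.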
*	mesa: pass context parameter to gl_renderbuffer::Delete()	Brian Paul	2012-11-30	7	-13/+13
	We sometimes need a rendering context when deleting renderbuffers. Pass it explicitly instead of trying to grab a current context (which might be NULL). The next patch will make use of this. Note: this is a candidate for the stable branches. Reviewed-by: Jose Fonseca <[email protected]>
*	i965/fp: Fix segfault on gen4 TXB instructions.	Eric Anholt	2012-11-29	1	-0/+2
	The gen4 simd16 workaround looks at ir->type to determine how much storage to allocate for the simd16 value. In fragment programs, texturing only ever returns float vec4s (unlike GLSL, which can also have scalar floats or vector integers), so this is the right type. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56962 Reviewed-by: Kenneth Graunke <[email protected]>
*	mesa: Fix GL_LUMINANCE handling for textures in glGetTexImage	Anuj Phogat	2012-11-29	1	-3/+14
	We need to rebase colors (ex: set G=B=0) when getting GL_LUMINANCE textures in the following cases: 1. If the luminance texture is actually stored as rgba 2. If getting a luminance texture, but returning rgba 3. If getting an rgba texture, but returning luminance. A similar fix was pushed by Brian Paul for uncompressed textures in commit: f5d0ced. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=47220 Observed no regressions in piglit and ogles2conform due to this fix. This patch will cause failures in the Intel oglconform pxconv-gettex, pxstore-gettex and pxtrans-gettex test cases. The cause of the failures is a bug in the test cases: the expected luminance value is calculated incorrectly as L = R+G+B. V2: Set G = 0 when getting a RG texture but returning luminance. Note: This is a candidate for stable branches. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
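To illustrate the "set G=B=0" rebase mentioned above, here is a small sketch. The `rgba_sketch` struct and both function names are hypothetical and this is not the Mesa glGetTexImage code; it only shows why zeroing green and blue makes a summed luminance equal the red channel.

```cpp
#include <cassert>

// Hypothetical pixel type used only for this illustration.
struct rgba_sketch { float r, g, b, a; };

// Rebase a color for luminance readback: keep red, zero green and blue.
inline rgba_sketch rebase_for_luminance(rgba_sketch c)
{
    c.g = 0.0f;
    c.b = 0.0f;
    return c;
}

// Luminance as a plain channel sum, L = R + G + B.
inline float summed_luminance(const rgba_sketch &c)
{
    return c.r + c.g + c.b;
}
```

After the rebase, the summed value reduces to R alone; without it, stale green and blue data would leak into the result.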
*	Revert "meta: Don't try to glOrtho when the draw buffer isn't initialized."	Kenneth Graunke	2012-11-29	1	-5/+3
	This reverts commit 9947470655bbf8f4a9c98fe6d93ff5c3486f1124. Apparently it caused a lot of Piglit regressions.
*	mesa: Rename API_OPENGL to API_OPENGL_COMPAT.	Paul Berry	2012-11-29	14	-33/+33
	This should help avoid confusion now that we're using the gl_api enum to distinguish between core and compatibility APIs. The corresponding enum value for core APIs is API_OPENGL_CORE. Acked-by: Eric Anholt <[email protected]> Acked-by: Matt Turner <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
*	i965/vs: Move struct brw_compile (p) entirely inside vec4_generator.	Kenneth Graunke	2012-11-28	3	-4/+3
	The brw_compile structure contains the brw_instruction store and the brw_eu_emit.c state tracking fields. These are only useful for the final assembly generation pass; the earlier compilation stages don't need them. This also means that the code generator for future hardware won't have access to the brw_compile structure, which is extremely desirable because it prevents accidental generation of Gen4-7 code. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
*	i965/vs: Split final assembly code generation out of vec4_visitor.	Kenneth Graunke	2012-11-28	4	-53/+106
	Compiling shaders requires several main steps: 1. Generating VS IR from either GLSL IR or Mesa IR 2. Optimizing the IR 3. Register allocation 4. Generating assembly code This patch splits out step 4 into a separate class named "vec4_generator." There are several reasons for doing so: 1. Future hardware has a different instruction encoding. Splitting this out will allow us to replace vec4_generator (which relies heavily on the brw_eu_emit.c code and struct brw_instruction) with a new code generator that writes the new format. 2. It reduces the size of the vec4_visitor monolith. (Arguably, a lot more should be split out, but that's left for "future work.") 3. Separate namespaces allow us to make helper functions for generating instructions in both classes: ADD() can exist in vec4_visitor and create IR, while ADD() in vec4_generator() can create brw_instructions. (Patches for this upcoming.) Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
*	i965/vs: Abort on unsupported opcodes rather than failing.	Kenneth Graunke	2012-11-28	1	-3/+4
	Final code generation should never fail. Hitting an unsupported opcode here is a bug, and there should be no user-triggerable cases where this could occur. Also, we're not going to have a fail() method after the split. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
*	i965/vs: Move uses of brw_compile from do_vs_prog to brw_vs_emit.	Kenneth Graunke	2012-11-28	3	-14/+19
	The brw_compile structure is closely tied to the Gen4-7 hardware encoding. However, do_vs_prog is very generic: it just calls out to get a compiled program and then uploads it. This isn't ultimately where we want it, but it's a step in the right direction: it's now closer to the code generator. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
*	i965/vs: Rework memory contexts for shader compilation data.	Kenneth Graunke	2012-11-28	5	-8/+12
	During compilation, we allocate a bunch of things: the IR needs to last at least until code generation...and then the program store needs to last until after we upload the program. For simplicity's sake, just keep it all around until we upload the program. After that, it can all be freed. This will also save a lot of headaches during the upcoming refactoring. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
*	i965/vs: Pass the brw_context pointer into brw_compute_vue_map().	Kenneth Graunke	2012-11-28	1	-3/+2
	We used to steal it out of the brw_compile struct, but that won't be initialized in time soon (and is eventually going away). Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
*	i965/vs: Pass the brw_context pointer into vec4_visitor and do_vs_prog.	Kenneth Graunke	2012-11-28	5	-9/+14
	We used to steal it out of the brw_compile struct...but vec4_visitor isn't going to have one of those in the future. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
*	i965/vs: Move some functions from brw_vec4_emit.cpp to brw_vec4.cpp.	Kenneth Graunke	2012-11-28	2	-263/+265
	This leaves only the final code generation stage in brw_vec4_emit.cpp, moving the payload setup, run(), and brw_vs_emit functions to brw_vec4.cpp. The fragment shader backend puts these functions in brw_fs.cpp, so this patch also helps with consistency. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
*	meta: Don't try to glOrtho when the draw buffer isn't initialized.	Kenneth Graunke	2012-11-28	1	-3/+5
	I ran across this while running a glGenerateMipmap() test. _meta_GenerateMipmap sets MESA_META_TRANSFORM, which causes _mesa_meta_begin to try and set a default orthographic projection. Unfortunately, if the drawbuffer isn't set up, ctx->DrawBuffer->Width and Height are 0, which just causes a GL_INVALID_VALUE error. Fixes oglconform's fbo/mipmap.automatic, mipmap.manual, and mipmap.manualIterateTexTargets. Reviewed-by: Brian Paul <[email protected]>
*	i965/gen4-5: Fix segfaults with stencil-only depth/stencil setups.	Eric Anholt	2012-11-28	1	-1/+3
	Fixes a ton of piglit regressions since the depthstencil fixes for gen6+. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57309 Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/fs: Don't generate saturates over existing variable values.	Eric Anholt	2012-11-28	1	-0/+1
	Fixes a crash in http://workshop.chromeexperiments.com/stars/ on i965, and the new piglit test glsl-fs-clamp-5. We were trying to emit a saturating move into a uniform, which the code generator appropriately choked on. This was broken in the change in 32ae8d3b321185a85b73ff703d8fc26bd5f48fa7. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57166 NOTE: This is a candidate for the 9.0 branch. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/fs: Add some minimal backend-IR dumping.	Eric Anholt	2012-11-28	2	-0/+92
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/fs: Move struct brw_compile (p) entirely inside fs_generator.	Kenneth Graunke	2012-11-26	6	-6/+4
	The brw_compile structure contains the brw_instruction store and the brw_eu_emit.c state tracking fields. These are only useful for the final assembly generation pass; the earlier compilation stages don't need them. This also means that the code generator for future hardware won't have access to the brw_compile structure, which is extremely desirable because it prevents accidental generation of Gen4-7 code. v2: rzalloc p, as suggested by Eric. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
*	i965/fs: Split final assembly code generation out of fs_visitor.	Kenneth Graunke	2012-11-26	3	-78/+156
	Compiling shaders requires several main steps: 1. Generating FS IR from either GLSL IR or Mesa IR 2. Optimizing the IR 3. Register allocation 4. Generating assembly code This patch splits out step 4 into a separate class named "fs_generator." There are several reasons for doing so: 1. Future hardware has a different instruction encoding. Splitting this out will allow us to replace fs_generator (which relies heavily on the brw_eu_emit.c code and struct brw_instruction) with a new code generator that writes the new format. 2. It reduces the size of the fs_visitor monolith. (Arguably, a lot more should be split out, but that's left for "future work.") 3. Separate namespaces allow us to make helper functions for generating instructions in both classes: ADD() can exist in fs_visitor and create IR, while ADD() in fs_generator() can create brw_instructions. (Patches for this upcoming.) Furthermore, this patch changes the order of operations slightly. Rather than doing steps 1-4 for SIMD8, then 1-4 for SIMD16, we now: - Do steps 1-3 for SIMD8, then repeat 1-3 for SIMD16 - Generate final assembly code for both modes together This is because the frontend work can be done independently, but final assembly generation needs to pack both into a single program store to feed the GPU. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
*	i965/fs: Abort on unsupported opcodes rather than failing.	Kenneth Graunke	2012-11-26	1	-1/+1
	Final code generation should never fail. Hitting an unsupported opcode here is a bug, and there should be no user-triggerable cases where this could occur. Also, we're not going to have a fail() method in a moment. v2: Just abort() rather than assert, to cover the NDEBUG case (suggested by Eric). Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
*	i965: Make it possible to create a cfg_t without a backend_visitor.	Kenneth Graunke	2012-11-26	2	-3/+18
	All we really need is a memory context and the instruction list; passing a backend_visitor is just convenient at times. This will be necessary two patches from now. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
*	i965/fs: Move uses of brw_compile from do_wm_prog to brw_wm_fs_emit.	Kenneth Graunke	2012-11-26	3	-14/+20
	The brw_compile structure is closely tied to the Gen4-7 hardware encoding. However, do_wm_prog is very generic: it just calls out to get a compiled program and then uploads it. This isn't ultimately where we want it, but it's a step in the right direction: it's now closer to the code generator. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
*	i965/fs: Pass the brw_context pointer into fs_visitor explicitly.	Kenneth Graunke	2012-11-26	3	-5/+7
	We used to steal it out of the brw_compile struct...but fs_visitor isn't going to have one of those in the future. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
*	i965/fs: Move brw_wm_compile::fp to fs_visitor.	Kenneth Graunke	2012-11-26	8	-17/+19
	Also change it from a brw_fragment_program to a gl_fragment_program, since that seems to be what everything wants anyway. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>