path: root/src/mesa
Commit message | Author | Age | Files | Lines
* i965/vs: Move struct brw_compile (p) entirely inside vec4_generator. | Kenneth Graunke | 2012-11-28 | 3 | -4/+3
  The brw_compile structure contains the brw_instruction store and the brw_eu_emit.c state tracking fields. These are only useful for the final assembly generation pass; the earlier compilation stages don't need them.
  This also means that the code generator for future hardware won't have access to the brw_compile structure, which is extremely desirable because it prevents accidental generation of Gen4-7 code.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* i965/vs: Split final assembly code generation out of vec4_visitor. | Kenneth Graunke | 2012-11-28 | 4 | -53/+106
  Compiling shaders requires several main steps:
    1. Generating VS IR from either GLSL IR or Mesa IR
    2. Optimizing the IR
    3. Register allocation
    4. Generating assembly code
  This patch splits out step 4 into a separate class named "vec4_generator." There are several reasons for doing so:
    1. Future hardware has a different instruction encoding. Splitting this out will allow us to replace vec4_generator (which relies heavily on the brw_eu_emit.c code and struct brw_instruction) with a new code generator that writes the new format.
    2. It reduces the size of the vec4_visitor monolith. (Arguably, a lot more should be split out, but that's left for "future work.")
    3. Separate namespaces allow us to make helper functions for generating instructions in both classes: ADD() can exist in vec4_visitor and create IR, while ADD() in vec4_generator() can create brw_instructions. (Patches for this upcoming.)
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* i965/vs: Abort on unsupported opcodes rather than failing. | Kenneth Graunke | 2012-11-28 | 1 | -3/+4
  Final code generation should never fail. This is a bug, and there should be no user-triggerable cases where this could occur.
  Also, we're not going to have a fail() method after the split.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* i965/vs: Move uses of brw_compile from do_vs_prog to brw_vs_emit. | Kenneth Graunke | 2012-11-28 | 3 | -14/+19
  The brw_compile structure is closely tied to the Gen4-7 hardware encoding. However, do_vs_prog is very generic: it just calls out to get a compiled program and then uploads it.
  This isn't ultimately where we want it, but it's a step in the right direction: it's now closer to the code generator.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* i965/vs: Rework memory contexts for shader compilation data. | Kenneth Graunke | 2012-11-28 | 5 | -8/+12
  During compilation, we allocate a bunch of things: the IR needs to last at least until code generation, and then the program store needs to last until after we upload the program.
  For simplicity's sake, just keep it all around until we upload the program. After that, it can all be freed.
  This will also save a lot of headaches during the upcoming refactoring.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* i965/vs: Pass the brw_context pointer into brw_compute_vue_map(). | Kenneth Graunke | 2012-11-28 | 1 | -3/+2
  We used to steal it out of the brw_compile struct, but that won't be initialized in time soon (and is eventually going away).
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* i965/vs: Pass the brw_context pointer into vec4_visitor and do_vs_prog. | Kenneth Graunke | 2012-11-28 | 5 | -9/+14
  We used to steal it out of the brw_compile struct, but vec4_visitor isn't going to have one of those in the future.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* i965/vs: Move some functions from brw_vec4_emit.cpp to brw_vec4.cpp. | Kenneth Graunke | 2012-11-28 | 2 | -263/+265
  This leaves only the final code generation stage in brw_vec4_emit.cpp, moving the payload setup, run(), and brw_vs_emit functions to brw_vec4.cpp.
  The fragment shader backend puts these functions in brw_fs.cpp, so this patch also helps with consistency.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* meta: Don't try to glOrtho when the draw buffer isn't initialized. | Kenneth Graunke | 2012-11-28 | 1 | -3/+5
  I ran across this while running a glGenerateMipmap() test. _meta_GenerateMipmap sets MESA_META_TRANSFORM, which causes _mesa_meta_begin to try to set a default orthographic projection.
  Unfortunately, if the drawbuffer isn't set up, ctx->DrawBuffer->Width and Height are 0, which just causes a GL_INVALID_VALUE error.
  Fixes oglconform's fbo/mipmap.automatic, mipmap.manual, and mipmap.manualIterateTexTargets.
  Reviewed-by: Brian Paul <[email protected]>
* st/mesa: allow forward-compatible contexts and set Const.ContextFlags | Marek Olšák | 2012-11-29 | 1 | -5/+7
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* st/mesa: add support for GL core profiles | Marek Olšák | 2012-11-29 | 1 | -0/+3
  The rest of the plumbing was in place already.
  I have tested this by turning on all GL 3.1 features. The drivers not supporting GL 3.1 will fail to create a core profile as they should.
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965/gen4-5: Fix segfaults with stencil-only depth/stencil setups. | Eric Anholt | 2012-11-28 | 1 | -1/+3
  Fixes a ton of piglit regressions since the depthstencil fixes for gen6+.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57309
  Reviewed-by: Chad Versace <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Don't generate saturates over existing variable values. | Eric Anholt | 2012-11-28 | 1 | -0/+1
  Fixes a crash in http://workshop.chromeexperiments.com/stars/ on i965, and the new piglit test glsl-fs-clamp-5.
  We were trying to emit a saturating move into a uniform, which the code generator appropriately choked on. This was broken in the change in 32ae8d3b321185a85b73ff703d8fc26bd5f48fa7.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57166
  NOTE: This is a candidate for the 9.0 branch.
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add some minimal backend-IR dumping. | Eric Anholt | 2012-11-28 | 2 | -0/+92
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: fix BlitFramebuffer between linear and sRGB formats | Marek Olšák | 2012-11-28 | 1 | -3/+39
  NOTE: This is a candidate for the stable branches.
  Reviewed-by: Brian Paul <[email protected]>
* vbo: move another line of code after declarations | Brian Paul | 2012-11-27 | 1 | -1/+1
  Signed-off-by: Brian Paul <[email protected]>
* vbo: move code after declarations to fix MSVC errors | Brian Paul | 2012-11-27 | 1 | -7/+7
  Reviewed-by: Ian Romanick <[email protected]>
* vbo: minor whitespace fix | Brian Paul | 2012-11-27 | 1 | -1/+1
* mesa: remove '(void) k' lines | Brian Paul | 2012-11-27 | 1 | -4/+0
  These serve no purpose, as the k parameter is used later in the code.
* mesa/vbo: Check for invalid types in various packed vertex functions. | Kenneth Graunke | 2012-11-27 | 1 | -0/+43
  According to the ARB_vertex_type_2_10_10_10_rev specification:
    "The error INVALID_ENUM is generated by VertexP*, NormalP*, TexCoordP*, MultiTexCoordP*, ColorP*, or SecondaryColorP if <type> is not UNSIGNED_INT_2_10_10_10_REV or INT_2_10_10_10_REV."
  Fixes 7 subcases of oglconform's packed-vertex test.
  v2: Add "gl" prefix to error messages (pointed out by Brian). Also rebase atop the ctx plumbing.
  Reviewed-by: Brian Paul <[email protected]>
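The quoted rule can be sketched as a small validation helper. This is only an illustration: the function name `check_packed_vertex_type` is hypothetical and not taken from the Mesa source, and error reporting is reduced to a return value. The enum values are the ones published in the OpenGL registry headers.

```c
/* Enum values as published in the OpenGL registry headers. */
#define GL_NO_ERROR                    0
#define GL_INVALID_ENUM                0x0500
#define GL_UNSIGNED_INT_2_10_10_10_REV 0x8368
#define GL_INT_2_10_10_10_REV          0x8D9F

/* Hypothetical helper (not Mesa's function): returns the GL error that a
 * VertexP*, ColorP*, etc. entry point would raise for the given <type>. */
static unsigned check_packed_vertex_type(unsigned type)
{
    if (type != GL_UNSIGNED_INT_2_10_10_10_REV &&
        type != GL_INT_2_10_10_10_REV) {
        return GL_INVALID_ENUM;
    }
    return GL_NO_ERROR;
}
```

An entry point would presumably call such a check first and return without touching vertex state when it reports an error.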
* mesa/vbo: Support the ES 3.0 signed normalized scaling rules. | Kenneth Graunke | 2012-11-27 | 1 | -2/+38
  Traditionally, OpenGL has had two separate equations for converting from signed normalized fixed-point data to floating point data. One was used primarily for vertex data, while the other was primarily for texturing and framebuffer data.
  However, ES 3.0 and GL 4.2 change this, declaring there's only one equation to be used in all cases. Unfortunately, it's the other one.
  v2: Correctly convert 0b10 to -1.0, as pointed out by Chris Forbes.
  Reviewed-by: Chris Forbes <[email protected]>
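A minimal sketch of choosing between the two equations, assuming a hypothetical `unified_rules` flag that stands in for "the context is ES 3.0 or GL 4.2+". The function names are illustrative and are not Mesa's; the formulas are the ones quoted elsewhere in this log as equations 2.2 and 2.3 of the GL 3.2 specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sign-extend the low `width` bits of `bits` (two's complement).
 * Assumes 2 <= width <= 30. */
static int sign_extend(uint32_t bits, unsigned width)
{
    uint32_t mask = (1u << width) - 1u;
    int v = (int)(bits & mask);
    int sign = 1 << (width - 1);
    return (v & sign) ? v - (1 << width) : v;
}

/* Hypothetical converter: `unified_rules` selects the single equation
 * used by ES 3.0 / GL 4.2 (eq. 2.3); otherwise the older vertex-data
 * equation (eq. 2.2) is used. */
static float snorm_to_float(uint32_t bits, unsigned width, bool unified_rules)
{
    int c = sign_extend(bits, width);

    if (unified_rules) {
        /* f = max(c / (2^(b-1) - 1), -1.0) */
        float f = (float)c / (float)((1 << (width - 1)) - 1);
        return f < -1.0f ? -1.0f : f;
    }

    /* f = (2c + 1) / (2^b - 1) */
    return (2.0f * (float)c + 1.0f) / (float)((1 << width) - 1);
}
```

With `width == 2`, the bit pattern 0b10 is -2; under equation 2.3 the raw quotient is -2.0, so the clamp is what produces -1.0, which is the case called out in the v2 note.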
* mesa/vbo: Plumb ctx through to the conv_i(10|2)_to_norm_float functions. | Kenneth Graunke | 2012-11-27 | 1 | -59/+59
  The rules for converting these values actually depend on the current context API and version. The next patch will implement those changes.
  v2: Mark ctx as const, as suggested by Brian.
  Reviewed-by: Chris Forbes <[email protected]>
* mesa: Set transform feedback's default buffer mode to INTERLEAVED_ATTRIBS | Matt Turner | 2012-11-27 | 1 | -0/+2
  Fixes part of es3conform's transform_feedback_init_defaults test.
  NOTE: This is a candidate for the stable branch.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* mesa: Return 0 for XFB_VARYING_MAX_LENGTH if no varyings | Matt Turner | 2012-11-27 | 1 | -21/+15
  v2: Perform this count the same way as elsewhere in this file, per Brian Paul's review.
  Fixes part of es3conform's transform_feedback_init_defaults test.
  NOTE: This is a candidate for the stable branches.
  Reviewed-by: Brian Paul <[email protected]>
* i965/fs: Move struct brw_compile (p) entirely inside fs_generator. | Kenneth Graunke | 2012-11-26 | 6 | -6/+4
  The brw_compile structure contains the brw_instruction store and the brw_eu_emit.c state tracking fields. These are only useful for the final assembly generation pass; the earlier compilation stages don't need them.
  This also means that the code generator for future hardware won't have access to the brw_compile structure, which is extremely desirable because it prevents accidental generation of Gen4-7 code.
  v2: rzalloc p, as suggested by Eric.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Split final assembly code generation out of fs_visitor. | Kenneth Graunke | 2012-11-26 | 3 | -78/+156
  Compiling shaders requires several main steps:
    1. Generating FS IR from either GLSL IR or Mesa IR
    2. Optimizing the IR
    3. Register allocation
    4. Generating assembly code
  This patch splits out step 4 into a separate class named "fs_generator." There are several reasons for doing so:
    1. Future hardware has a different instruction encoding. Splitting this out will allow us to replace fs_generator (which relies heavily on the brw_eu_emit.c code and struct brw_instruction) with a new code generator that writes the new format.
    2. It reduces the size of the fs_visitor monolith. (Arguably, a lot more should be split out, but that's left for "future work.")
    3. Separate namespaces allow us to make helper functions for generating instructions in both classes: ADD() can exist in fs_visitor and create IR, while ADD() in fs_generator() can create brw_instructions. (Patches for this upcoming.)
  Furthermore, this patch changes the order of operations slightly. Rather than doing steps 1-4 for SIMD8, then 1-4 for SIMD16, we now:
    - Do steps 1-3 for SIMD8, then repeat 1-3 for SIMD16
    - Generate final assembly code for both modes together
  This is because the frontend work can be done independently, but final assembly generation needs to pack both into a single program store to feed the GPU.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
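The reordering described above can be illustrated with a toy trace. Nothing here is driver code: the step enum, the `trace` structure, and every function name are hypothetical, and the "work" is reduced to recording which step ran for which dispatch width.

```c
#include <stddef.h>

/* Hypothetical names for the four compilation steps listed above. */
enum step {
    STEP_EMIT_IR,
    STEP_OPTIMIZE,
    STEP_ALLOCATE_REGISTERS,
    STEP_GENERATE_ASSEMBLY
};

struct event { enum step step; int width; };
struct trace { struct event events[16]; size_t count; };

static void record(struct trace *t, enum step s, int width)
{
    if (t->count < 16) {
        t->events[t->count].step = s;
        t->events[t->count].width = width;
        t->count++;
    }
}

/* Steps 1-3: independent per dispatch width. */
static void run_frontend(struct trace *t, int width)
{
    record(t, STEP_EMIT_IR, width);
    record(t, STEP_OPTIMIZE, width);
    record(t, STEP_ALLOCATE_REGISTERS, width);
}

/* Step 4: both widths are generated together so that they can share one
 * program store. */
static void generate_both(struct trace *t)
{
    record(t, STEP_GENERATE_ASSEMBLY, 8);
    record(t, STEP_GENERATE_ASSEMBLY, 16);
}

static void compile_fragment_shader_sketch(struct trace *t)
{
    t->count = 0;
    run_frontend(t, 8);
    run_frontend(t, 16);
    generate_both(t);
}
```

The point of the sketch is only the ordering: all frontend events for both widths precede any assembly generation event.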
* i965/fs: Abort on unsupported opcodes rather than failing. | Kenneth Graunke | 2012-11-26 | 1 | -1/+1
  Final code generation should never fail. This is a bug, and there should be no user-triggerable cases where this could occur.
  Also, we're not going to have a fail() method in a moment.
  v2: Just abort() rather than assert, to cover the NDEBUG case (suggested by Eric).
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965: Make it possible to create a cfg_t without a backend_visitor. | Kenneth Graunke | 2012-11-26 | 2 | -3/+18
  All we really need is a memory context and the instruction list; passing a backend_visitor is just convenient at times.
  This will be necessary two patches from now.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Move uses of brw_compile from do_wm_prog to brw_wm_fs_emit. | Kenneth Graunke | 2012-11-26 | 3 | -14/+20
  The brw_compile structure is closely tied to the Gen4-7 hardware encoding. However, do_wm_prog is very generic: it just calls out to get a compiled program and then uploads it.
  This isn't ultimately where we want it, but it's a step in the right direction: it's now closer to the code generator.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Pass the brw_context pointer into fs_visitor explicitly. | Kenneth Graunke | 2012-11-26 | 3 | -5/+7
  We used to steal it out of the brw_compile struct, but fs_visitor isn't going to have one of those in the future.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Move brw_wm_compile::fp to fs_visitor. | Kenneth Graunke | 2012-11-26 | 8 | -17/+19
  Also change it from a brw_fragment_program to a gl_fragment_program, since that seems to be what everything wants anyway.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Remove struct brw_shader * parameter to fs_visitor constructor. | Kenneth Graunke | 2012-11-26 | 3 | -5/+8
  We can easily recover it from prog, and this makes it clear that we aren't passing additional information in.
  v2: Use an if-statement rather than the ?: operator (suggested by Eric).
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Move brw_wm_compile::dispatch_width into fs_visitor. | Kenneth Graunke | 2012-11-26 | 9 | -66/+64
  Also, rather than having brw_wm_fs_emit poke at it directly, make it a parameter to the fs_visitor constructor.
  All other changes generated by search and replace (with occasional whitespace fixup).
  v2: Make dispatch_width const (as suggested by Paul); fix doxygen mistake (pointed out by Eric); update for rebase.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Move brw_wm_lookup_iz() to fs_visitor::setup_payload_gen4(). | Kenneth Graunke | 2012-11-26 | 5 | -85/+82
  This necessitates compiling brw_wm_iz.c as C++.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Move brw_wm_payload_setup() to fs_visitor::setup_payload_gen6() | Kenneth Graunke | 2012-11-26 | 4 | -68/+63
  Now that we only have the one backend, there's no real point in keeping this separate. Moving it should allow some future simplifications.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Remove brw_wm_compile::computes_depth field. | Kenneth Graunke | 2012-11-26 | 4 | -6/+1
  Everybody determines this by checking if fp's OutputsWritten field contains the FRAG_RESULT_DEPTH bit. Rather than having payload setup check this and set the computes_depth flag, we can just do the check in the only place that actually used it: emit_fb_writes().
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965: Enable ARB_vertex_type_2_10_10_10_rev on Gen4+. | Chris Forbes | 2012-11-26 | 1 | -0/+1
  v2 (Kayden): Move the enable into an existing intel->gen >= 4 block (as suggested by Ian).
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: emit w/a for packed attribute formats in VS | Chris Forbes | 2012-11-26 | 3 | -13/+126
  Implements BGRA swizzle, sign recovery, and normalization as required by ARB_vertex_type_2_10_10_10_rev.
  V2: Ported to the new VS backend, since that's all that's left; fixed normalization.
  V3: Moved fixups out of the GLSL-only path, so it works for FF/VP too.
  V4 (Kayden): Rework ES3 normalization, don't heap allocate registers; tidy comments.
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
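The first two fixups can be sketched on the CPU in C. This is not the driver's code: the actual workaround is emitted as vertex shader instructions, and the struct and function names below are hypothetical. The sketch assumes the standard 2_10_10_10_REV layout (x in bits 0-9, y in bits 10-19, z in bits 20-29, w in bits 30-31) and assumes the fields arrive as unsigned integers; normalization is left out.

```c
#include <stdbool.h>
#include <stdint.h>

struct packed4 { int32_t x, y, z, w; };

/* Recover a signed value from an unsigned `width`-bit field. This is a
 * portable equivalent of shifting the field to the top of a register and
 * arithmetic-shifting it back down. */
static int32_t recover_sign(uint32_t field, unsigned width)
{
    uint32_t sign_bit = 1u << (width - 1);
    return (int32_t)(field ^ sign_bit) - (int32_t)sign_bit;
}

/* Hypothetical helper: unpack an INT_2_10_10_10_REV word, recover the
 * signs, and optionally apply the BGRA swizzle (swap x and z). */
static struct packed4 unpack_int_2_10_10_10_rev(uint32_t v, bool bgra)
{
    struct packed4 r;
    r.x = recover_sign(v & 0x3ffu, 10);
    r.y = recover_sign((v >> 10) & 0x3ffu, 10);
    r.z = recover_sign((v >> 20) & 0x3ffu, 10);
    r.w = recover_sign((v >> 30) & 0x3u, 2);

    if (bgra) {
        int32_t tmp = r.x;
        r.x = r.z;
        r.z = tmp;
    }
    return r;
}
```

The XOR-and-subtract form avoids relying on the implementation-defined behavior of right-shifting negative values in C.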
* i965: set attribute w/a bits for packed formats | Chris Forbes | 2012-11-26 | 1 | -4/+26
  Flag the need for various workarounds to be applied by the vertex shader.
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Generalize GL_FIXED VS w/a support | Chris Forbes | 2012-11-26 | 3 | -14/+26
  Next few patches build on this to add other workarounds for packed formats.
  V2: rename BRW_ATTRIB_WA_COMPONENTS to BRW_ATTRIB_WA_COMPONENT_MASK.
  V3 (Kayden): remove separate bit for ES3 signed normalization.
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: support 2_10_10_10 formats in get_surface_type. | Chris Forbes | 2012-11-26 | 1 | -1/+19
  Always use R10G10B10A2_UINT; most of the other formats we'd like don't actually work on the hardware. Will emit w/a for scaling, sign recovery and BGRA swizzle in the VS.
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: implement get_size for 2_10_10_10 formats | Chris Forbes | 2012-11-26 | 1 | -0/+5
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vs: add support for emitting SHL, SHR, ASR | Chris Forbes | 2012-11-26 | 2 | -4/+10
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Use correct glGetTransformFeedbackVarying name in error msg | Matt Turner | 2012-11-26 | 1 | -2/+2
  Reviewed-by: Brian Paul <[email protected]>
* i965: Fix hangs with FP KIL instructions pre-gen6. | Eric Anholt | 2012-11-25 | 1 | -0/+2
  We can't support IF statements in 16-wide on these. To get back to 16-wide for these shaders, we need to support predication on discard instructions in the backend IR, which is something we've sort of got on the list to do anyway.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55828
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen4: Fix memory leak each time compile_gs_prog() is called. | Eric Anholt | 2012-11-25 | 1 | -1/+1
  Commit 774fb90db3e83d5e7326b7a72e05ce805c306b24 introduced a ralloc context to each user of struct brw_compile, but for this one a NULL context was used, causing the later ralloc_free(mem_ctx) to not do anything.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55175
  NOTE: This is a candidate for the stable branches.
* i965/gen4: Fix LOD bias texturing since my fixed reg classes change. | Eric Anholt | 2012-11-25 | 1 | -10/+18
  We have a special case where non-shadow comparison with LOD requires using a SIMD16 vec4 in an 8-wide shader, which appears in the register allocator as a size 8 vgrf.
  Fixes assertions in various piglit tests and webgl conformance.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56521
* scons: Append x11 library path if linking x11 library. | Vinson Lee | 2012-11-21 | 1 | -0/+1
  Signed-off-by: Vinson Lee <[email protected]>
* mesa/vbo: Fix scaling issue in 2-bit signed normalized packing. | Kenneth Graunke | 2012-11-21 | 1 | -1/+1
  Since a signed 2-bit integer can only represent -1, 0, or 1, it is tempting to simply convert it directly to a float. This maps it onto the correct range of [-1.0, 1.0]. However, it gives different values compared to the usual equation:
    (2.0 *  1.0 + 1.0) * (1.0 / 3.0) = +1.0 (same)
    (2.0 *  0.0 + 1.0) * (1.0 / 3.0) = +0.33333333... (different)
    (2.0 * -1.0 + 1.0) * (1.0 / 3.0) = -0.33333333... (different)
  According to the GL_ARB_vertex_type_2_10_10_10_rev extension, signed normalization is performed using equation 2.2 from the GL 3.2 specification, which is:
    f = (2c + 1)/(2^b - 1).                (2.2)
  Comments below that equation state: "In general, this representation is used for signed normalized fixed-point parameters in GL commands, such as vertex attribute values." Which is what we're doing here.
  The 3.2 specification goes on to declare an alternate formula:
    f = max{c/(2^(b-1) - 1), -1.0}         (2.3)
  which is closer to the existing code, and maps the end points to exactly -1.0 and 1.0. Comments below the equation state: "In general, this representation is used for signed normalized fixed-point texture or framebuffer values." Which is *not* what we're doing here.
  It then states: "Everywhere that signed normalized fixed-point values are converted, the equation used is specified." This is the real clincher: the extension explicitly specifies that we must use equation 2.2, not 2.3. So we need to do (2x + 1) / 3.
  This matches the behavior expected by oglconform's packed-vertex test, and is correct for desktop GL (pre-4.2). It's not correct for ES 3.0, but a future patch will correct that.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Tested-by: Marek Olšák <[email protected]>
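The difference between the direct conversion and equation 2.2 can be checked with a short sketch. The function names are hypothetical and do not correspond to Mesa's conv_i2_to_norm_float; the 0b10 bit pattern (-2 in two's complement) is handled as well, for completeness.

```c
#include <stdint.h>

/* Sign-extend a 2-bit two's complement field. */
static int sext2(uint32_t bits)
{
    int v = (int)(bits & 0x3u);
    return (v & 0x2) ? v - 4 : v;
}

/* The tempting direct conversion: just cast the integer to float. */
static float conv_i2_direct(uint32_t bits)
{
    return (float)sext2(bits);
}

/* Equation 2.2 with b = 2: f = (2c + 1) / (2^2 - 1) = (2c + 1) / 3. */
static float conv_i2_eq22(uint32_t bits)
{
    return (2.0f * (float)sext2(bits) + 1.0f) / 3.0f;
}
```

For c = 1 both functions return 1.0; for c = 0 and c = -1 the direct cast returns 0.0 and -1.0, while equation 2.2 returns one third and minus one third.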
* mesa/vbo: Fix scaling issue in 10-bit signed normalized packing. | Kenneth Graunke | 2012-11-21 | 1 | -1/+1
  For the 10-bit components, the divisor was incorrect. A 10-bit signed integer can represent -2^9 through 2^9 - 1, which leads to the following ranges:
    (float)value.x                 -> [ -512,  511]
    2.0F * (float)value.x          -> [-1024, 1022]
    2.0F * (float)value.x + 1.0F   -> [-1023, 1023]
  So dividing by 511 would incorrectly scale it to approximately: [-2.001956947, 2.001956947]. To correctly scale to [-1.0, 1.0], we need to divide by 1023.
  This correctly implements the desktop GL rules. ES 3.0 has different rules, but those will be implemented in a separate patch.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Tested-by: Marek Olšák <[email protected]>
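The corrected divisor can be illustrated with a small sketch of the desktop GL rule for 10-bit fields. The names are hypothetical (this is not Mesa's conv_i10_to_norm_float), and the second function exists only to show the old, incorrect scaling.

```c
#include <stdint.h>

/* Sign-extend a 10-bit two's complement field: range [-512, 511]. */
static int sext10(uint32_t bits)
{
    int v = (int)(bits & 0x3ffu);
    return (v & 0x200) ? v - 1024 : v;
}

/* Equation 2.2 with b = 10: (2c + 1) lies in [-1023, 1023], so the
 * divisor must be 2^10 - 1 = 1023. */
static float conv_i10_eq22(uint32_t bits)
{
    return (2.0f * (float)sext10(bits) + 1.0f) / 1023.0f;
}

/* The old divisor, kept only to show the out-of-range result. */
static float conv_i10_wrong_divisor(uint32_t bits)
{
    return (2.0f * (float)sext10(bits) + 1.0f) / 511.0f;
}
```

With the 1023 divisor the end points 0x1ff (511) and 0x200 (-512) map to exactly 1.0 and -1.0; with 511 they land near 2.0 and -2.0.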