path: root/src/mesa/drivers
Commit message | Author | Age | Files | Lines
* i965: Enable ARB_shading_language_packing | Matt Turner | 2013-01-25 | 1 | -0/+1
    Reviewed-by: Chad Versace
    Reviewed-by: Paul Berry
* i965: Assert that the 4x8 pack/unpack operations have been lowered | Matt Turner | 2013-01-25 | 3 | -0/+12
    Reviewed-by: Chad Versace
    Reviewed-by: Paul Berry
* i965: Lower the 4x8 pack/unpack operations | Matt Turner | 2013-01-25 | 1 | -1/+5
    Reviewed-by: Chad Versace
    Reviewed-by: Paul Berry
* i965: Pass in the glarray to get_surface_type. | Eric Anholt | 2013-01-25 | 1 | -29/+22
    Dereffing all the values in the two callers was just pointless, and the
    function isn't inlined so there was actual code impact.

    Reviewed-by: Kenneth Graunke
* i965: Remove nonsense comment. | Eric Anholt | 2013-01-25 | 1 | -2/+0
    vb.inputs_read has never been a thing, even in the initial import.

    Reviewed-by: Kenneth Graunke
* i965: Remove NDEBUG undef that was snuck in. | Eric Anholt | 2013-01-25 | 1 | -2/+0
    If you want debug, set --enable-debug in your config flags.

    Reviewed-by: Kenneth Graunke
* i965: reuse _mesa_sizeof_type for index buffer types. | Eric Anholt | 2013-01-25 | 1 | -24/+2
    The core Mesa code has just one more case than this (GL_BITMAP), so I
    don't see any cause to special-case it.

    Reviewed-by: Kenneth Graunke
* i965: Reuse precalculated ib_type_size value. | Eric Anholt | 2013-01-25 | 1 | -1/+1
    Reviewed-by: Kenneth Graunke
* i965: Drop debug check for knowing the size of a type. | Eric Anholt | 2013-01-25 | 1 | -2/+1
    This was added in b93684f5f311f89c965960ab42bfea71a397b180, but there's
    no need for it -- get_size has to succeed, and it has an assert for us
    in debug builds.

    Reviewed-by: Kenneth Graunke
* i965: Stop worrying about alignment of vertex data. | Eric Anholt | 2013-01-25 | 1 | -7/+1
    For our current types, the required alignment is actually just 1 byte.
    When we get doubles, we have to worry (those have to be aligned to the
    natural size), but we don't have doubles yet and they'll just be a
    special case.

    Reviewed-by: Kenneth Graunke
* i965: Use the glarray _ElementSize that Mesa tracks for us. | Eric Anholt | 2013-01-25 | 2 | -8/+4
    Reviewed-by: Kenneth Graunke
* glsl: Add ir_variable::is_in_uniform_block predicate | Ian Romanick | 2013-01-25 | 2 | -2/+2
    The way a variable is tested for this property is about to change, and
    this makes the code easier to modify.

    Signed-off-by: Ian Romanick
    Reviewed-by: Carl Worth
    Reviewed-by: Kenneth Graunke
* glsl: Add GLSL_TYPE_INTERFACE | Ian Romanick | 2013-01-25 | 4 | -0/+4
    Interfaces are structurally identical to structures from the compiler's
    point of view. They have some additional restrictions, and generally
    GPUs use different instructions to access them. Using a different base
    type should make this a bit easier.

    This commit also adds the glsl_type::interface_packing field. For
    GLSL_TYPE_INTERFACE types, this will track the specified packing mode.
    It is analogous to gl_uniform_buffer::_Packing.

    v2: Add several missing GLSL_TYPE_INTERFACE cases in switch-statements.

    v3: Add information about glsl_type::interface_packing. Move row_major
    checking in glsl_type::record_key_compare from this patch to the
    previous patch. Both suggested by Paul Berry.

    Signed-off-by: Ian Romanick
    Reviewed-by: Paul Berry
    Reviewed-by: Kenneth Graunke
* glsl: Replace most default cases in switches on GLSL type | Ian Romanick | 2013-01-25 | 4 | -7/+17
    This makes it easier to find switch-statements that need to be updated
    after a new GLSL_TYPE_* is added because the compiler will generate a
    warning.

    Switch-statements that only had a small number of cases (e.g.,
    everything in ir_constant_expression.cpp) were not modified. I may
    regret that decision when we eventually add support for doubles.

    Signed-off-by: Ian Romanick
    Reviewed-by: Carl Worth
    Reviewed-by: Chad Versace
    Reviewed-by: Kenneth Graunke
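The warning mechanism this commit relies on can be sketched in plain C. This is a minimal illustration, not Mesa code: the enum and function names are hypothetical stand-ins, and it assumes a compiler with GCC/Clang-style `-Wswitch` (part of `-Wall`).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a base-type enum; not Mesa's definition. */
enum base_type {
   TYPE_UINT,
   TYPE_INT,
   TYPE_FLOAT,
   TYPE_BOOL,
   TYPE_STRUCT,
   TYPE_INTERFACE   /* the newly added enumerator */
};

/* There is deliberately no "default:" label. If an enumerator is missing
 * from the switch, -Wswitch reports it at compile time; a "default:"
 * label would silence that warning. */
static const char *
base_type_name(enum base_type t)
{
   switch (t) {
   case TYPE_UINT:      return "uint";
   case TYPE_INT:       return "int";
   case TYPE_FLOAT:     return "float";
   case TYPE_BOOL:      return "bool";
   case TYPE_STRUCT:    return "struct";
   case TYPE_INTERFACE: return "interface";
   }
   /* Only reached for values outside the enum. */
   return NULL;
}
```

Deleting the `TYPE_INTERFACE` case from the switch should make the compiler name the unhandled enumerator, which is the effect the commit message describes.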
* i965: Correct gen6+ guardband calculation. | Eric Anholt | 2013-01-25 | 2 | -9/+21
    Too much attention was paid to the first paragraphs, and not enough to
    the last little note that "oh, by the way, the rendered things
    themselves still have to be clipped to just 8192 wide/high".

    Fixes GTF's clip.c test with 4096 or higher width on ivb, where one of
    the triangles got the upper half of its pixels dropped.

    Tested-by: Ian Romanick
* i965: Use GL_RED for DEPTH_TEXTURE_MODE in ES 3.0 for sized formats. | Kenneth Graunke | 2013-01-25 | 4 | -7/+21
    Khronos has apparently decided that depth textures with sized formats
    (allowed with ARB_internalformat_query or ES 3.0) should be treated as
    GL_RED, while unsized formats (an existing feature) should be treated
    as GL_INTENSITY for compatibility with ES 2.0.

    Ian is proposing changes to ARB_internalformat_query which will make
    this actually legal and consistent. A similar problem exists with
    GL 4.2, but we're going to ignore that for the time being.

    Tested on Ivybridge: no Piglit regressions; fixes 4 es3conform tests:
    - depth_texture_fbo
    - depth_texture_fbo_clear
    - depth_texture_teximage
    - depth_texture_texsubimage

    Reviewed-by: Ian Romanick
* i965: Bump maximum supported ES2 context version to 3.0 | Chad Versace | 2013-01-25 | 1 | -1/+1
    Since patch "i965: Validate requested GLES context version in
    brwCreateContext", we have been unable to create ES 3.0 contexts due to
    the max version check. So...bump the max version.

    Reviewed-by: Kenneth Graunke
    Reviewed-by: Ian Romanick
    Signed-off-by: Chad Versace
* i965/Gen6+: Enable ARB_ES3_compatibility extension | Paul Berry | 2013-01-25 | 1 | -0/+1
    Reviewed-by: Ian Romanick
* i965/fs/gen7: Fix fatal typo in unpackHalf2x16 | Chad Versace | 2013-01-24 | 1 | -1/+1
    s/src/src_w/

    That little typo, which sneaked into v4 of the previous patch,
    generates incorrect fs code.

    Signed-off-by: Chad Versace
* i965/fs/gen7: Emit code for GLSL 3.00 pack/unpack operations (v4) | Chad Versace | 2013-01-24 | 5 | -3/+144
    v2: Remove lewd comment. [for idr]
    v3: - Optimize away tmp register for packHalf2x16. [for anholt, paul]
        - Improve comments. [for anholt, paul]
        - Reduce near-duplicate code by removing vec4_visitor
          emit_pack/unpack methods. [for chadv]
    v4: Factor out UD/W register conversion into helper function.
        [for anholt]

    Reviewed-by: Eric Anholt
    Reviewed-by: Ian Romanick (v2)
    Signed-off-by: Chad Versace
* i965/vs/gen7: Emit code for GLSL ES 3.00 pack/unpack operations (v3) | Chad Versace | 2013-01-24 | 3 | -0/+146
    FIXME: This patch emits VS code that violates documented hardware
    restrictions and then relies on undocumented behavior that results
    from that violation. This patch passes all tests, but should be fixed
    ASAP to conform to the hardware documentation.

    v2: Explain undocumented hardware behavior. Improve comments.
    v3: Use ALU1 helper methods F32TO16() and F16TO32(). [for anholt]

    Reviewed-by: Eric Anholt
    Reviewed-by: Ian Romanick (v1)
    Signed-off-by: Chad Versace
* i965: Quote the PRM on a HorzStride subtlety | Chad Versace | 2013-01-24 | 1 | -1/+4
    Reviewed-by: Ian Romanick
    Signed-off-by: Chad Versace
* i965: Add opcodes for F32TO16 and F16TO32 | Chad Versace | 2013-01-24 | 4 | -0/+8
    The GLSL ES 3.00 operations packHalf2x16 and unpackHalf2x16 will emit
    these opcodes.

    - Define the opcodes BRW_OPCODE_{F32TO16,F16TO32}.
    - Add the opcodes to the brw_disasm table.
    - Define convenience functions brw_{F32TO16,F16TO32}.

    Reviewed-by: Ian Romanick
    Acked-by: Paul Berry
    Signed-off-by: Chad Versace
* i965: Lower the GLSL ES 3.00 pack/unpack operations (v2) | Chad Versace | 2013-01-24 | 1 | -0/+32
    On gen < 7, we fully lower all operations to arithmetic and bitwise
    operations.

    On gen >= 7, we fully lower the Snorm2x16 and Unorm2x16 operations, and
    partially lower the Half2x16 operations.

    v2: - Comment that scalarization is needed only for SOA code [for idr].
        - Replace switch-statement with if-statement [for idr].
        - Remove misplaced hunk from previous patch [found by idr].

    Reviewed-by: Ian Romanick
    Reviewed-by: Matt Turner
    Signed-off-by: Chad Versace
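What "lowering to arithmetic and bitwise operations" means for the 2x16 normalized packs can be sketched in scalar C. This is an illustration only, not the driver's lowering pass: it follows the packUnorm2x16/packSnorm2x16 definitions in the GLSL ES 3.00 specification (clamp, scale by 65535 or 32767, round, first component in the low 16 bits). The helper names are hypothetical, NaN inputs are not handled, and ties round away from zero here, whereas the spec leaves that case to the implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; plain comparisons and adds stand in for the
 * GLSL clamp() and round() built-ins. */
static float
clampf(float x, float lo, float hi)
{
   return x < lo ? lo : (x > hi ? hi : x);
}

static int32_t
round_nearest(float v)
{
   /* Round half away from zero using only an add and a truncation. */
   return (int32_t) (v >= 0.0f ? v + 0.5f : v - 0.5f);
}

/* packUnorm2x16: x lands in bits 0..15, y in bits 16..31. */
static uint32_t
pack_unorm_2x16(float x, float y)
{
   uint32_t lo = (uint32_t) round_nearest(clampf(x, 0.0f, 1.0f) * 65535.0f);
   uint32_t hi = (uint32_t) round_nearest(clampf(y, 0.0f, 1.0f) * 65535.0f);
   return (hi << 16) | lo;
}

/* packSnorm2x16: same layout, signed 16-bit two's complement fields. */
static uint32_t
pack_snorm_2x16(float x, float y)
{
   int32_t lo = round_nearest(clampf(x, -1.0f, 1.0f) * 32767.0f);
   int32_t hi = round_nearest(clampf(y, -1.0f, 1.0f) * 32767.0f);
   return ((uint32_t) hi << 16) | ((uint32_t) lo & 0xffffu);
}
```

The Half2x16 operations are the ones that cannot be expressed this simply, which is presumably why they are only partially lowered on gen >= 7 and handed to the F32TO16/F16TO32 opcodes.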
* i965/disasm: Fix horizontal stride of dest registers | Chad Versace | 2013-01-24 | 1 | -3/+6
    The bug: The printed horizontal stride was the numerical value of the
    BRW_HORIZONTAL_$N enum.

    The fix: Translate the enum before printing.

    Note: This is a candidate for the stable releases.

    Reviewed-by: Eric Anholt
    Signed-off-by: Chad Versace
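The difference between printing an encoded value and printing the stride it stands for can be sketched as follows. The enum below is a hypothetical stand-in: it assumes the 2-bit encoding in which the values 0, 1, 2, 3 mean strides of 0, 1, 2 and 4 elements, so treat those numbers as an assumption to check against the driver headers rather than as the actual definitions.

```c
#include <assert.h>

/* Hypothetical encoding, assumed to mirror the hardware's 2-bit field. */
enum hstride_encoding {
   HSTRIDE_0 = 0,
   HSTRIDE_1 = 1,
   HSTRIDE_2 = 2,
   HSTRIDE_4 = 3   /* encoded value 3 means a stride of 4 elements */
};

/* Translate the encoded field into the stride in elements.
 * Printing the raw field would show "3" where "4" is meant. */
static unsigned
hstride_elements(unsigned encoded)
{
   assert(encoded <= 3);
   return encoded ? 1u << (encoded - 1) : 0u;
}
```

Under this assumption the raw and translated values only differ for the largest stride, which may explain how such a bug went unnoticed.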
* intel: Fix glCopyTexSubImage on buffers whose width >= 32kbytes | Paul Berry | 2013-01-24 | 1 | -0/+21
    When possible, glCopyTexSubImage calls are performed using the hardware
    blitter. However, according to the Ivy Bridge PRM, Vol1 Part4, section
    1.2.1.2 (Graphics Data Size Limitations):

        The BLT engine is capable of transferring very large quantities of
        graphics data. Any graphics data read from and written to the
        destination is permitted to represent a number of pixels that
        occupies up to 65,536 scan lines and up to 32,768 bytes per scan
        line at the destination. The maximum number of pixels that may be
        represented per scan line’s worth of graphics data depends on the
        color depth.

    With an RGBA32F color buffer (which has 16 bytes per pixel) this
    imposes a maximum width of 2048 pixels. Other pixel formats have
    accordingly larger limits.

    To make matters worse, if the pitch of the buffer is 32k or greater,
    intel_copy_texsubimage's call to intelEmitCopyBlit will overflow
    intelEmitCopyBlit's src_pitch and dst_pitch parameters (which are
    16-bit signed integers).

    We can conveniently avoid both problems by avoiding use of the blitter
    when the miptree's pitch is >= 32k.

    Fixes gles3conform "framebuffer_blit_functionality_magnifying_blit"
    tests when the buffer width is equal to 8192.

    Note: this is very similar to the recent patch "intel: Fix ReadPixels
    on buffers whose width >= 32kbytes" except that it applies to
    glCopyTexSubImage instead of glReadPixels. In a future patch it would
    be nice to refactor the code so that (a) overflow is avoided, and
    (b) intelEmitCopyBlit is responsible for checking whether the blitter
    can handle the width, so that all callers of intelEmitCopyBlit work
    properly, rather than just these two.

    Reviewed-by: Kenneth Graunke
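The arithmetic behind the 32k cutoff can be checked with a small sketch. The function and macro names below are hypothetical and this is not the driver's code; it only encodes the rule stated above, that the blit path is skipped once the pitch reaches 32768 bytes, which is also the first value a signed 16-bit parameter cannot hold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the cutoff described in the commit message. */
#define BLIT_PITCH_LIMIT_BYTES 32768u

/* Bytes per row for a tightly packed row; cpp is bytes per pixel.
 * Real miptree pitches may be padded, so this is a lower bound. */
static uint32_t
row_pitch_bytes(uint32_t width_px, uint32_t cpp)
{
   return width_px * cpp;
}

/* True when the pitch still fits a signed 16-bit parameter
 * (INT16_MAX is 32767), i.e. when it is below the 32k cutoff. */
static bool
pitch_ok_for_blit(uint32_t pitch_bytes)
{
   return pitch_bytes < BLIT_PITCH_LIMIT_BYTES;
}
```

With 16 bytes per pixel, a 2047-pixel row (32752 bytes) passes and a 2048-pixel row (32768 bytes) does not; with 4 bytes per pixel the cutoff falls at 8192 pixels, matching the widths named in the message.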
* glsl: Eliminate ambiguity between function ins/outs and shader ins/outs | Paul Berry | 2013-01-24 | 4 | -9/+11
    This patch replaces the three ir_variable_mode enums:

    - ir_var_in
    - ir_var_out
    - ir_var_inout

    with the following five:

    - ir_var_shader_in
    - ir_var_shader_out
    - ir_var_function_in
    - ir_var_function_out
    - ir_var_function_inout

    This eliminates a frustrating ambiguity: it used to be impossible to
    tell whether an ir_var_{in,out} variable was a shader in/out or a
    function in/out without seeing where the variable was declared in the
    IR. This complicated some optimization and lowering passes, and would
    have become a problem for implementing varying structs.

    In the lisp-style serialization of GLSL IR to strings performed by
    ir_print_visitor.cpp and ir_reader.cpp, I've retained the names "in",
    "out", and "inout" for function parameters, to avoid introducing code
    churn to the src/glsl/builtins/ir/ directory.

    Note: a couple of comments in the code seemed to indicate that we were
    planning for a possible future in which geometry shaders could have
    shader-scope inout variables. Our GLSL grammar rejects shader-scope
    inout variables, and I've been unable to find any evidence in the GLSL
    standards documents (or extensions) that this will ever be allowed, so
    I've eliminated these comments.

    Reviewed-by: Carl Worth
    Reviewed-by: Jordan Justen
    Reviewed-by: Eric Anholt
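A small C sketch shows what the split buys a pass that needs to classify a variable. The five enumerator names come from the commit message; everything else is hypothetical, and the enum is a partial stand-in that omits whatever other modes the real ir_variable_mode has.

```c
#include <assert.h>
#include <stdbool.h>

/* Partial, illustrative stand-in; only the five modes named above. */
enum var_mode_sketch {
   ir_var_shader_in,
   ir_var_shader_out,
   ir_var_function_in,
   ir_var_function_out,
   ir_var_function_inout
};

/* Hypothetical helper: shader-level inputs and outputs can now be
 * recognized from the mode alone, without locating the declaration. */
static bool
is_shader_io(enum var_mode_sketch mode)
{
   return mode == ir_var_shader_in || mode == ir_var_shader_out;
}

/* Hypothetical helper: function parameters of any direction. */
static bool
is_function_param(enum var_mode_sketch mode)
{
   return mode == ir_var_function_in ||
          mode == ir_var_function_out ||
          mode == ir_var_function_inout;
}
```

With the old three-value enum, neither helper could be written as a pure function of the mode, which is the ambiguity the commit removes.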
* i965/vs: Do headerless texturing for texelFetchOffset(). | Kenneth Graunke | 2013-01-24 | 1 | -2/+4
    For texelFetchOffset(), we just add the texel offsets to the coordinate
    rather than using the message header's offset fields. So we don't
    actually need a header on Gen5+.

    Signed-off-by: Kenneth Graunke
    Reviewed-by: Eric Anholt
* intel: Fix ReadPixels on buffers whose width >= 32kbytes | Paul Berry | 2013-01-24 | 1 | -4/+24
    When possible, glReadPixels calls are performed using the hardware
    blitter. However, according to the Ivy Bridge PRM, Vol1 Part4, section
    1.2.1.2 (Graphics Data Size Limitations):

        The BLT engine is capable of transferring very large quantities of
        graphics data. Any graphics data read from and written to the
        destination is permitted to represent a number of pixels that
        occupies up to 65,536 scan lines and up to 32,768 bytes per scan
        line at the destination. The maximum number of pixels that may be
        represented per scan line’s worth of graphics data depends on the
        color depth.

    With an RGBA32F color buffer (which has 16 bytes per pixel) this
    imposes a maximum width of 2048 pixels.

    To make matters worse, if the pitch of the buffer is 32k or greater,
    intel_miptree_map_blit's call to intelEmitCopyBlit will overflow
    intelEmitCopyBlit's src_pitch and dst_pitch parameters (which are
    16-bit signed integers).

    We can conveniently avoid both problems by avoiding the readpixels
    blit path when the miptree's pitch is >= 32k.

    Fixes gles3conform "half_float" tests when the buffer width is greater
    than 2048.

    Reviewed-by: Eric Anholt
    Tested-by: Ian Romanick
* intel: callocing a 32 byte temp is silly, so don't | Ian Romanick | 2013-01-24 | 1 | -3/+3
    I believe that the size used to vary, so the dynamic allocation was
    necessary.

    Signed-off-by: Ian Romanick
    Reviewed-by: Anuj Phogat
    Reviewed-by: Matt Turner
* intel: Enable S3TC extensions always | Ian Romanick | 2013-01-23 | 1 | -6/+4
    Always enable the use of pre-compressed texture data. The ability to
    perform on-line compression still requires the presence of libtxc_dxtn
    or an explicit driconf over-ride.

    Applications that just want to submit pre-compressed data when an
    on-line compressor is not available can look for the
    GL_EXT_texture_compression_dxt1 and
    GL_ANGLE_texture_compression_dxt[35] extensions.

    v2: Only enable the extensions that do not require on-line compression
    by default. The previous statement "This should not impact many (if
    any) real applications." proved to be false for at least Sauerbraten.
    This application mostly submits pre-compressed data, but it also can
    submit uncompressed data that it asks the driver to compress.

    Signed-off-by: Ian Romanick
    Reviewed-by: Jordan Justen [v1]
    Reviewed-by: Kenneth Graunke [v1]
    Acked-by: Eric Anholt [v1]
    Acked-by: Lee Salzman
* mesa: Use a single flag for the S3TC extensions that don't require on-line compression | Ian Romanick | 2013-01-23 | 6 | -6/+7
    Signed-off-by: Ian Romanick
    Reviewed-by: Eric Anholt
    Acked-by: Lee Salzman
* i965: Use swizzles to force R, G, and B to 0.0 for ALPHA textures. | Carl Worth | 2013-01-23 | 1 | -3/+10
    Similar to the previous commit, we may be using a texture with actual
    RGBA storage for the GL_ALPHA format, so force the color values to 0.0.

    This commit fixes the following piglit (sub) tests:

    EXT_texture_snorm/fbo-blending-formats
        GL_ALPHA16_SNORM
        GL_ALPHA8_SNORM
        GL_ALPHA_SNORM

    Note: Haswell bypasses this swizzle code, so may require an independent
    fix for this bug.

    Reviewed-by: Eric Anholt
* i965: Use swizzles to force alpha to 1.0 for RED, RG, or RGB textures. | Carl Worth | 2013-01-23 | 1 | -0/+12
    We may be using a texture with actual RGBA storage for these formats,
    so force the alpha value read to 1.0.

    This commit fixes the following piglit (sub) tests:

    ARB_texture_float/fbo-blending-formats
        GL_RGB16F_ARB
    EXT_framebuffer_object/fbo-blending-formats
        GL_RGB10
        GL_RGB12
        GL_RGB16
    EXT_texture_snorm/fbo-blending-formats
        GL_RGB16_SNORM
        GL_RGB8_SNORM
        GL_RGB_SNORM

    These test improvements depend on the previous commit as well. That
    commit smashes alpha to 1.0 for the case of ReadPixels (so fixes "FBO
    testing" as reported by this test), while this commit smashes alpha to
    1.0 for the case of texturing (fixes the "window testing" as reported
    by this test).

    Note: Haswell bypasses this swizzle code, so may require an independent
    fix for this bug.

    Reviewed-by: Eric Anholt
* i965: Examine _BaseFormat when deciding to perform xRGB_alpha fixups | Carl Worth | 2013-01-23 | 1 | -1/+2
    The renderbuffer's Format field may have an alpha channel even when the
    underlying _BaseFormat does not. This can happen when mesa chooses to
    use RGBA16 for an RGB16 format, for example. So look at _BaseFormat
    when deciding whether to fixup the blend factors.

    This commit improves the results of at least the following piglit
    tests:

    EXT_framebuffer_object/fbo-blending-formats
        {GL_RGB10, GL_RGB12, GL_RGB16}
    EXT_texture_snorm/fbo-blending-formats
        {GL_RGB16_SNORM, GL_RGB8_SNORM, GL_RGB_SNORM}

    But none of these actually change from FAIL to PASS yet. The R, G, and
    B probe values are fixed with this commit, but the tests still fail
    because the alpha values are still wrong.

    Reviewed-by: Eric Anholt
* wmesa: include api_exec.h to fix compilation | Brian Paul | 2013-01-22 | 1 | -0/+1
* i965: Implement the GL_ARB_base_instance extension. | Kenneth Graunke | 2013-01-22 | 2 | -2/+4
    Thanks to Fredrik Höglund, all the hard work was already done.

    Tested using a modified oglconform (that actually runs these tests on
    our driver); it looks like there may be some bugs when using client
    arrays. All applicable non-compatibility tests passed. For now, only
    enable it in core profiles.

    Reviewed-by: Eric Anholt
    Tested-by: Ian Romanick
* mesa: Make the drivers call a non-code-generated dispatch table setup. | Eric Anholt | 2013-01-21 | 10 | -10/+10
    I want to drive the Save dispatch table setup from this same function.

    Reviewed-by: Brian Paul
    Reviewed-by: Ian Romanick
* mesa: Remove the dead PrepareExecBegin() driver hook. | Eric Anholt | 2013-01-21 | 1 | -1/+0
    This was used in i965 for a while, but no more.

    Reviewed-by: Brian Paul
    Reviewed-by: Ian Romanick
* scons: Fix dependencies of generated headers. | José Fonseca | 2013-01-21 | 2 | -5/+2
    It appears that scons implicit dependency scanners fail to chain
    dependencies of generated headers when these are outside the build
    tree.

    This patch ensures generated source files are _always_ put in the build
    tree.

    I'm not 100% sure this will fix all dependency issues, but from my
    experiments it does seem to fix this.

    NOTE: For this to be effective it is necessary to clean the source tree
    from generated header/source files.

    Reviewed-by: Brian Paul
* intel: Don't expose XRGB8888 visuals any more | Ian Romanick | 2013-01-21 | 1 | -2/+1
    There really isn't any point. There are no resource savings, and we
    have to do gymnastics in the driver to make it work.

    There are also bad interactions with multisampling and OpenGL ES 3.0.
    In ES3, a multisample-to-singlesample blit must have identical source
    and destination format. This means a multisample RGBA8 to singlesample
    RGB8 (window) blit will generate an error. Also in ES3, RGB8 is not a
    renderable format. This means that the application CANNOT make an RGB8
    multisample renderbuffer. As a result, if an application gets an RGB8
    window and wants to do multisample FBO rendering, it will probably
    break.

    "Fixes" gles3conform
    framebuffer_blit_functionality_multisampled_to_singlesampled_blit test
    on RGB8 visuals.

    v2: Fix 'formats' array size. Suggested by Ken.

    Signed-off-by: Ian Romanick
    Reviewed-by: Kenneth Graunke
    Acked-by: Eric Anholt
* i965: Enable floating-point textures always | Ian Romanick | 2013-01-21 | 2 | -20/+5
    Signed-off-by: Ian Romanick
    Reviewed-by: Jordan Justen
    Reviewed-by: Kenneth Graunke
    Acked-by: Eric Anholt
* xmlpool/build: generate options.h via BUILT_SOURCES | Matt Turner | 2013-01-20 | 1 | -1/+1
    Fixes missing options.h when doing 'make check' in dri/common before
    'make' has been run.

    Reviewed-by: Andreas Boll
* intel: Enable GL_OES_depth_texture_cube_map | Ian Romanick | 2013-01-20 | 1 | -0/+1
    For now I'm just enabling this on the same subset of hardware that has
    OpenGL 3.0 enabled. This same functionality is part of OpenGL 3.0, and
    there is no matching desktop extension.

    Signed-off-by: Ian Romanick
    Reviewed-by: Matt Turner
    Reviewed-by: Kenneth Graunke
* i965: Add support for GL_ARB_texture_buffer_object_rgb32. | Eric Anholt | 2013-01-18 | 1 | -0/+1
    Tested with piglit ARB_texture_buffer_object/formats.

    Reviewed-by: Kenneth Graunke
* i965: Add support for MESA_FORMAT_RGB_FLOAT32 surfaces. | Eric Anholt | 2013-01-18 | 1 | -1/+1
    This is for GL_ARB_texture_buffer_object_rgb32 support, but it also
    causes the format to get used for float32 rgb textures as well on
    Ironlake and later. Since that came with some surprises, separate the
    change from the enable commit.

    Reviewed-by: Kenneth Graunke
* intel: Make intel_region's pitch be bytes instead of pixels. | Eric Anholt | 2013-01-18 | 21 | -77/+67
    We almost never want a stride in pixels -- if you're doing anything
    with a stride, you're specifying an offset or incrementing a pointer,
    and in both cases you had to multiply by cpp to get the bytes value you
    wanted. But worse, on the way to creating a region from a new tiled BO,
    we divided by cpp to get pitch in pixels, and for an RGB32 buffer (an
    upcoming change) the pitch wouldn't divide exactly, and we'd end up
    with a wrong stride in our region.

    Reviewed-by: Kenneth Graunke
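The rounding problem described above is easy to reproduce with integers. The sketch below uses hypothetical function names and an illustrative 512-byte pitch; cpp is bytes per pixel, 12 for a three-channel 32-bit format. It is not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/* What storing the pitch in pixels implies: divide by cpp on the way in,
 * multiply by cpp on the way out. The integer division truncates
 * whenever cpp does not divide the byte pitch exactly. */
static uint32_t
pitch_bytes_via_pixels(uint32_t pitch_bytes, uint32_t cpp)
{
   uint32_t pitch_pixels = pitch_bytes / cpp;
   return pitch_pixels * cpp;
}

/* With the pitch kept in bytes, addressing texel (x, y) needs no
 * division at all. */
static uint32_t
texel_offset_bytes(uint32_t pitch_bytes, uint32_t cpp,
                   uint32_t x, uint32_t y)
{
   return y * pitch_bytes + x * cpp;
}
```

For a 512-byte pitch, a 4-byte format round-trips to 512, but a 12-byte format comes back as 504, so every row after the first would be addressed 8 bytes too early per row.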
* intel: Make intel_blit.c take pitches in bytes. | Eric Anholt | 2013-01-18 | 8 | -19/+15
    As we gain support for NPOT cpp, a pitch may not divide by cpp cleanly.

    Reviewed-by: Kenneth Graunke
* i965/vs: Store texturing results into a vec4 temporary. | Kenneth Graunke | 2013-01-18 | 1 | -6/+7
    The sampler appears to ignore writemasks (even when correcting the
    WRITEMASK_XYZW in brw_vec4_emit.cpp to the proper writemask) and just
    always writes all four values.

    To cope with this, just texture into a temporary, then MOV out into a
    register that has the proper number of components.

    NOTE: This is a candidate for stable branches.

    Fixes es3conform's shadow_execution_vert.test.

    Signed-off-by: Kenneth Graunke
    Reviewed-by: Ian Romanick
* i965/vs: Set LOD to 0 for ordinary texture() calls. | Kenneth Graunke | 2013-01-18 | 1 | -2/+5
    Previously it was left undefined, causing us to select a random LOD.

    NOTE: This is a candidate for stable branches.

    Signed-off-by: Kenneth Graunke
    Reviewed-by: Ian Romanick