aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
* i965: Drop the confusing saturate argument to math instruction setup.Eric Anholt2012-08-088-44/+6
| | | | | | | | | | | | | | | | | | | This was ridiculous. We were ignoring the inst->header.saturate flag in the case of math and only math. On gen4, we would leave inst->header.saturate in place if it happened to be set, which would end up being applied to the implicit mov and thus trash the first argument. On gen6, we would overwrite inst->header.saturate with the saturate flag from the argument, which was not set appropriately in brw_vec4_emit.cpp, and was only not a bug due to our incompetence at coalescing saturate moves. By ripping the argument out and making saturate work just like all the other brw_eu_emit.c code generation, we can avoid both these classes of bugs. Fixes piglit fog-modes, and the new specific fs-saturate-exp2 case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48628 NOTE: This is a candidate for the 8.0 branch. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make brw_set_saturate() use stdbool.Eric Anholt2012-08-082-3/+3
| | | | | | There was a chance for brw_wm_emit.c to screw up and pass (1 << 4) instead of 1, which would get converted to 0 when stored. Instead, use stdbool which converts nonzero to true/1 like we want.
* mesa: In conditional rendering fallback, check the query status.Eric Anholt2012-08-081-0/+2
| | | | | | | | | | | Otherwise, conditional rendering always takes the fallthrough "render it anyway" case unless the application had itself done a check or wait on the query. Fixes intel oglconform's conditional_render advanced.nofbo.readpixels. Reviewed-by: Brian Paul <[email protected]> NOTE: This is a candidate for the 8.0 branch.
* mesa: Fix glPopAttrib() behavior on GL_FRAMEBUFFER_SRGB.Eric Anholt2012-08-081-0/+13
| | | | | | | | | I happened to notice this while looking at a blit pass in l4d2, which had an optional push/pop around framebuffer srgb setting. It didn't matter in the end, but the fix is sitting in my tree now. Reviewed-by: Brian Paul <[email protected]> NOTE: This is a candidate for the 8.0 branch.
* i965: Use 64-bit writes for occlusion queries.Kenneth Graunke2012-08-081-2/+3
| | | | | | | | | | | | | | | The hardware seems to use the length of the PIPE_CONTROL command to indicate whether the write is 64-bits or 32-bits. Which makes sense for immediate writes. Daniel discovered this by writing a pattern into the query object bo and noticing that the high 32-bits were left intact, even on those pipe control writes that seemingly worked. Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Refactor depth count write PIPE_CONTROLs into a helper function.Kenneth Graunke2012-08-081-68/+43
| | | | | | | | | | This consolidates the complexity in one place, which is important because it's about to get even more complicated. Signed-off-by: Kenneth Graunke <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Emit a CS stall before timestamp writes.Kenneth Graunke2012-08-081-0/+14
| | | | | | | | | | This implements one of the Sandybridge PIPE_CONTROL workarounds. It doesn't appear to be required for Ivybridge. Signed-off-by: Kenneth Graunke <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Use 64-bit writes for timestamp queries.Kenneth Graunke2012-08-081-2/+3
| | | | | | | | | | | | | | | The hardware seems to use the length of the PIPE_CONTROL command to indicate whether the write is 64-bits or 32-bits. Which makes sense for immediate writes. Daniel discovered this by writing a pattern into the query object bo and noticing that the high 32-bits were left intact, even on those pipe control writes that seemingly worked. Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Refactor timestamp write PIPE_CONTROLs into a helper function.Kenneth Graunke2012-08-081-50/+30
| | | | | | | | | This consolidates the complexity in one place, which is important because it's about to get even more complicated. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* intel: Make the length for PIPE_CONTROL explicit.Kenneth Graunke2012-08-084-20/+20
| | | | | | | | | | | | | | | | PIPE_CONTROL has variable length, depending upon generation and whether we want to do 32-bit or 64-bit data writes. Make it explicit, rather than hiding a length of 4 in the #define for _3DSTATE_PIPE_CONTROL. Generated by s/3DSTATE_PIPE_CONTROL/3DSTATE_PIPE_CONTROL | (4 - 2)/g. This is equivalent since the #define used to have | 2 in it. A grep through the sources shows that all instances have been converted, so it's safe to remove the | 2 from the #define. Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* swrast: add missing switch case for API_OPENGL_COREBrian Paul2012-08-081-0/+2
| | | | | | To silence compiler warning. Reviewed-by: José Fonseca <[email protected]>
* i965: Enable uniform buffer objects on gen6+.Eric Anholt2012-08-071-0/+1
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vs: Add support for loading uniform buffer variables as pull constants.Eric Anholt2012-08-072-2/+55
| | | | | | | | Unlike the FS side in the previous commit, this does variable indexing just fine, using the same code as we used for other variable-indexed pull constants. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add support for loading uniform buffer variables as pull constants.Eric Anholt2012-08-073-1/+50
| | | | | | | | Variable array indexing isn't finished, because the lowering pass turns it all into conditional moves of constant index accesses so I can't test it. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vs: Add a surface index to VS_OPCODE_PULL_CONSTANT instructions.Eric Anholt2012-08-073-10/+17
| | | | | | | Similar to the previous commit for the fragment shader, now we have a buffer index and an offset. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Communicate the pull constant block read parameters through fs_regs.Eric Anholt2012-08-073-6/+20
| | | | | | | | | | | I wanted to add the surface index as a variable value for UBO support, and a reg seemed like the obvious way to go. This exposes more of the information to CSE, which we'll probably want to apply to pull constant loads for UBOs eventually (you might access 4 floats in a row, each of which would produce an oword block read of the same block). Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Bind UBOs as surfaces like we do for pull constants.Eric Anholt2012-08-076-3/+110
| | | | | | v2: Comment fix, drop extraneous parens (review by Kenneth) Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add an offset argument to constant buffer setup.Eric Anholt2012-08-075-6/+11
| | | | | | We'll use this for UBO surfaces. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add support for glUniformBlockBinding() in display lists.Eric Anholt2012-08-071-0/+27
| | | | | | | | | Fixes piglit GL_ARB_uniform_buffer_object/dlist. v2: Use the .ui fields instead of .i for type consistency (review by Brian Paul) Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Unbind uniform buffer bindings on glDeleteBuffers().Eric Anholt2012-08-071-0/+7
| | | | | | Fixes piglit GL_ARB_uniform_buffer_object/deletebuffers. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Default to GL 3.1's limits on uniform blocks.Eric Anholt2012-08-071-11/+15
| | | | | | | | | | | | The ARB spec lets you get away with the default block counting against the blocks for combined size limits. The core spec says you need to be able to support the maximum size of default block *and* the maximum size of each uniform block. I see no reason that any driver would have a problem with that. Fixes gl 3.1/minmax (with an associated fix to the test) Reviewed-by: Kenneth Graunke <[email protected]>
* ir_to_mesa: Don't whack the ->location field of uniform block variables.Eric Anholt2012-08-071-1/+1
| | | | | | Fixes some failures in GL_ARB_uniform_buffer_object/maxblocks. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Make glBindBufferBase/glBindBufferRange() work on just-genned names.Eric Anholt2012-08-071-12/+25
| | | | | | | | | | | | | | | | | | | | | In between glGenBuffers() and glBindBuffer(), the buffer object points to this dummy buffer with a name of 0, and a glBindBufferBase() would point to that. It seems pretty clear, given that glBindBufferBase() only cares about the current size of the buffer at render time, that it should bind up the buffer that you passed in instead of pointing it at this useless dummy buffer. However, what should glBindBufferRange() do? As of this patch, it will promote the genned buffer to a proper buffer like it had been glBindBuffer()ed, and then detect that the size is greater than the buffer's current size of 0 and throw INVALID_VALUE. It seems like the most reasonable answer here. Note that this also changes the behavior of these two on non-glGenBuffers() bo names. We haven't yet set up the error throwing for glBindBuffers() on gl 3.1+, and my assumption is that these two functions should inherit their behavior on un-genned names from glBindBuffers(). Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add a "ubo_load" expression type for fetches from UBOs.Eric Anholt2012-08-075-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drivers will probably want to be able to take UBO references in a shader like: uniform ubo1 { float a; float b; float c; float d; } void main() { gl_FragColor = vec4(a, b, c, d); } and generate a single aligned vec4 load out of the UBO. For intel, this involves recognizing the shared offset of the aligned loads and CSEing them out. Obviously that involves breaking things down to loads from an offset from a particular UBO first. Thus, the driver doesn't want to see variable_ref(ir_variable("a")), and even more so does it not want to see array_ref(record_ref(variable_ref(ir_variable("a")), "field1"), variable_ref(ir_variable("i"))). where a.field1[i] is a row_major matrix. Instead, we're going to make a lowering pass to break UBO references down to expressions that are obvious to codegen, and amenable to merging through CSE. v2: Fix some partial thoughts in the ir_binop comment (review by Kenneth) Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Replace VersionMajor/VersionMinor with a Version field.Eric Anholt2012-08-0715-58/+53
| | | | | | | | | | | As we get into supporting GL 3.x core, we come across more and more features of the API that depend on the version number as opposed to just the extension list. This will let us more sanely do version checks than "(VersionMajor == 3 && VersionMinor >= 2) || VersionMajor >= 4". v2: Fix a bad <= 30 check. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Fix compiler warnings from winsys msaa.Eric Anholt2012-08-072-3/+1
|
* intel: Advertise multisample DRI2 configs on gen >= 6Chad Versace2012-08-071-3/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This turns on window system MSAA. This patch changes the id of many GLX visuals and configs, but that couldn't be prevented. I attempted to preserve the id's of extant configs by appending the multisample configs to the end of the extant ones. But somewhere, perhaps in the X server, the configs are reordered with multisample configs interspersed among the singlesample ones. Test results: Tested with xonotic and `glxgears -samples 1` on Ivybridge. No piglit regressions on Ivybridge. On Sandybridge, passes 68/70 of oglconform's winsys multisample tests. The two failing tests are: multisample(advanced.pixelmap.depth) multisample(advanced.pixelmap.depthCopyPixels) These tests hang the gpu (on kernel 3.4.6) due to a glDrawPixels/glReadPixels pair on an MSAA depth buffer. I don't expect realworld apps to do that, so I'm not too concerned about the hang. On Ivybridge, passes 69/70. The failing case is multisample(advanced.line.changeWidth). Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Clarify intel_screen_make_configsChad Versace2012-08-071-20/+16
| | | | | | | | | | | | | | This function felt sloppy, so this patch cleans it up a little bit. - Rename `color` to `i`. It is not a color value, only an iterator int. - Move `depth_bits[0] = 0` into the non-accum loop because that is where it used. The accum loop later overwrites depth_bits[0]. - Rename `depth_factor` to `num_depth_stencil_bits`. - Redefine `msaa_samples_array` as static const because it is never modified. Rename to `singlesample_samples`. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* dri: Simplify use of driConcatConfigsChad Versace2012-08-074-14/+9
| | | | | | | | | | | | If either argument to driConcatConfigs(a, b) is null or the empty list, then simply return the other argument as the resultant list. All callers were accomplishing that same behavior anyway. And each caller accopmplished it with the same pattern. So this patch moves that external pattern into the function. Reviewed-by: <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Refactor creation of DRI2 configsChad Versace2012-08-071-91/+98
| | | | | | | | | | DRI2 configs were constructed in intelInitScreen2. That function already does too much, so move verbatim the code for creating configs to a new function, intel_screen_make_configs. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Downsample on DRI2 flushChad Versace2012-08-071-0/+31
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Support mapping multisample miptreesChad Versace2012-08-072-6/+126
| | | | | | | | | Add two new functions: intel_miptree_{map,unmap}_multisample, to which intel_miptree_{map,unmap} dispatch. Only mapping flat, renderbuffer-like miptrees are supported. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Refactor use of intel_miptree_mapChad Versace2012-08-071-15/+50
| | | | | | | | | | Move the opencoded construction and destruction of intel_miptree_map into new functions, intel_miptree_attach_map and intel_miptree_release_map. This patch prevents code duplication in a future commit that adds support for mapping multisample miptrees. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Refactor intel_miptree_map/unmapChad Versace2012-08-071-17/+50
| | | | | | | | | | | Move the body of intel_miptree_map into a new function, intel_miptree_map_singlesample. Now intel_miptree_map dispatches to the new function. A future commit adds a multisample variant. Ditto for intel_miptree_unmap. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965: Mark needed downsamples for msaa winsys buffersChad Versace2012-08-074-6/+29
| | | | | | | | | | | | | Add function intel_renderbuffer_set_needs_downsample. It is a no-op except on multisample winsys buffers shared with DRI2. Mark the needed downsamples with the new function at two locations: - Immediately after drawing is complete. - After blitting. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Define functions for up/downsampling on miptreesChad Versace2012-08-071-2/+72
| | | | | | | Flesh out the stub functions intel_miptree_{up,down}sample. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965: Add function brw_blorp_blit_miptreesChad Versace2012-08-072-4/+37
| | | | | | | | | | Define a function, brw_blorp_blit_miptrees, that simply wraps brw_blorp_blit_params + brw_blorp_exec with C calling conventions. This enables intel_miptree.c, in a following commit, to perform blits with blorp for the purpose of downsampling multisample miptrees. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Allocate miptree for multisample DRI2 buffersChad Versace2012-08-073-8/+162
| | | | | | | | | | | | | | | | | | | | Immediately after obtaining, with DRI2GetBuffersWithFormat, the DRM buffer handle for a DRI2 buffer, we wrap that DRM buffer handle with a region and a miptree. This patch additionally allocates an accompanying multisample miptree if the DRI2 buffer is multisampled. Since we do not yet advertise multisample GL configs, the code for allocating the multisample miptree is currently inactive. This patch adds the following fields to intel_mipmap_tree: singlesample_mt needs_downsample and the following function stubs: intel_miptree_downsample intel_miptree_upsample Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Refactor creation of hiz and mcs miptreesChad Versace2012-08-072-16/+19
| | | | | | | | | | | | | | Move the logic for creating the ancillary hiz and mcs miptress for winsys and non-texture renderbuffers from intel_alloc_renderbuffer_storage to intel_miptree_create_for_renderbuffer. Let's try to isolate complex miptree logic to intel_mipmap_tree.c. Without this refactor, code duplication would be required along the intel_process_dri2_buffer codepath in order to create the mcs miptree. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Set num samples for winsys renderbuffersChad Versace2012-08-073-11/+21
| | | | | | | | | | | | | | Add a new param, num_samples, to intel_create_renderbuffer and intel_create_private_renderbuffer. No multisample GL config is yet advertised, so the value of num_samples is currently 0. For server-owned winsys buffers, gl_renderbuffer::NumSamples is not yet used. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Paul Berry <[email protected]> (v1) Signed-off-by: Chad Versace <[email protected]>
* intel: Refactor quantize_num_samplesChad Versace2012-08-072-3/+7
| | | | | | | | | | | Rename quantize_num_samples to intel_quantize_num_samples and change the first param from struct intel_context* to struct intel_screen*. The function will later be used by intelCreateBuffer, which is not bound to any context but is bound to a screen. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]> (v1) Signed-off-by: Chad Versace <[email protected]>
* intel: Update stale comment for intel_miptree_slice::mapChad Versace2012-08-071-2/+2
| | | | | | | The comment referred to intel_tex_image_map/unmap, but should more accurately refer to intel_miptree_map/unmap. Signed-off-by: Chad Versace <[email protected]>
* i965: add more Haswell PCI IDsPaulo Zanoni2012-08-072-4/+98
| | | | | | Signed-off-by: Paulo Zanoni <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: Fix a potential memory leak in get_mesa_program.Vinson Lee2012-08-061-1/+2
| | | | | | | Fixes resource leak defect reported by Coverity. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* dri2: Fix bug in attribute handling for non-desktop OpenGL contextsIan Romanick2012-08-061-6/+17
| | | | | | | | | | | | | | | Previously an error would be generated if any attributes were specified when creating a non-desktop OpenGL context. This was a mistake, and it will prevent old drivers from working with new EGL libraries that add support for the createContextAttribs interface. Instead, match the behavior of EGL_KHR_create_context: allow versions that make sense, reject non-zero flags. NOTE: This is a candidate for the 8.0 branch. Signed-off-by: Ian Romanick <[email protected]> Cc: Kristian Høgsberg <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Allocate dummy slots for point sprites before computing VUE map.Kenneth Graunke2012-08-061-2/+2
| | | | | | | | | | | | | | Commit f0cecd43d6b6d moved the VUE map computation to be only once, at VS compile time. However, it did so in slightly the wrong place: it made the one call to brw_vue_compute_map happen right before the allocation of dummy slots for replaced point sprite coordinates, causing a different VUE map to be generated (at least on Ironlake). Fixes a regression in Piglit's point-sprite test on Ironlake. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46489 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/vs: Don't clobber sampler message MRFs with subexpressions.Kenneth Graunke2012-08-061-17/+42
| | | | | | | | | | | | See the preceding commit for a description of the problem. NOTE: This is a candidate for stable release branches. v2: Use a separate dPdx variable rather than reusing the lod src_reg. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52129 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Don't clobber sampler message MRFs with subexpressions.Kenneth Graunke2012-08-062-70/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Consider a texture call such as: textureLod(s, coordinate, log2(...)) First, we begin setting up the sampler message by loading the texture coordinates into MRFs, starting with m2. Then, we realize we need the LOD, and go to compute it with: ir->lod_info.lod->accept(this); On Gen4-5, this will generate a SEND instruction to compute log2(), loading the operand into m2, and clobbering our texcoord. Similar issues exist on Gen6+. For example, nested texture calls: textureLod(s1, c1, texture(s2, c2).x) Any texturing call where evaluating the subexpression trees for LOD or shadow comparitor would generate SEND instructions could potentially break. In some cases (like register spilling), we get lucky and avoid the issue by using non-overlapping MRF regions. But we shouldn't count on that. Fixes four Piglit test regressions on Gen4-5: - glsl-fs-shadow2DGradARB-{01,04,07,cumulative} NOTE: This is a candidate for stable release branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52129 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Factor out texcoord setup into a helper function.Kenneth Graunke2012-08-062-11/+28
| | | | | | | | | With the textureRect support and GL_CLAMP workarounds, it's grown sufficiently that it deserves its own function. Separating it out makes the original function much more readable. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Move message header and texture offset setup to generate_tex().Kenneth Graunke2012-08-063-21/+27
| | | | | | | | | | | | Setting the texture offset bits in the message header involves very specific hardware register descriptions. As such, I feel it's better suited for the lower level "generate" layer that has direct access to the weird register layouts, rather than at the fs_inst abstraction layer. This also parallels the approach I took in the VS backend. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>