path: root/src/mesa/drivers/dri
Commit message (Author, Age, Files, Lines)
* intel: Always downsample in intel_miptree_map_multisample (Chad Versace, 2012-08-09, 1 file, -3/+0)
  Always downsample before mapping, even if the map mode contains GL_MAP_INVALIDATE_RANGE_BIT. If we neglect to downsample when only a subrect is mapped then the upsample in intel_miptree_unmap_multisample may write garbage to the region outside the subrect.

  (Eric gave my patch e88cfbb a conditional reviewed-by with the condition that it always downsample before mapping. I forgot to make that change before pushing the patch.)

  Signed-off-by: Chad Versace <[email protected]>
* i965/gen6+: Add support for edge flags. (Eric Anholt, 2012-08-09, 3 files, -6/+51)
  Fixes the 3 new piglit edgeflag tests.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=40707
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vs: Convert EdgeFlagPointer values appropriately for the VS on gen4. (Eric Anholt, 2012-08-09, 1 file, -0/+10)
  Fixes piglit gl-2.0/edgeflag.

  NOTE: This is a candidate for the 8.0 branch.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vs: Add comment noting copy_edgeflag state dependency. (Eric Anholt, 2012-08-09, 1 file, -0/+2)
  It's already in the state struct.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vs: Add support for copying user edge flags. (Eric Anholt, 2012-08-09, 1 file, -2/+11)
  Fixes the glsl skinning demo regression since changing to the new GLSL compiler, and is part of fixing piglit gl-2.0-edgeflag.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50079
  NOTE: This is a candidate for the 8.0 branch.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Fix the FS inputs setup when some SF outputs aren't used in the FS. (Olivier Galibert, 2012-08-09, 2 files, -2/+25)
  If there was an edge flag or a two-side-color pair present, we'd end up mismatched and read values from earlier in the VUE for later FS inputs.

  v2: Fix regression in gles2conform shaders generating point size. (change by anholt)

  Signed-off-by: Olivier Galibert <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>

  NOTE: This is a candidate for the 8.0 branch.
* intel: use _mesa_meta_Clear with OpenGL ES 1.1 v2 (Tapani Pälli, 2012-08-08, 2 files, -4/+9)
  This patch changes the i915 and i965 drivers to use the fixed-function version of meta clear when running on ES 1.1. This fixes rendering errors seen with Google Maps, Angry Birds and Gallery3D on the Android platform.

  Change 88128516d43be5d25288ff5b64db63cda83c04b3 exposes all extensions internally, independent of GL flavour, so the check against ARB_fragment_shader does not work.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50333
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Rework the extra flushes surrounding occlusion queries. (Kenneth Graunke, 2012-08-08, 1 file, -7/+4)
  This removes the CS stall on Ivybridge. On Sandybridge, the depth stall needs to be preceded by a non-zero post-sync op, which requires a CS stall, which needs a stall at scoreboard. Emit the full workaround.

  Reviewed-by: Daniel Vetter <[email protected]>
  Cc: Eric Anholt <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>
* i965/vs: Protect pow(x,y) MOV of y on gen4 from other instruction flags. (Eric Anholt, 2012-08-08, 1 file, -0/+4)
  I don't know if it was possible to trigger this bug -- we don't merge saturates into the math instruction because we're bad at coalescing currently, and there's nothing generating these with predicates. Still, let's avoid future bugs when we do smarter codegen.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop the confusing saturate argument to math instruction setup. (Eric Anholt, 2012-08-08, 8 files, -44/+6)
  This was ridiculous. We were ignoring the inst->header.saturate flag in the case of math and only math. On gen4, we would leave inst->header.saturate in place if it happened to be set, which would end up being applied to the implicit mov and thus trash the first argument. On gen6, we would overwrite inst->header.saturate with the saturate flag from the argument, which was not set appropriately in brw_vec4_emit.cpp, and was only not a bug due to our incompetence at coalescing saturate moves.

  By ripping the argument out and making saturate work just like all the other brw_eu_emit.c code generation, we can avoid both these classes of bugs.

  Fixes piglit fog-modes, and the new specific fs-saturate-exp2 case.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48628
  NOTE: This is a candidate for the 8.0 branch.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make brw_set_saturate() use stdbool. (Eric Anholt, 2012-08-08, 2 files, -3/+3)
  There was a chance for brw_wm_emit.c to screw up and pass (1 << 4) instead of 1, which would get converted to 0 when stored. Instead, use stdbool which converts nonzero to true/1 like we want.
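The truncation described in this message can be reproduced with a small standalone sketch. The struct and function names below are hypothetical stand-ins, not the driver's actual definitions, and the 1-bit field width is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an instruction header with a 1-bit saturate field. */
struct toy_header {
    unsigned saturate : 1;
};

/* Integer parameter: only the low bit of `value` survives the bit-field store. */
void toy_set_saturate_int(struct toy_header *h, unsigned value)
{
    h->saturate = value;
}

/* bool parameter: any nonzero argument is converted to true (1) at the call. */
void toy_set_saturate_bool(struct toy_header *h, bool value)
{
    h->saturate = value;
}

/* Shows the difference for a flag passed as (1 << 4) instead of 1. */
void toy_check(void)
{
    struct toy_header h = { 0 };

    toy_set_saturate_int(&h, 1u << 4);
    assert(h.saturate == 0);   /* bit 4 is lost, the flag silently clears */

    toy_set_saturate_bool(&h, 1 << 4);
    assert(h.saturate == 1);   /* nonzero became true before the store */
}
```

Calling `toy_check()` exercises both paths; the conversion to `bool` happens at the function boundary, so the callee never sees the stray high bit.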
* i965: Use 64-bit writes for occlusion queries. (Kenneth Graunke, 2012-08-08, 1 file, -2/+3)
  The hardware seems to use the length of the PIPE_CONTROL command to indicate whether the write is 64-bits or 32-bits. Which makes sense for immediate writes.

  Daniel discovered this by writing a pattern into the query object bo and noticing that the high 32-bits were left intact, even on those pipe control writes that seemingly worked.

  Signed-off-by: Daniel Vetter <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Daniel Vetter <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
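The diagnostic mentioned in this message (seed the buffer with a known pattern, then see which bits the write replaced) can be modelled in plain C. This is only a host-side simulation under assumed names; it does not touch a buffer object or emit any PIPE_CONTROL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Known pattern seeded into the slot before the write under test. */
#define TOY_PATTERN UINT64_C(0xdeadbeefdeadbeef)

/* Models a write that only replaces the low 32 bits of a 64-bit slot. */
void toy_write32(uint64_t *slot, uint64_t value)
{
    *slot = (*slot & UINT64_C(0xffffffff00000000)) |
            (value & UINT64_C(0x00000000ffffffff));
}

/* Models a full 64-bit write. */
void toy_write64(uint64_t *slot, uint64_t value)
{
    *slot = value;
}

/* Returns 1 if the high half still holds the seed pattern after the write,
 * even though the written value had a different high half. */
int toy_high_bits_stale(void (*write)(uint64_t *, uint64_t), uint64_t value)
{
    uint64_t slot = TOY_PATTERN;

    assert(write != NULL);
    write(&slot, value);
    return (slot >> 32) == (TOY_PATTERN >> 32) &&
           (value >> 32) != (TOY_PATTERN >> 32);
}
```

A zero-initialised slot would hide the problem for small counter values, which is why the sketch seeds a non-zero pattern first.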
* i965: Refactor depth count write PIPE_CONTROLs into a helper function. (Kenneth Graunke, 2012-08-08, 1 file, -68/+43)
  This consolidates the complexity in one place, which is important because it's about to get even more complicated.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Signed-off-by: Daniel Vetter <[email protected]>
  Reviewed-by: Daniel Vetter <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Emit a CS stall before timestamp writes. (Kenneth Graunke, 2012-08-08, 1 file, -0/+14)
  This implements one of the Sandybridge PIPE_CONTROL workarounds. It doesn't appear to be required for Ivybridge.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Signed-off-by: Daniel Vetter <[email protected]>
  Reviewed-by: Daniel Vetter <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Use 64-bit writes for timestamp queries. (Kenneth Graunke, 2012-08-08, 1 file, -2/+3)
  The hardware seems to use the length of the PIPE_CONTROL command to indicate whether the write is 64-bits or 32-bits. Which makes sense for immediate writes.

  Daniel discovered this by writing a pattern into the query object bo and noticing that the high 32-bits were left intact, even on those pipe control writes that seemingly worked.

  Signed-off-by: Daniel Vetter <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Daniel Vetter <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Refactor timestamp write PIPE_CONTROLs into a helper function. (Kenneth Graunke, 2012-08-08, 1 file, -50/+30)
  This consolidates the complexity in one place, which is important because it's about to get even more complicated.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Daniel Vetter <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* intel: Make the length for PIPE_CONTROL explicit. (Kenneth Graunke, 2012-08-08, 4 files, -20/+20)
  PIPE_CONTROL has variable length, depending upon generation and whether we want to do 32-bit or 64-bit data writes. Make it explicit, rather than hiding a length of 4 in the #define for _3DSTATE_PIPE_CONTROL.

  Generated by s/3DSTATE_PIPE_CONTROL/3DSTATE_PIPE_CONTROL | (4 - 2)/g. This is equivalent since the #define used to have | 2 in it. A grep through the sources shows that all instances have been converted, so it's safe to remove the | 2 from the #define.

  Signed-off-by: Daniel Vetter <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Daniel Vetter <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
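The `| (4 - 2)` convention in this message encodes the packet length as "total dwords minus two" in the command header. The sketch below illustrates that encoding with hypothetical names; the opcode constant and the 8-bit width of the length field are assumptions for illustration, not values taken from the driver headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed opcode bits, for illustration only; the real value lives in the
 * driver's command defines. */
#define TOY_PIPE_CONTROL UINT32_C(0x7a000000)

/* Builds the header dword for a packet of `total_dwords` dwords.
 * The length field holds (total - 2); the 8-bit width is an assumption. */
uint32_t toy_pipe_control_header(unsigned total_dwords)
{
    assert(total_dwords >= 2 && total_dwords - 2 <= 0xff);
    return TOY_PIPE_CONTROL | (total_dwords - 2);
}

/* Recovers the total packet length from a header dword. */
unsigned toy_header_total_dwords(uint32_t header)
{
    return (header & 0xff) + 2;
}
```

With the length passed at each call site, a caller that wants a longer packet (say, one extra dword for the upper half of a 64-bit write, which is an assumed size here) states it directly instead of relying on a value baked into the `#define`.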
* swrast: add missing switch case for API_OPENGL_CORE (Brian Paul, 2012-08-08, 1 file, -0/+2)
  To silence compiler warning.

  Reviewed-by: José Fonseca <[email protected]>
* i965: Enable uniform buffer objects on gen6+. (Eric Anholt, 2012-08-07, 1 file, -0/+1)
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vs: Add support for loading uniform buffer variables as pull constants. (Eric Anholt, 2012-08-07, 2 files, -2/+55)
  Unlike the FS side in the previous commit, this does variable indexing just fine, using the same code as we used for other variable-indexed pull constants.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add support for loading uniform buffer variables as pull constants. (Eric Anholt, 2012-08-07, 3 files, -1/+50)
  Variable array indexing isn't finished, because the lowering pass turns it all into conditional moves of constant index accesses so I can't test it.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vs: Add a surface index to VS_OPCODE_PULL_CONSTANT instructions. (Eric Anholt, 2012-08-07, 3 files, -10/+17)
  Similar to the previous commit for the fragment shader, now we have a buffer index and an offset.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Communicate the pull constant block read parameters through fs_regs. (Eric Anholt, 2012-08-07, 3 files, -6/+20)
  I wanted to add the surface index as a variable value for UBO support, and a reg seemed like the obvious way to go. This exposes more of the information to CSE, which we'll probably want to apply to pull constant loads for UBOs eventually (you might access 4 floats in a row, each of which would produce an oword block read of the same block).

  Reviewed-by: Kenneth Graunke <[email protected]>
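To make the CSE opportunity in this message concrete: an oword is 16 bytes, so four consecutive floats at an aligned offset fall in the same block. The sketch below counts how many distinct block reads a set of loads needs once duplicates are merged. The function names are hypothetical and this is not the compiler's CSE pass, only the arithmetic behind it.

```c
#include <assert.h>
#include <stddef.h>

/* One oword is 16 bytes, i.e. four 32-bit floats. */
#define TOY_BLOCK_BYTES 16u

/* Index of the oword block that contains a given byte offset. */
unsigned toy_block_of(unsigned byte_offset)
{
    return byte_offset / TOY_BLOCK_BYTES;
}

/* Number of block reads needed when loads of the same block are merged. */
size_t toy_count_block_reads(const unsigned *byte_offsets, size_t n)
{
    size_t reads = 0;

    assert(byte_offsets != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int seen = 0;

        /* A load is redundant if an earlier load hit the same block. */
        for (size_t j = 0; j < i; j++) {
            if (toy_block_of(byte_offsets[j]) == toy_block_of(byte_offsets[i])) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            reads++;
    }
    return reads;
}
```

For offsets 0, 4, 8 and 12 this yields a single read, while four floats starting at offset 8 straddle two blocks and need two.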
* i965: Bind UBOs as surfaces like we do for pull constants. (Eric Anholt, 2012-08-07, 6 files, -3/+110)
  v2: Comment fix, drop extraneous parens (review by Kenneth)

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add an offset argument to constant buffer setup. (Eric Anholt, 2012-08-07, 5 files, -6/+11)
  We'll use this for UBO surfaces.

  Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add a "ubo_load" expression type for fetches from UBOs. (Eric Anholt, 2012-08-07, 3 files, -0/+13)
  Drivers will probably want to be able to take UBO references in a shader like:

      uniform ubo1 {
          float a;
          float b;
          float c;
          float d;
      }

      void main() {
          gl_FragColor = vec4(a, b, c, d);
      }

  and generate a single aligned vec4 load out of the UBO. For intel, this involves recognizing the shared offset of the aligned loads and CSEing them out. Obviously that involves breaking things down to loads from an offset from a particular UBO first. Thus, the driver doesn't want to see

      variable_ref(ir_variable("a")),

  and even more so does it not want to see

      array_ref(record_ref(variable_ref(ir_variable("a")), "field1"), variable_ref(ir_variable("i"))).

  where a.field1[i] is a row_major matrix.

  Instead, we're going to make a lowering pass to break UBO references down to expressions that are obvious to codegen, and amenable to merging through CSE.

  v2: Fix some partial thoughts in the ir_binop comment (review by Kenneth)

  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Replace VersionMajor/VersionMinor with a Version field. (Eric Anholt, 2012-08-07, 4 files, -12/+4)
  As we get into supporting GL 3.x core, we come across more and more features of the API that depend on the version number as opposed to just the extension list. This will let us more sanely do version checks than "(VersionMajor == 3 && VersionMinor >= 2) || VersionMajor >= 4".

  v2: Fix a bad <= 30 check.

  Reviewed-by: Kenneth Graunke <[email protected]>
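The simplification this message describes can be illustrated by comparing the two styles of check. The sketch assumes the combined field is encoded as major * 10 + minor, which the "<= 30" remark suggests but the message does not state outright; all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed encoding of the combined field: major * 10 + minor. */
unsigned toy_make_version(unsigned major, unsigned minor)
{
    assert(minor < 10);
    return major * 10 + minor;
}

/* The old-style check quoted in the commit message (GL 3.2 or later). */
bool toy_at_least_32_split(unsigned major, unsigned minor)
{
    return (major == 3 && minor >= 2) || major >= 4;
}

/* The same check against a single combined field. */
bool toy_at_least_32_combined(unsigned version)
{
    return version >= 32;
}
```

Both predicates agree for every major/minor pair with a single-digit minor, and the combined form leaves less room for an off-by-one in the major/minor logic.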
* intel: Fix compiler warnings from winsys msaa. (Eric Anholt, 2012-08-07, 2 files, -3/+1)
* intel: Advertise multisample DRI2 configs on gen >= 6 (Chad Versace, 2012-08-07, 1 file, -3/+51)
  This turns on window system MSAA.

  This patch changes the id of many GLX visuals and configs, but that couldn't be prevented. I attempted to preserve the id's of extant configs by appending the multisample configs to the end of the extant ones. But somewhere, perhaps in the X server, the configs are reordered with multisample configs interspersed among the singlesample ones.

  Test results:
    Tested with xonotic and `glxgears -samples 1` on Ivybridge.

    No piglit regressions on Ivybridge.

    On Sandybridge, passes 68/70 of oglconform's winsys multisample tests. The two failing tests are:
        multisample(advanced.pixelmap.depth)
        multisample(advanced.pixelmap.depthCopyPixels)
    These tests hang the gpu (on kernel 3.4.6) due to a glDrawPixels/glReadPixels pair on an MSAA depth buffer. I don't expect realworld apps to do that, so I'm not too concerned about the hang.

    On Ivybridge, passes 69/70. The failing case is multisample(advanced.line.changeWidth).

  Reviewed-by: Eric Anholt <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
* intel: Clarify intel_screen_make_configs (Chad Versace, 2012-08-07, 1 file, -20/+16)
  This function felt sloppy, so this patch cleans it up a little bit.

  - Rename `color` to `i`. It is not a color value, only an iterator int.
  - Move `depth_bits[0] = 0` into the non-accum loop because that is where it is used. The accum loop later overwrites depth_bits[0].
  - Rename `depth_factor` to `num_depth_stencil_bits`.
  - Redefine `msaa_samples_array` as static const because it is never modified. Rename to `singlesample_samples`.

  Reviewed-by: Eric Anholt <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
* dri: Simplify use of driConcatConfigs (Chad Versace, 2012-08-07, 4 files, -14/+9)
  If either argument to driConcatConfigs(a, b) is null or the empty list, then simply return the other argument as the resultant list.

  All callers were accomplishing that same behavior anyway. And each caller accomplished it with the same pattern. So this patch moves that external pattern into the function.

  Reviewed-by: <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
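A minimal sketch of the behavior this message describes, written against generic NULL-terminated pointer lists. `toy_concat` is a hypothetical stand-in and does not reproduce the signature or the ownership rules of `driConcatConfigs`; in particular, this sketch never frees its inputs.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of entries in a NULL-terminated pointer list. */
static size_t toy_list_len(void **list)
{
    size_t n = 0;

    assert(list != NULL);
    while (list[n] != NULL)
        n++;
    return n;
}

/* Concatenates two NULL-terminated lists. If either list is NULL or empty,
 * the other one is returned unchanged, so callers need no special cases.
 * Otherwise a newly allocated list is returned (NULL on allocation failure). */
void **toy_concat(void **a, void **b)
{
    if (a == NULL || a[0] == NULL)
        return b;
    if (b == NULL || b[0] == NULL)
        return a;

    size_t na = toy_list_len(a);
    size_t nb = toy_list_len(b);
    void **all = malloc((na + nb + 1) * sizeof *all);

    if (all == NULL)
        return NULL;
    for (size_t i = 0; i < na; i++)
        all[i] = a[i];
    for (size_t i = 0; i < nb; i++)
        all[na + i] = b[i];
    all[na + nb] = NULL;   /* keep the result NULL-terminated */
    return all;
}
```

Folding the empty-list checks into the function means a call site can write `configs = toy_concat(configs, new_configs)` in a loop without first testing whether `configs` is still NULL.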
* intel: Refactor creation of DRI2 configs (Chad Versace, 2012-08-07, 1 file, -91/+98)
  DRI2 configs were constructed in intelInitScreen2. That function already does too much, so move verbatim the code for creating configs to a new function, intel_screen_make_configs.

  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
* intel: Downsample on DRI2 flush (Chad Versace, 2012-08-07, 1 file, -0/+31)
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
* intel: Support mapping multisample miptrees (Chad Versace, 2012-08-07, 2 files, -6/+126)
  Add two new functions: intel_miptree_{map,unmap}_multisample, to which intel_miptree_{map,unmap} dispatch. Only mapping flat, renderbuffer-like miptrees is supported.

  Reviewed-by: Eric Anholt <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
* intel: Refactor use of intel_miptree_map (Chad Versace, 2012-08-07, 1 file, -15/+50)
  Move the opencoded construction and destruction of intel_miptree_map into new functions, intel_miptree_attach_map and intel_miptree_release_map. This patch prevents code duplication in a future commit that adds support for mapping multisample miptrees.

  Reviewed-by: Eric Anholt <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
* intel: Refactor intel_miptree_map/unmap (Chad Versace, 2012-08-07, 1 file, -17/+50)
  Move the body of intel_miptree_map into a new function, intel_miptree_map_singlesample. Now intel_miptree_map dispatches to the new function. A future commit adds a multisample variant.

  Ditto for intel_miptree_unmap.

  Reviewed-by: Eric Anholt <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
* i965: Mark needed downsamples for msaa winsys buffers (Chad Versace, 2012-08-07, 4 files, -6/+29)
  Add function intel_renderbuffer_set_needs_downsample. It is a no-op except on multisample winsys buffers shared with DRI2.

  Mark the needed downsamples with the new function at two locations:
  - Immediately after drawing is complete.
  - After blitting.

  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
* intel: Define functions for up/downsampling on miptrees (Chad Versace, 2012-08-07, 1 file, -2/+72)
  Flesh out the stub functions intel_miptree_{up,down}sample.

  Reviewed-by: Eric Anholt <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
* i965: Add function brw_blorp_blit_miptrees (Chad Versace, 2012-08-07, 2 files, -4/+37)
  Define a function, brw_blorp_blit_miptrees, that simply wraps brw_blorp_blit_params + brw_blorp_exec with C calling conventions. This enables intel_miptree.c, in a following commit, to perform blits with blorp for the purpose of downsampling multisample miptrees.

  Reviewed-by: Eric Anholt <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
* intel: Allocate miptree for multisample DRI2 buffers (Chad Versace, 2012-08-07, 3 files, -8/+162)
  Immediately after obtaining, with DRI2GetBuffersWithFormat, the DRM buffer handle for a DRI2 buffer, we wrap that DRM buffer handle with a region and a miptree. This patch additionally allocates an accompanying multisample miptree if the DRI2 buffer is multisampled.

  Since we do not yet advertise multisample GL configs, the code for allocating the multisample miptree is currently inactive.

  This patch adds the following fields to intel_mipmap_tree:
      singlesample_mt
      needs_downsample
  and the following function stubs:
      intel_miptree_downsample
      intel_miptree_upsample

  Reviewed-by: Eric Anholt <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
* intel: Refactor creation of hiz and mcs miptrees (Chad Versace, 2012-08-07, 2 files, -16/+19)
  Move the logic for creating the ancillary hiz and mcs miptrees for winsys and non-texture renderbuffers from intel_alloc_renderbuffer_storage to intel_miptree_create_for_renderbuffer. Let's try to isolate complex miptree logic to intel_mipmap_tree.c.

  Without this refactor, code duplication would be required along the intel_process_dri2_buffer codepath in order to create the mcs miptree.

  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
* intel: Set num samples for winsys renderbuffers (Chad Versace, 2012-08-07, 3 files, -11/+21)
  Add a new param, num_samples, to intel_create_renderbuffer and intel_create_private_renderbuffer. No multisample GL config is yet advertised, so the value of num_samples is currently 0.

  For server-owned winsys buffers, gl_renderbuffer::NumSamples is not yet used.

  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Paul Berry <[email protected]> (v1)
  Signed-off-by: Chad Versace <[email protected]>
* intel: Refactor quantize_num_samples (Chad Versace, 2012-08-07, 2 files, -3/+7)
  Rename quantize_num_samples to intel_quantize_num_samples and change the first param from struct intel_context* to struct intel_screen*. The function will later be used by intelCreateBuffer, which is not bound to any context but is bound to a screen.

  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]> (v1)
  Signed-off-by: Chad Versace <[email protected]>
* intel: Update stale comment for intel_miptree_slice::map (Chad Versace, 2012-08-07, 1 file, -2/+2)
  The comment referred to intel_tex_image_map/unmap, but should more accurately refer to intel_miptree_map/unmap.

  Signed-off-by: Chad Versace <[email protected]>
* i965: add more Haswell PCI IDs (Paulo Zanoni, 2012-08-07, 2 files, -4/+98)
  Signed-off-by: Paulo Zanoni <[email protected]>
  Reviewed-by: Rodrigo Vivi <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* dri2: Fix bug in attribute handling for non-desktop OpenGL contexts (Ian Romanick, 2012-08-06, 1 file, -6/+17)
  Previously an error would be generated if any attributes were specified when creating a non-desktop OpenGL context. This was a mistake, and it will prevent old drivers from working with new EGL libraries that add support for the createContextAttribs interface.

  Instead, match the behavior of EGL_KHR_create_context: allow versions that make sense, reject non-zero flags.

  NOTE: This is a candidate for the 8.0 branch.

  Signed-off-by: Ian Romanick <[email protected]>
  Cc: Kristian Høgsberg <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
* i965: Allocate dummy slots for point sprites before computing VUE map. (Kenneth Graunke, 2012-08-06, 1 file, -2/+2)
  Commit f0cecd43d6b6d moved the VUE map computation to be only once, at VS compile time. However, it did so in slightly the wrong place: it made the one call to brw_vue_compute_map happen right before the allocation of dummy slots for replaced point sprite coordinates, causing a different VUE map to be generated (at least on Ironlake).

  Fixes a regression in Piglit's point-sprite test on Ironlake.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46489
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965/vs: Don't clobber sampler message MRFs with subexpressions. (Kenneth Graunke, 2012-08-06, 1 file, -17/+42)
  See the preceding commit for a description of the problem.

  NOTE: This is a candidate for stable release branches.

  v2: Use a separate dPdx variable rather than reusing the lod src_reg.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52129
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Don't clobber sampler message MRFs with subexpressions. (Kenneth Graunke, 2012-08-06, 2 files, -70/+71)
  Consider a texture call such as:

      textureLod(s, coordinate, log2(...))

  First, we begin setting up the sampler message by loading the texture coordinates into MRFs, starting with m2. Then, we realize we need the LOD, and go to compute it with:

      ir->lod_info.lod->accept(this);

  On Gen4-5, this will generate a SEND instruction to compute log2(), loading the operand into m2, and clobbering our texcoord.

  Similar issues exist on Gen6+. For example, nested texture calls:

      textureLod(s1, c1, texture(s2, c2).x)

  Any texturing call where evaluating the subexpression trees for LOD or shadow comparator would generate SEND instructions could potentially break. In some cases (like register spilling), we get lucky and avoid the issue by using non-overlapping MRF regions. But we shouldn't count on that.

  Fixes four Piglit test regressions on Gen4-5:
  - glsl-fs-shadow2DGradARB-{01,04,07,cumulative}

  NOTE: This is a candidate for stable release branches.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52129
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
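The ordering hazard in this message can be shown with a toy model of the message registers. Everything below is hypothetical: a plain float array stands in for the MRFs, and `toy_send_math` mimics a SEND whose operand lands in m2. Evaluating the subexpression into a temporary before loading any message register is one way to avoid the clobber; it is presented here as an illustration and is not taken from the patch itself.

```c
#include <assert.h>

#define TOY_NUM_MRF  16
#define TOY_MSG_BASE 2   /* messages start at m2, as in the commit text */

static float toy_mrf[TOY_NUM_MRF];

/* Toy SEND-style math op: its operand is loaded into m2, then a made-up
 * result is returned (the actual math is irrelevant to the hazard). */
float toy_send_math(float operand)
{
    toy_mrf[TOY_MSG_BASE] = operand;
    return operand * 0.5f;
}

/* Hazardous order: the coordinate is loaded first, then the LOD
 * subexpression is evaluated and overwrites m2. Returns what m2 holds. */
float toy_setup_clobbering(float coord, float lod_arg)
{
    assert(TOY_MSG_BASE + 1 < TOY_NUM_MRF);
    toy_mrf[TOY_MSG_BASE] = coord;
    toy_mrf[TOY_MSG_BASE + 1] = toy_send_math(lod_arg);
    return toy_mrf[TOY_MSG_BASE];
}

/* Safe order: evaluate the subexpression into a temporary first, then
 * load the message registers. Returns what m2 holds. */
float toy_setup_safe(float coord, float lod_arg)
{
    assert(TOY_MSG_BASE + 1 < TOY_NUM_MRF);
    float lod = toy_send_math(lod_arg);

    toy_mrf[TOY_MSG_BASE] = coord;
    toy_mrf[TOY_MSG_BASE + 1] = lod;
    return toy_mrf[TOY_MSG_BASE];
}
```

In the hazardous variant m2 ends up holding the math operand instead of the texture coordinate, which matches the symptom described for Gen4-5.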
* i965/fs: Factor out texcoord setup into a helper function. (Kenneth Graunke, 2012-08-06, 2 files, -11/+28)
  With the textureRect support and GL_CLAMP workarounds, it's grown sufficiently that it deserves its own function. Separating it out makes the original function much more readable.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>