path: root/src/mesa/drivers
Commit message | Author | Age | Files | Lines
* i965/blorp: Use 8k chunk size for urb allocation | Topi Pohjolainen | 2016-04-21 | 1 | -5/+14
    Previously, we hardcoded "VS URB Starting Address" to 2 (in 8kB chunks), which meant VS URB data would start at an offset of 16kB. However, on Haswell GT3 and Gen8+, we allocate the first 32kB for the push constant region. This means that the PS push constant and VS URB data regions overlap, which can lead to corruption.

    v2 (Ken): Better description of the change, and do not change vs_size from 2 to 1.

    Cc: [email protected]
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
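
The overlap described in the commit above is plain chunk arithmetic. The following C sketch only illustrates that arithmetic; the helper names are hypothetical and do not come from the i965 sources, and the 8 kB chunk size and 32 kB push constant region are the figures quoted in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All names here are illustrative, not taken from the i965 sources. */
enum { URB_CHUNK_KB = 8 };

/* Number of 8 kB chunks needed to cover size_kb, rounding up. */
static uint32_t kb_to_urb_chunks(uint32_t size_kb)
{
    return (size_kb + URB_CHUNK_KB - 1) / URB_CHUNK_KB;
}

/* True if VS URB data starting at chunk vs_start_chunk begins inside the
 * push constant region that occupies the first push_const_kb of the URB. */
static bool vs_urb_overlaps_push_constants(uint32_t vs_start_chunk,
                                           uint32_t push_const_kb)
{
    return vs_start_chunk * URB_CHUNK_KB < push_const_kb;
}
```

With a hardcoded start of 2 chunks (16 kB) and a 32 kB push constant region, the check reports an overlap; deriving the start from the region size, `kb_to_urb_chunks(32)`, gives chunk 4 and no overlap.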
* i965/blorp/gen7: Prepare re-using for gen8 | Topi Pohjolainen | 2016-04-21 | 1 | -2/+4
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/blorp: Let compiler calculate the vertex buffer size | Topi Pohjolainen | 2016-04-21 | 1 | -21/+10
    Currently the size is sizeof(float) times too large. One reserves GEN6_BLORP_VBO_SIZE many floats whereas GEN6_BLORP_VBO_SIZE stands for the size of vertex buffer in bytes.

    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
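
The size mix-up described above can be reproduced in a few lines of C. This is a sketch with made-up vertex data and a hypothetical EXAMPLE_VBO_SIZE_BYTES constant standing in for GEN6_BLORP_VBO_SIZE; it is not the driver's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vertex data: three vertices with two floats each. */
static const float example_vertices[] = {
    0.0f, 0.0f,
    1.0f, 0.0f,
    0.0f, 1.0f,
};

/* A size in bytes, standing in for a constant like GEN6_BLORP_VBO_SIZE. */
enum { EXAMPLE_VBO_SIZE_BYTES = sizeof(example_vertices) };

/* The bug pattern: a byte count is used as an element count, so the
 * reservation is sizeof(float) times larger than needed. */
static size_t oversized_reservation(void)
{
    float scratch[EXAMPLE_VBO_SIZE_BYTES];
    return sizeof(scratch);
}

/* Letting the compiler compute the size from the data itself. */
static size_t compiler_computed_size(void)
{
    return sizeof(example_vertices);
}
```

Taking `sizeof` of the actual vertex array keeps the reserved size in step with the data without a separately maintained constant.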
* i965/gen8: Expose state base address setup | Topi Pohjolainen | 2016-04-21 | 2 | -2/+4
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen8: Expose surface state helpers | Topi Pohjolainen | 2016-04-21 | 2 | -25/+41
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen9: Use correct size for DS_STATE | Topi Pohjolainen | 2016-04-21 | 1 | -4/+18
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix interpolateAtSample() on single sampled buffers. | Kenneth Graunke | 2016-04-20 | 1 | -0/+15
    Fixes dEQP-GLES31.functional.shaders.multisample_interpolation tests:
    - interpolate_at_sample.non_multisample_buffer.sample_n_default_framebuffer
    - interpolate_at_sample.non_multisample_buffer.sample_n_singlesample_rbo
    - interpolate_at_sample.non_multisample_buffer.sample_n_singlesample_texture

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Fix gl_SampleMaskIn[] in per-sample shading mode. | Kenneth Graunke | 2016-04-20 | 3 | -2/+42
    The coverage mask is not sufficient - in per-sample mode, we also need to AND with a mask representing the samples being processed by the current fragment shader invocation.

    Fixes 18 dEQP-GLES31.functional.shaders.sample_variables tests:
        sample_mask_in.bit_count_per_sample.multisample_{rbo,texture}_{1,2,4,8}
        sample_mask_in.bit_count_per_two_samples.multisample_{rbo,texture}_{4,8}
        sample_mask_in.bits_unique_per_sample.multisample_{rbo,texture}_{1,2,4,8}
        sample_mask_in.bits_unique_per_two_samples.multisample_{rbo,texture}_{4,8}

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
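
The masking step described above can be modelled on the CPU as follows. This is a sketch only: the function names are hypothetical, and the real driver emits GPU instructions rather than running C like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit for a single sample, for invocations that process one sample. */
static uint32_t single_sample_bit(unsigned sample_id)
{
    assert(sample_id < 32);
    return UINT32_C(1) << sample_id;
}

/* Model of gl_SampleMaskIn[0]: in per-sample mode the coverage mask is
 * ANDed with the samples handled by this invocation; otherwise the
 * coverage mask is used unchanged. */
static uint32_t sample_mask_in(uint32_t coverage_mask, bool per_sample,
                               uint32_t invocation_samples)
{
    return per_sample ? (coverage_mask & invocation_samples) : coverage_mask;
}
```

For example, with full 4x coverage (0xF) and an invocation that handles only sample 2, the per-sample result is 0x4 rather than 0xF.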
* i965: Only enable oMask output when there's a multisample FBO. | Kenneth Graunke | 2016-04-20 | 1 | -1/+1
    The ARB_sample_shading specification says that setting gl_SampleMask bits to 0 means that the corresponding sample "should be considered uncovered for the purposes of multisample fragment operations (Section 4.1.3)."

    The OpenGL 4.4 specification, section 17.3.3 ("Multisample Fragment Operations") specifies:

        "No changes to the fragment alpha or coverage values are made at this step if MULTISAMPLE is disabled, or if the value of SAMPLE_BUFFERS is not one."

    oMask output alters coverage masks and can kill pixels. We need to disable it in the above case, which conveniently corresponds to key->multisample_fbo being false.

    Khronos bug #12188 also spells this out clearly:
    https://cvs.khronos.org/bugzilla/show_bug.cgi?id=12188

    Fixes two Piglit tests:
        tests/spec/arb_sample_shading/builtin-gl-sample-mask-simple 0
        tests/spec/arb_sample_shading/builtin-gl-sample-mask 0

    Fixes 21 ES3 conformance tests:
        ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_zero
        ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_0
        ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_1
        ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_2
        ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_3
        ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_7
        ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_zero
        ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_3
        ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_4
        ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_5
        ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_7
        ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_zero
        ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_2
        ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_3
        ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_4
        ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_6
        ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_zero
        ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_0
        ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_2
        ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_5
        ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_7

    Fixes 9 dEQP-GLES31.functional.shaders.sample_variables tests:
        sample_mask.discard_half_per_pixel.default_framebuffer
        sample_mask.discard_half_per_pixel.singlesample_rbo
        sample_mask.discard_half_per_pixel.singlesample_texture
        sample_mask.discard_half_per_sample.default_framebuffer
        sample_mask.discard_half_per_sample.singlesample_rbo
        sample_mask.discard_half_per_sample.singlesample_texture
        sample_mask.discard_half_per_two_samples.default_framebuffer
        sample_mask.discard_half_per_two_samples.singlesample_rbo
        sample_mask.discard_half_per_two_samples.singlesample_texture

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Generalize wm_key->compute_sample_id to wm_key->multisample_fbo. | Kenneth Graunke | 2016-04-20 | 3 | -7/+6
    I'm going to need a key entry meaning "we have a multisample FBO, and multisampling is enabled" in an upcoming patch.

    This is basically wm_key->compute_sample_id, except that it also checks that the SAMPLE_ID system value is read. The only use of wm_key->compute_sample_id is in emit_sampleid_setup(), which is only called when handling the SAMPLE_ID system value. So we can just eliminate the check and generalize the field.

    v2: Also update the Vulkan driver.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Delete now dead persample_2x FS program key flag. | Kenneth Graunke | 2016-04-20 | 2 | -5/+0
    This was only used by the old gl_SampleID calculations. The new code doesn't need to handle 2x specially.

    v2: Delete it from the Vulkan driver, too.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Simplify gl_SampleID setup on Gen8+. | Kenneth Graunke | 2016-04-20 | 1 | -5/+37
    On Gen7+, the thread payload provides the sample ID - we can read it in two instructions, without any elaborate calculations. We don't even need a state dependency - this will properly produce zero in the non-MSAA case.

    Unfortunately, we need the state flag anyway, so we may as well continue to use it to produce a single MOV 0 instead of SHR/AND.

    For some reason, the sample ID field is always zero on Gen7/7.5, so we can't use this yet. However, it works fine on Gen8+. So, land the code and use it where it's working, and leave a TODO for later.

    v2: Fix register types in the comment (caught by Matt Turner!).

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Flip key->compute_sample_id check. | Kenneth Graunke | 2016-04-20 | 1 | -7/+7
    This just moves the simple case first.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Properly handle integer types in opt_vector_float(). | Kenneth Graunke | 2016-04-20 | 1 | -4/+18
    Previously, opt_vector_float() always interpreted MOV sources as floating point, and always created a MOV with a F-type destination.

    This meant that we could mess up sequences of integer loads, such as:

        mov vgrf6.0.x:D, 0D
        mov vgrf6.0.y:D, 1D
        mov vgrf6.0.z:D, 2D
        mov vgrf6.0.w:D, 3D

    Here, integer 0/1/2/3 become approximately 0.0f, so we generated:

        mov vgrf6.0:F, [0F, 0F, 0F, 0F]

    which is clearly wrong. We can properly handle this by converting integer values to float (rather than bitcasting), and emitting a type converting MOV:

        mov vgrf6.0:D, [0F, 1F, 2F, 3F]

    To do this, first see if the integer values (converted to float) are representable. If so, we use a D-type MOV. If not, we then try the floating point values and an F-type MOV. We make zero not impose type restrictions. This is important because 0D would imply a D-type MOV, but is often used in sequences such as MOV 0D, MOV 0x3f800000D, where we want to use an F-type MOV.

    Fixes about 54 dEQP-GLES2 failures with the vec4 VS backend. This recently became visible due to changes in opt_vector_float() which made it optimize more cases, but it was a pre-existing bug.

    Apparently it also manages to turn more integer loads into VFs, producing the following shader-db statistics on Haswell:

        total instructions in shared programs: 7084195 -> 7082191 (-0.03%)
        instructions in affected programs: 246027 -> 244023 (-0.81%)
        helped: 1937

        total cycles in shared programs: 65669642 -> 65651968 (-0.03%)
        cycles in affected programs: 531064 -> 513390 (-3.33%)
        helped: 1177

    v2: Handle the type of zero better.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Iago Toral Quiroga <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
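
The difference between bitcasting and converting an integer immediate, which is the root of the bug described above, can be seen in a short C sketch. The helper names are hypothetical, and the sketch assumes 32-bit IEEE 754 floats; it does not model the restricted VF immediate format itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a 32-bit integer as a float (what the old
 * code effectively did with D-type immediates). */
static float bitcast_to_float(int32_t bits)
{
    float f;
    assert(sizeof(f) == sizeof(bits));
    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* Convert the integer's numeric value to float (what the fix does
 * before checking representability). */
static float convert_to_float(int32_t value)
{
    return (float)value;
}
```

Bitcasting small integers such as 1, 2 or 3 yields tiny denormal values that round to 0.0f as vector-float immediates, whereas converting them yields 1.0f, 2.0f and 3.0f. Zero is the one value whose bit pattern means the same thing under both interpretations, and 0x3f800000 bitcasts to exactly 1.0f, which is why a sequence like MOV 0D, MOV 0x3f800000D is better served by an F-type MOV.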
* i965: Make opt_vector_float() only handle non-type-conversion MOVs. | Kenneth Graunke | 2016-04-20 | 1 | -2/+5
    We don't handle this properly - we'd have to perform the type conversion before trying to convert the value to a VF. While we could do that, it doesn't seem particularly useful - most vector loads should be consistently typed (all float or all integer).

    As a special case, we do allow type-converting MOVs of integer 0, as it's represented the same regardless of the type. I believe this case does actually come up.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Iago Toral Quiroga <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Fold vectorize_mov() back into the one caller. | Kenneth Graunke | 2016-04-20 | 2 | -28/+16
    After the previous patch, this helper is only called in one place. So, just fold it back in - there are a lot of parameters here and not much code.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Iago Toral Quiroga <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Rework opt_vector_float() control flow. | Kenneth Graunke | 2016-04-20 | 1 | -27/+34
    This reworks opt_vector_float() so that there's only one place that flushes out any accumulated state and emits a VF.

    v2: Don't break the sequence for non-representable numbers - just skip recording their values. Only break it for non-MOVs or register changes.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Iago Toral Quiroga <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* nir: rename nir_foreach_block*() to nir_foreach_block*_call() | Connor Abbott | 2016-04-20 | 6 | -10/+10
    Reviewed-by: Jason Ekstrand <[email protected]>
* i965/vec4: Always split uniforms in array_access_to_pull_constants | Jason Ekstrand | 2016-04-20 | 1 | -1/+3
    Normally, we split uniforms at the end but in Vulkan, we bail because we don't want pull constants. However, we still need them split because pack_uniforms relies on it.

    I really don't like this patch not because it doesn't work (it does) but because now that we're using MOV_INDIRECT, uniform numbers and sizes don't really matter anymore. In the FS backend, uniform splitting and packing is handled all at once (actual re-assignment of locations happens later) and we really should do it that way in vec4 eventually as well.

    Reviewed-by: Iago Toral Quiroga <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
* i965/vec4: Use the correct offset for the swizzle shift in push constants | Jason Ekstrand | 2016-04-20 | 1 | -1/+1
    This was actually caught by Ken in review the first time around but somehow didn't get fixed before the patches were pushed. :-(

    Reviewed-by: Iago Toral Quiroga <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
* i965/vec4: Use nir_intrinsic_base in the load_uniform implementation | Jason Ekstrand | 2016-04-20 | 1 | -1/+1
    We shouldn't be reading the const_index directly

    Reviewed-by: Iago Toral Quiroga <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
* i965: Define miptree map functions static (trivial) | Ben Widawsky | 2016-04-18 | 1 | -2/+2
    They were already declared as such. It was changed here:

        commit 31f0967fb50101437d2568e9ab9640ffbcbf7ef9
        Author: Ian Romanick <[email protected]>
        Date: Wed Sep 2 14:43:18 2015 -0700

            i965: Make intel_miptree_map_raw static

    Cc: Ian Romanick <[email protected]>
    Signed-off-by: Ben Widawsky <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* meta: Don't botch color masks when changing drawbuffers. | Kenneth Graunke | 2016-04-18 | 1 | -7/+75
    Color clears should respect each drawbuffer's color mask state. Previously, we tried to leave the color mask untouched. However, _mesa_meta_drawbuffers_from_bitfield() ended up rebinding all the color drawbuffers in a different order, so we ended up pairing drawbuffers with the wrong color mask state.

    The new _mesa_meta_drawbuffers_and_colormask() function does the same job as the old _mesa_meta_drawbuffers_from_bitfield(), but also rearranges the color mask state to match the new drawbuffer configuration. This code was largely ripped off from Gallium's st_Clear code.

    This fixes ES31-CTS.draw_buffers_indexed.color_masks, which binds up to 8 drawbuffers, sets color masks for each, and then calls glClearBufferfv to clear each buffer individually. ClearBuffer causes us to rebind only one drawbuffer, at which point we used ctx->Color.ColorMask[0] (draw buffer 0's state) for everything.

    We could probably delete _mesa_meta_drawbuffers_from_bitfield(), but I'd rather not think about the i965 fast clear code. Topi is rewriting a bunch of that soon anyway, so let's delete it then.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94847
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* meta: Don't smash ColorMask when using MESA_META_COLOR_MASK save bit. | Kenneth Graunke | 2016-04-18 | 2 | -5/+4
    This allows meta operations to inspect the existing color mask, and then do their own smashing. BlitFramebuffer and Clear already override the color mask, so this was also redundant.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Don't allow OOB array access of images | Jason Ekstrand | 2016-04-15 | 1 | -15/+11
    We have had a guard against OOB array access of images on IVB for a long time, but it can actually cause hangs on any GPU generation. This can happen due to getting an untyped SURFACE_STATE for a typed message. We didn't used to hit this with the piglit test on anything other than IVB because the OOB in the test would cause us to go past the top of the pull constant UBO and we would get a surface index of 0 which was always a valid surface. Now that we're pushing small arrays, we can end up grabbing garbage from the GRF and going to some random index which causes a hang. The solution is to just do the bounds check on all hardware.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94944
    Reviewed-by: Francisco Jerez <[email protected]>
    Tested-by: Mark Janes <[email protected]>
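
One way to guard an image array index is to clamp it into range before it is used to select a surface. The commit message above does not say which form the check takes, so the clamp below is an assumption, and the function name is hypothetical.

```c
#include <assert.h>

/* Clamp an image array index to [0, array_size - 1] so that an
 * out-of-bounds index can never select an arbitrary surface.
 * Clamping is an assumed strategy, not confirmed by the commit text. */
static unsigned clamp_image_index(unsigned index, unsigned array_size)
{
    assert(array_size > 0);
    return index < array_size ? index : array_size - 1;
}
```

In-range indices pass through unchanged; anything at or beyond `array_size` resolves to the last valid element instead of whatever happens to be in the register.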
* i965/vec4: Support full std140 layout for push constants | Jason Ekstrand | 2016-04-15 | 1 | -5/+25
    Up until now, we have been able to assume that all push constants are vec4-aligned because this is what the GL driver gives us. In Vulkan, we need to be able to support full std140 because we get the layout from the client.

    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Handle MOV_INDIRECT in pack_uniform_registers | Jason Ekstrand | 2016-04-15 | 1 | -0/+18
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Add support for SHADER_OPCODE_MOV_INDIRECT | Jason Ekstrand | 2016-04-15 | 2 | -0/+68
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Use can_do_writemask in can_reswizzle | Jason Ekstrand | 2016-04-15 | 1 | -3/+5
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Move can_do_writemask to vec4_instruction | Jason Ekstrand | 2016-04-15 | 3 | -30/+30
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/surface_formats: Update some formats for more recent gens | Jason Ekstrand | 2016-04-15 | 1 | -12/+12
    The surface format table hasn't entirely been kept up-to-date. This commit marks a couple more compressed formats as sampleable on gen8+ and adds the A4B4G4R4 format as renderable on gen9.

    Reviewed-by: Kenneth Graunke <[email protected]>
* xlib: remove MESA_GLX_VISUAL_HACK | John Sheu | 2016-04-15 | 1 | -23/+19
    This removes a hack introduced in 1999 in the first version of fakeglx.c, with the comment:

        /* XXX revisit this after 3.0 is finished. */

    Mesa 4.0 was released in 2001. It is now 2016, and Mesa 11.0 was released last year.

    Reviewed-by: Alejandro Piñeiro <[email protected]>
* xlib: fix leaks of returned values from XGetVisualInfo | John Sheu | 2016-04-15 | 1 | -8/+21
    Reviewed-by: Alejandro Piñeiro <[email protected]>
* xlib: fix memory leak of and remove vishandle from XMesaVisualInfo | John Sheu | 2016-04-15 | 2 | -39/+24
    The vishandle member of XMesaVisualInfo is used to support the comparison of XVisualInfo instances by pointer value, in find_glx_visual(). The comparison however will always be false, as in every case the comparison is made, the VisualInfo instance being compared to is a new allocation passed in through a GLX API call. In addition, the XVisualInfo instance pointed to by vishandle is itself never freed, causing a memory leak. Since vishandle is essentially useless, we just remove it and thereby also fix the leak.

    Reviewed-by: Alejandro Piñeiro <[email protected]>
* xlib: do not cache return value of glXChooseVisual/glXGetVisualFromFBConfig | John Sheu | 2016-04-15 | 1 | -18/+8
    The returned XVisualInfo from glXChooseVisual/glXGetVisualFromFBConfig is being cached in XMesaVisual.vishandle (and unconditionally overwritten on subsequent calls). However, these entry points are specified to return XVisualInfo instances to be owned by the caller and freed with XFree(), so the return values should not be retained.

    With this change, XMesaVisual.vishandle is essentially unused and will be removed in a subsequent change.

    v2: update commit message

    Reviewed-by: Alejandro Piñeiro <[email protected]>
* i965: Expose the surface format table | Jason Ekstrand | 2016-04-14 | 3 | -18/+48
    Reviewed-by: Kenneth Graunke <[email protected]>
* dri: Fix robust context creation via EGL attribute | Chad Versace | 2016-04-14 | 1 | -2/+23
    driCreateContextAttribs() emits an error if bit __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS is set for an ES context. But, EGL_EXT_create_context_robustness and EGL 1.5 both allow creation of robust ES contexts. One requests a robust ES context by setting the EGL_CONTEXT_OPENGL_ROBUST_ACCESS *attribute*, which Mesa's EGL layer translates into the __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS *bit*.

    Reviewed-by: Marek Olšák <[email protected]>
* i965: Push everything if pull_param == NULL | Jason Ekstrand | 2016-04-14 | 2 | -2/+14
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Push small uniform arrays | Jason Ekstrand | 2016-04-14 | 1 | -23/+53
    Unfortunately, this also means that we need to use a slightly different algorithm for assign_constant_locations. The old algorithm worked based on the assumption that each read of a uniform value read exactly one float. If it encountered a MOV_INDIRECT, it would immediately bail and push the whole thing. Since we can now read ranges using MOV_INDIRECT, we need to be able to push a series of floats without breaking them up. To do this, we use an algorithm similar to the one in split_virtual_grfs.

    Reviewed-by: Kristian Høgsberg <[email protected]>
    Acked-by: Kenneth Graunke <[email protected]>
* i965/fs: Rename demote_pull_constants to lower_constant_loads | Jason Ekstrand | 2016-04-14 | 2 | -3/+3
    Reviewed-by: Kristian Høgsberg <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Get rid of the uniform_size array | Jason Ekstrand | 2016-04-14 | 6 | -33/+0
    Reviewed-by: Kristian Høgsberg <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constants | Jason Ekstrand | 2016-04-14 | 4 | -51/+50
    This commit moves us to an instruction based model rather than a register-based model for indirects. This is more accurate anyway as we have to emit instructions to resolve the reladdr. It's also a lot simpler because it gets rid of the recursive reladdr problem by design.

    One side-effect of this is that we need a whole new algorithm in move_uniform_array_access_to_pull_constants. This new algorithm is much more straightforward than the old one and is fairly similar to what we're already doing in the FS backend.

    Reviewed-by: Kristian Høgsberg <[email protected]>
    Acked-by: Kenneth Graunke <[email protected]>
* i965/fs: Get rid of the param_size array | Jason Ekstrand | 2016-04-14 | 4 | -15/+0
    Reviewed-by: Kristian Høgsberg <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Stop relying on param_size in assign_constant_locations | Jason Ekstrand | 2016-04-14 | 1 | -27/+17
    Now that we have MOV_INDIRECT opcodes, we have all of the size information we need directly in the opcode. With a little restructuring of the algorithm used in assign_constant_locations we don't need param_size anymore. The big thing to watch out for now, however, is that you can have two ranges overlap where neither contains the other. In order to deal with this, we make the first pass just flag what needs pulling and defer assigning pull constant locations until later.

    Reviewed-by: Kristian Høgsberg <[email protected]>
    Acked-by: Kenneth Graunke <[email protected]>
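
The troublesome case mentioned above, two ranges that overlap while neither contains the other, can be stated precisely with half-open intervals. This sketch uses a hypothetical `uniform_range` struct and helper names; it does not reflect the data structures used in assign_constant_locations.

```c
#include <assert.h>
#include <stdbool.h>

/* Half-open range [start, end) of uniform slots; illustrative only. */
struct uniform_range {
    unsigned start;
    unsigned end;
};

static bool ranges_overlap(struct uniform_range a, struct uniform_range b)
{
    return a.start < b.end && b.start < a.end;
}

static bool range_contains(struct uniform_range outer,
                           struct uniform_range inner)
{
    return outer.start <= inner.start && inner.end <= outer.end;
}

/* True when a and b share slots but neither fully contains the other. */
static bool ranges_partially_overlap(struct uniform_range a,
                                     struct uniform_range b)
{
    return ranges_overlap(a, b) &&
           !range_contains(a, b) && !range_contains(b, a);
}
```

For instance, [0, 4) and [2, 6) partially overlap, so neither range can simply be treated as a sub-range of the other when locations are assigned; [0, 8) and [2, 4) do not, because the first contains the second.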
* i965/fs: Get rid of reladdr | Jason Ekstrand | 2016-04-14 | 2 | -10/+2
    We aren't using it anymore.

    Reviewed-by: Kristian Høgsberg <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Use MOV_INDIRECT for all indirect uniform loads | Jason Ekstrand | 2016-04-14 | 2 | -40/+87
    Instead of using reladdr, this commit changes the FS backend to emit a MOV_INDIRECT whenever we need an indirect uniform load. We also have to rework some of the other bits of the backend to handle this new form of uniform load. The obvious change is that demote_pull_constants now acts more like a lowering pass when it hits a MOV_INDIRECT.

    Reviewed-by: Kristian Høgsberg <[email protected]>
    Acked-by: Kenneth Graunke <[email protected]>
* i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardware | Jason Ekstrand | 2016-04-14 | 2 | -13/+66
    While we're at it, we also add support for the possibility that the indirect is, in fact, a constant. This shouldn't happen in the common case (if it does, that means NIR failed to constant-fold something), but it's possible so we should handle it.

    Reviewed-by: Kristian Høgsberg <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Fix regs_read() for MOV_INDIRECT with a non-zero subnr | Jason Ekstrand | 2016-04-14 | 1 | -1/+1
    The subnr field is in bytes so we don't need to multiply by type_sz.

    Reviewed-by: Kristian Høgsberg <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
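
The unit mistake fixed above is easy to show with numbers. The sketch below is not the regs_read() implementation: the function names are hypothetical, and the 32-byte register size is an assumption made for the example.

```c
#include <assert.h>

/* Assumed register size in bytes; an assumption for this sketch. */
enum { EXAMPLE_REG_SIZE = 32 };

/* Registers touched when reading length_bytes starting subnr_bytes into
 * a register. subnr is already a byte offset, so it is added as-is. */
static unsigned regs_spanned(unsigned subnr_bytes, unsigned length_bytes)
{
    return (subnr_bytes + length_bytes + EXAMPLE_REG_SIZE - 1) /
           EXAMPLE_REG_SIZE;
}

/* The bug pattern: scaling a byte offset by the type size a second time
 * overestimates how far the read extends. */
static unsigned regs_spanned_buggy(unsigned subnr_bytes, unsigned type_sz,
                                   unsigned length_bytes)
{
    return (subnr_bytes * type_sz + length_bytes + EXAMPLE_REG_SIZE - 1) /
           EXAMPLE_REG_SIZE;
}
```

A 16-byte read at byte offset 16 stays within one 32-byte register, but multiplying the offset by a 4-byte type size makes it look like the read reaches into a third register.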
* i965/fs: Don't force MASK_DISABLE on INDIRECT_MOV instructions | Jason Ekstrand | 2016-04-14 | 1 | -1/+0
    It should work fine without it and the visitor can set it if it wants.

    Reviewed-by: Kristian Høgsberg <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add support for doing MOV_INDIRECT on uniforms | Jason Ekstrand | 2016-04-14 | 1 | -1/+4
    Reviewed-by: Kristian Høgsberg <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>