path: root/src/mesa/drivers/dri
Columns: commit message, author, age (date), files changed, lines removed/added
* i965: Update the surface_format table for ETC formatsNanley Chery2016-01-271-11/+11
| Enable ETC support for BDW+. In Vulkan, an array lookup on surface_format[] is used to determine HW support for certain formats. In contrast, Mesa dynamically populates an array which reports this information.
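The array-lookup approach mentioned above can be sketched roughly as follows. This is a minimal illustration, not the driver's table: every name (`sketch_format`, `sketch_format_info`, `sketch_can_sample`, and so on), the "generation times 10" encoding, and the RGBA8 row are assumptions made for the example. Only the "ETC on BDW+" row mirrors the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical format IDs; not the driver's real enum. */
enum sketch_format {
   SKETCH_FMT_RGBA8,
   SKETCH_FMT_ETC2_RGB8,
   SKETCH_FMT_COUNT
};

/* Sentinel meaning "no generation supports this capability". */
#define SKETCH_GEN_NEVER 999

/* One row per format: the first hardware generation (times 10, an
 * assumed encoding) on which each capability becomes available. */
struct sketch_format_info {
   int sampling;
   int render_target;
};

/* Illustrative values only. The ETC row reflects "BDW+" (Gen8 -> 80). */
static const struct sketch_format_info sketch_formats[SKETCH_FMT_COUNT] = {
   [SKETCH_FMT_RGBA8]     = { 40, 40 },
   [SKETCH_FMT_ETC2_RGB8] = { 80, SKETCH_GEN_NEVER },
};

/* Static lookup: support is a pure function of (format, generation). */
static bool sketch_can_sample(enum sketch_format fmt, int gen_x10)
{
   assert((unsigned)fmt < (unsigned)SKETCH_FMT_COUNT);
   return gen_x10 >= sketch_formats[fmt].sampling;
}

static bool sketch_can_render(enum sketch_format fmt, int gen_x10)
{
   assert((unsigned)fmt < (unsigned)SKETCH_FMT_COUNT);
   return gen_x10 >= sketch_formats[fmt].render_target;
}
```

With a table like this, a caller that only knows the generation can answer "is this format usable?" without any per-context state, which is presumably why a stale table entry matters more for the lookup-based path than for one that fills in its own array at runtime.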
* i965: Fix SIN/COS precision problems.Kenneth Graunke2016-01-272-12/+40
| Signed-off-by: Kenneth Graunke <[email protected]>
* HACK/i965/surface_formats: Mark A4B4G4R4 as being supportedJason Ekstrand2016-01-261-1/+1
| The table has this marked as unsupported on all gens, but I don't really believe that given how early it is in the table. I've tested and it seems to work on Broadwell. The Bspec says that it should be renderable on SKL+ but alpha blending is questionable.
| Side note: We really need to audit the format table again.
* i965/skl: Utilize new 5th bit for gateway messagesBen Widawsky2016-01-261-1/+3
| Cc: Jordan Justen <[email protected]>
| Signed-off-by: Ben Widawsky <[email protected]>
* i965/fs_surface_builder: Mask signed integers after conversionJason Ekstrand2016-01-261-0/+18
|
* i965/compiler: Set nir_options.vertex_id_zero_basedJason Ekstrand2016-01-251-1/+2
|
* HACK/i965: Default to scalar GS on BDW+Jason Ekstrand2016-01-251-1/+1
|
* Merge remote-tracking branch 'mattst88/nir-lower-pack-unpack' into vulkanJason Ekstrand2016-01-2522-193/+475
|\
| * i965/gen7+: Use NIR for lowering of pack/unpack opcodes.Matt Turner2016-01-253-19/+29
| |
| * i965/vec4: Implement nir_op_pack_uvec2_to_uint.Matt Turner2016-01-251-0/+18
| | And mark nir_op_pack_uvec4_to_uint unreachable, since it's only produced by lowering pack[SU]norm4x8, which the vec4 backend does not need.
| * i965/fs: Implement support for extract_word.Matt Turner2016-01-255-0/+56
| | The vec4 backend will lower it.
| * glsl: Remove 2x16 half-precision pack/unpack opcodes.Matt Turner2016-01-251-3/+0
| | i965/fs was the only consumer, and we're now doing the lowering in NIR.
| * i965/fs: Switch from GLSL IR to NIR for un/packHalf2x16 lowering.Matt Turner2016-01-253-11/+7
| |
| * i965: Make separate nir_options for scalar/vector stages.Matt Turner2016-01-251-28/+33
| | We'll want to have different lowering options set for scalar/vector stages.
| * i965: Move brw_compiler_create() to new brw_compiler.c.Matt Turner2016-01-255-133/+161
| | A future patch will want to use designated initializers, which aren't available in C++, but this is C.
| * i965: Implement compute sampler state atom.Francisco Jerez2016-01-194-1/+24
| | Fixes a number of GLES31 CTS failures and hangs on various hardware:
| |
| |  ES31-CTS.texture_gather.plain-gather-depth-2d
| |  ES31-CTS.texture_gather.plain-gather-depth-2darray
| |  ES31-CTS.texture_gather.plain-gather-depth-cube
| |  ES31-CTS.texture_gather.offset-gather-depth-2d
| |  ES31-CTS.texture_gather.offset-gather-depth-2darray
| |  ES31-CTS.layout_binding.sampler2D_layout_binding_texture_ComputeShader
| |  ES31-CTS.layout_binding.sampler2DArray_layout_binding_texture_ComputeShader
| |  ES31-CTS.explicit_uniform_location.uniform-loc-types-samplers
| |  ES31-CTS.compute_shader.resources-texture
| |
| | Some of them were actually passing by luck on some generations even though we weren't uploading sampler state tables explicitly for the compute stage, most likely because they relied on the cached sampler state left from previous rendering to be close enough.
| |
| | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92589
| | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93312
| | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93325
| | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93407
| | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93725
| | Reported-by: Marta Lofstedt <[email protected]>
| | Reviewed-by: Marta Lofstedt <[email protected]>
| | Reviewed-by: Jordan Justen <[email protected]>
| * i965: Trigger CS state reemission when new sampler state is uploaded.Francisco Jerez2016-01-192-1/+2
| | This reuses the NEW_SAMPLER_STATE_TABLE state bit (currently only used on pre-Gen7 hardware) to signal that the sampler state tables have changed in order to make sure that the GPGPU interface descriptor is updated.
| | Reviewed-by: Marta Lofstedt <[email protected]>
| | Reviewed-by: Jordan Justen <[email protected]>
| * i965/vec4: Spaces around operators.Matt Turner2016-01-191-1/+1
| |
| * i965: Inform compiler of variable range to silence warning.Matt Turner2016-01-191-1/+2
| | Extends commit 6531ccb70 to silence the warning in release builds as well.
| | Reviewed-by: Ilia Mirkin <[email protected]>
| * i965: adding missing headers to the dist tarballEmil Velikov2016-01-181-0/+2
| | Signed-off-by: Emil Velikov <[email protected]>
| * i965/fs: Always set channel 2 of texture headers in some stagesJason Ekstrand2016-01-151-0/+8
| | In the vertex and fragment stages, the hardware is nice to us and leaves g0.2 zeroed out for us so we can use it for headers. However, in compute, geometry, and tessellation stages, the hardware is not so nice. In particular, for compute shaders on BDW, the hardware places some debug bits in 23:15. As it happens, bit 15 is interpreted by the sampler as the alpha channel mask. This means that if you use a texturing instruction with a header in a compute shader, you may randomly get the alpha channel disabled. Since channel masks affect the return length of the sampler message, this can lead the GPU to expect a different mlen to the one you specified in the shader and this, in turn, hangs your GPU.
| | Cc: "11.1" <[email protected]>
| | Reviewed-by: Jordan Justen <[email protected]>
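The failure mode described above can be illustrated with a small sketch. It assumes, beyond what the commit message states, that bits 15:12 of the header dword are the four per-channel disable masks (the message itself only confirms bit 15 as alpha). The function names are hypothetical, and the "fix" shown is just the general idea of not trusting g0.2 in stages where the hardware does not zero it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* From the commit message: the sampler reads bit 15 as the alpha
 * channel mask. */
#define SKETCH_ALPHA_MASK_BIT (1u << 15)

/* Number of channels the sampler would return for a given header
 * dword 2. Assumption: bits 12..15 each disable one channel. */
static unsigned sketch_returned_channels(uint32_t header_dw2)
{
   unsigned disabled = 0;
   for (unsigned bit = 12; bit <= 15; bit++)
      disabled += (header_dw2 >> bit) & 1u;
   return 4 - disabled;
}

/* Only reuse g0.2 when the hardware is known to zero it for this
 * stage; otherwise write an explicit zero into the header. */
static uint32_t sketch_header_dw2(uint32_t g0_2, bool hw_zeroes_g0_2)
{
   return hw_zeroes_g0_2 ? g0_2 : 0u;
}
```

A stray debug bit at position 15 makes `sketch_returned_channels` drop from 4 to 3, which is the mismatch between expected and actual return length that the message blames for the hang.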
| * i965/fs/generator: Take an actual shader stage rather than a stringJason Ekstrand2016-01-157-11/+14
| | Cc: "11.1" <[email protected]>
| | Reviewed-by: Jordan Justen <[email protected]>
| | Reviewed-by: Matt Turner <[email protected]>
| * i965/vec4: Use UW type for multiply into accumulator on GEN8+Jason Ekstrand2016-01-151-1/+5
| | BDW adds the following restriction: "When multiplying DW x DW, the dst cannot be accumulator."
| | Cc: "11.1,11.0" <[email protected]>
| | Reviewed-by: Matt Turner <[email protected]>
| * i965: Apply add_const_offset_to_base for vec4 VS inputs too.Kenneth Graunke2016-01-141-5/+5
| | This shouldn't hurt anything, and I'm about to introduce a pass that will want it.
| | Signed-off-by: Kenneth Graunke <[email protected]>
| | Reviewed-by: Matt Turner <[email protected]>
| * i965: Make add_const_offset_to_base() work at the shader level.Kenneth Graunke2016-01-141-17/+21
| | This makes it a pass, hiding the parameter structs and block callbacks so it's simpler to work with.
| | Signed-off-by: Kenneth Graunke <[email protected]>
| | Reviewed-by: Matt Turner <[email protected]>
| * i965: Make an is_scalar boolean in brw_compile_vs().Kenneth Graunke2016-01-141-5/+5
| | Shorter than compiler->scalar_stage[MESA_SHADER_VERTEX], which can help with line-wrapping.
| | Signed-off-by: Kenneth Graunke <[email protected]>
| | Reviewed-by: Matt Turner <[email protected]>
| * i965/gen7.5+: Disable resource streamer during GPGPU workloads.Francisco Jerez2016-01-143-1/+42
| | The RS and hardware binding tables are only supported on the 3D pipeline and can lead to corruption if left enabled during a GPGPU workload. Disable it when switching to the GPGPU (or media) pipeline and re-enable it when switching back to the 3D pipeline.
| | Reviewed-by: Matt Turner <[email protected]>
| | Reviewed-by: Abdiel Janulgue <[email protected]>
| * i965/gen7: Emit stall and dummy primitive draw after switching to the 3D pipeline.Francisco Jerez2016-01-141-0/+24
| | This hardware bug can supposedly lead to a hang on IVB and VLV.
| | Reviewed-by: Matt Turner <[email protected]>
| | Reviewed-by: Kenneth Graunke <[email protected]>
| * i965/gen4-5: Emit MI_FLUSH as required prior to switching pipelines.Francisco Jerez2016-01-141-0/+13
| | AFAIK brw_emit_select_pipeline() is only called once during context init on Gen4-5, at which point the pipeline is likely to be already idle so it may just happen to work by luck regardless of the MI_FLUSH.
| | Reviewed-by: Matt Turner <[email protected]>
| | Reviewed-by: Kenneth Graunke <[email protected]>
| * i965/gen6-7: Implement stall and flushes required prior to switching pipelines.Francisco Jerez2016-01-141-0/+37
| | Switching the current pipeline while it's not completely idle or the read and write caches aren't flushed can lead to corruption. Fixes misrendering of at least the following Khronos CTS test:
| |  ES31-CTS.shader_image_load_store.basic-allTargets-store-fs
| | The stall and flushes are no longer required on Gen8+.
| | v2: Emit PIPE_CONTROL with non-zero post-sync op before the write cache flush on SNB due to hardware bug. (Ken)
| | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93323
| | Reviewed-by: Matt Turner <[email protected]>
| | Reviewed-by: Kenneth Graunke <[email protected]>
| * i965/gen8+: Invalidate color calc state when switching to the GPGPU pipeline.Francisco Jerez2016-01-141-0/+20
| | This hardware bug can cause a hang on context restore while the current pipeline is set to GPGPU (BDWGFX HSD 1909593). In addition to clearing the valid bit, mark the CC state as dirty to make sure that the CC indirect state pointer is re-emitted when we switch back to the 3D pipeline.
| | Reviewed-by: Matt Turner <[email protected]>
| | Reviewed-by: Kenneth Graunke <[email protected]>
| * i965: Add state bit to trigger re-emission of color calculator state.Francisco Jerez2016-01-143-0/+4
| | This will be used on Gen8+ to make sure that the color calculator state pointers are re-emitted when switching back to the 3D pipeline after some GPGPU workload due to a hardware workaround. There are other state bits already defined that could be used to achieve the same effect but they all cause a ton of unrelated state to be re-emitted (e.g. BRW_NEW_STATE_BASE_ADDRESS), so just define a new one; state bits are cheap.
| | Reviewed-by: Matt Turner <[email protected]>
| | Reviewed-by: Kenneth Graunke <[email protected]>
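The trade-off argued above (a dedicated bit versus reusing a broad one) can be sketched with a toy dirty-bit scheme. The flag names, the atom list, and `sketch_count_reemitted` are all hypothetical and exist only for this example. The point is only that each piece of state lists the bits it listens to, so a narrow bit re-emits less.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical dirty bits; one bit per kind of change. */
#define SKETCH_NEW_STATE_BASE_ADDRESS (1ull << 0)
#define SKETCH_NEW_CC_STATE           (1ull << 1)

/* A piece of state and the set of bits that force it to be re-emitted. */
struct sketch_atom {
   const char *name;
   uint64_t dirty;
};

/* Illustrative atom list: several atoms depend on the broad bit,
 * only one on the narrow bit. */
static const struct sketch_atom sketch_atoms[] = {
   { "color_calc_state", SKETCH_NEW_CC_STATE | SKETCH_NEW_STATE_BASE_ADDRESS },
   { "sampler_state",    SKETCH_NEW_STATE_BASE_ADDRESS },
   { "binding_table",    SKETCH_NEW_STATE_BASE_ADDRESS },
};

/* Count how many atoms a given set of flagged bits would re-emit. */
static unsigned sketch_count_reemitted(const struct sketch_atom *atoms,
                                       unsigned num_atoms, uint64_t flagged)
{
   unsigned count = 0;
   for (unsigned i = 0; i < num_atoms; i++) {
      if (atoms[i].dirty & flagged)
         count++;
   }
   return count;
}
```

In this toy list, flagging the broad bit touches all three atoms while the dedicated bit touches one, which is the kind of saving the message is pointing at.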
* | i965/fs: Feel free to spill partial reads/writesJason Ekstrand2016-01-251-19/+2
| | Now that we properly handle write-masking, this should be safe.
* | i965/fs: Properly write-mask spillsJason Ekstrand2016-01-252-5/+11
| | For unspills (scratch reads), we can just set WE_all all the time because we always unspill into a new GRF. For spills, we have two options: If the instruction has a 32-bit-per-channel destination and "normal" regioning, then we just do a regular write and it will interleave channels from different control-flow paths properly. If, on the other hand, the regioning is non-normal, then we have to unspill, run the instruction, and spill afterwards. In this second case, we need to do the spill with WE_all.
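The two spill cases above reduce to a small decision, sketched here under assumptions: the struct fields and names are hypothetical, and the test for "normal" regioning is collapsed into a single boolean rather than derived from real register regions.

```c
#include <assert.h>
#include <stdbool.h>

/* How the scratch write for a spill is issued. */
enum sketch_spill_kind {
   SKETCH_SPILL_MASKED,  /* regular write, honours the execution mask */
   SKETCH_SPILL_WE_ALL   /* write every channel regardless of the mask */
};

/* Hypothetical summary of the instruction whose destination is spilled. */
struct sketch_inst {
   unsigned dst_bits_per_channel;
   bool normal_regioning;
};

struct sketch_spill_plan {
   bool unspill_first;            /* read the old value back before running */
   enum sketch_spill_kind spill;
};

static struct sketch_spill_plan sketch_plan_spill(const struct sketch_inst *inst)
{
   struct sketch_spill_plan plan;

   if (inst->dst_bits_per_channel == 32 && inst->normal_regioning) {
      /* Channels map one-to-one onto scratch, so a masked write
       * interleaves correctly with other control-flow paths. */
      plan.unspill_first = false;
      plan.spill = SKETCH_SPILL_MASKED;
   } else {
      /* Read-modify-write: fetch the old value, run the instruction,
       * then write the whole register back. */
      plan.unspill_first = true;
      plan.spill = SKETCH_SPILL_WE_ALL;
   }
   return plan;
}
```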
* | i965/nir: Properly flush denormals in nir_op_fquantize2f16Jason Ekstrand2016-01-222-10/+41
| |
* | i965/nir: Temporariliy disable mul+add fusionJason Ekstrand2016-01-221-1/+1
| | We don't want to do this in the long-run but it's needed for passing the NoContraction tests at the moment. Eventually, we want to plumb this through NIR properly.
* | vk: Fix indirect push constantsKristian Høgsberg Kristensen2016-01-211-4/+3
| | This currently sets the base and size of all push constants to the entire push constant block. The idea is that we'll use the base and size to eventually optimize the amount we actually push, but for now we don't do that.
* | Merge remote-tracking branch 'jekstrand/wip/i965-uniforms' into vulkanKristian Høgsberg Kristensen2016-01-2113-227/+303
|\ \
| * | i965/fs: Push small uniform arraysJason Ekstrand2015-12-141-23/+53
| | | Unfortunately, this also means that we need to use a slightly different algorithm for assign_constant_locations. The old algorithm worked based on the assumption that each read of a uniform value read exactly one float. If it encountered a MOV_INDIRECT, it would immediately bail and push the whole thing. Since we can now read ranges using MOV_INDIRECT, we need to be able to push a series of floats without breaking them up. To do this, we use an algorithm similar to the one in split_virtual_grfs.
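One way to read "push a series of floats without breaking them up" is sketched below. This is a guess at the shape of the idea, not the pass itself: the per-slot `contiguous` flags, the push budget, and the rule that a run which does not fit is skipped as a whole are all assumptions for illustration, and every name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_UNASSIGNED (-1)

/* Assign push locations to uniform slots without splitting runs.
 * contiguous[i] == true means slot i and slot i + 1 must stay together
 * (for example because one indirect read covers both).
 * Returns the number of slots pushed; push_loc[i] receives the push
 * location of slot i, or SKETCH_UNASSIGNED if it was not pushed. */
static unsigned sketch_assign_push(const bool *live, const bool *contiguous,
                                   unsigned num_slots, unsigned max_push,
                                   int *push_loc)
{
   unsigned num_push = 0;
   unsigned i = 0;

   while (i < num_slots) {
      /* Find the last slot of the run that starts at i. */
      unsigned end = i;
      while (end + 1 < num_slots && contiguous[end])
         end++;
      unsigned len = end - i + 1;

      /* A run is worth pushing only if something in it is used. */
      bool any_live = false;
      for (unsigned j = i; j <= end; j++)
         any_live = any_live || live[j];

      /* Either the whole run fits in the budget or none of it is pushed. */
      bool push = any_live && num_push + len <= max_push;
      for (unsigned j = i; j <= end; j++)
         push_loc[j] = push ? (int)num_push++ : SKETCH_UNASSIGNED;

      i = end + 1;
   }
   return num_push;
}
```

For five live slots where slots 1 to 3 form one run and the budget is three, the run is left out as a unit while the single slots on either side are still pushed.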
| * | i965/fs: Rename demote_pull_constants to lower_constant_loadsJason Ekstrand2015-12-142-3/+3
| | |
| * | i965/vec4: Get rid of the uniform_size arrayJason Ekstrand2015-12-145-31/+0
| | |
| * | i965/fs: Use UD type for offsets in VARYING_PULL_CONSTANT_LOADJason Ekstrand2015-12-141-1/+1
| | |
| * | i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constantsJason Ekstrand2015-12-144-51/+50
| | | This commit moves us to an instruction-based model rather than a register-based model for indirects. This is more accurate anyway as we have to emit instructions to resolve the reladdr. It's also a lot simpler because it gets rid of the recursive reladdr problem by design. One side-effect of this is that we need a whole new algorithm in move_uniform_array_access_to_pull_constants. This new algorithm is much more straightforward than the old one and is fairly similar to what we're already doing in the FS backend.
| * | i965/vec4: Inline get_pull_constant_offsetJason Ekstrand2015-12-142-25/+14
| | | It's not really doing enough anymore to justify a helper function.
| * | i965/fs: Get rid of the param_size arrayJason Ekstrand2015-12-144-15/+0
| | |
| * | i965/fs: Stop relying on param_size in assign_constant_locationsJason Ekstrand2015-12-141-27/+17
| | | Now that we have MOV_INDIRECT opcodes, we have all of the size information we need directly in the opcode. With a little restructuring of the algorithm used in assign_constant_locations we don't need param_size anymore. The big thing to watch out for now, however, is that you can have two ranges overlap where neither contains the other. In order to deal with this, we make the first pass just flag what needs pulling and defer assigning pull constant locations until later.
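The flag-first, assign-later structure described above can be sketched as two passes over the uniform slots. The range struct and all names are hypothetical, and the sketch assumes every slot touched by an indirect range is pulled, which is a simplification of whatever criteria the real pass applies.

```c
#include <assert.h>

#define SKETCH_NOT_PULLED (-1)
#define SKETCH_FLAGGED    (-2)

/* A range of uniform slots read by one indirect access. */
struct sketch_range {
   unsigned start;
   unsigned len;
};

/* Returns the number of pull locations assigned; pull_loc[i] receives
 * the location for slot i or SKETCH_NOT_PULLED. */
static unsigned sketch_assign_pull(const struct sketch_range *ranges,
                                   unsigned num_ranges, unsigned num_slots,
                                   int *pull_loc)
{
   for (unsigned i = 0; i < num_slots; i++)
      pull_loc[i] = SKETCH_NOT_PULLED;

   /* Pass 1: only flag. Overlapping ranges simply flag the same slot
    * twice, so no range has to contain the other. */
   for (unsigned r = 0; r < num_ranges; r++) {
      for (unsigned s = ranges[r].start;
           s < ranges[r].start + ranges[r].len && s < num_slots; s++)
         pull_loc[s] = SKETCH_FLAGGED;
   }

   /* Pass 2: hand out locations once every range has been seen. */
   unsigned num_pull = 0;
   for (unsigned i = 0; i < num_slots; i++) {
      if (pull_loc[i] == SKETCH_FLAGGED)
         pull_loc[i] = (int)num_pull++;
   }
   return num_pull;
}
```

Assigning locations inside the first pass would give the shared slots of two partially overlapping ranges a location before the second range is known. Deferring the assignment avoids that ordering problem.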
| * | i965/fs: Get rid of reladdrJason Ekstrand2015-12-142-10/+2
| | | We aren't using it anymore.
| * | i965/fs: Use MOV_INDIRECT for all indirect uniform loadsJason Ekstrand2015-12-142-39/+87
| | | Instead of using reladdr, this commit changes the FS backend to emit a MOV_INDIRECT whenever we need an indirect uniform load. We also have to rework some of the other bits of the backend to handle this new form of uniform load. The obvious change is that demote_pull_constants now acts more like a lowering pass when it hits a MOV_INDIRECT.
| * | i965/vec4: Add support for SHADER_OPCODE_MOV_INDIRECTJason Ekstrand2015-12-141-0/+45
| | |
| * | i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardwareJason Ekstrand2015-12-142-13/+42
| | | While we're at it, we also add support for the possibility that the indirect is, in fact, a constant. This shouldn't happen in the common case (if it does, that means NIR failed to constant-fold something), but it's possible so we should handle it.
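Handling a constant indirect presumably amounts to folding it into the base offset and emitting a plain move. The sketch below shows that idea only: the operand and result structs are hypothetical stand-ins, not the backend's register or instruction types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical indirect operand: either a known immediate or a value
 * that is only available at run time. */
struct sketch_operand {
   bool is_immediate;
   uint32_t imm;
};

/* Hypothetical result: a direct move from a fixed offset, or an
 * indirect move that still needs address arithmetic. */
struct sketch_mov {
   bool indirect;
   uint32_t byte_offset;
};

static struct sketch_mov sketch_lower_mov_indirect(uint32_t base_offset,
                                                   struct sketch_operand ind)
{
   struct sketch_mov mov;

   if (ind.is_immediate) {
      /* The "indirect" is a constant: fold it and use a direct move. */
      mov.indirect = false;
      mov.byte_offset = base_offset + ind.imm;
   } else {
      /* Genuinely dynamic: keep the indirect addressing path. */
      mov.indirect = true;
      mov.byte_offset = base_offset;
   }
   return mov;
}
```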