summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* i965: Only enable oMask output when there's a multisample FBO.Kenneth Graunke2016-04-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ARB_sample_shading specification says that setting gl_SampleMask bits to 0 means that the corresponding sample "should be considered uncovered for the purposes of multisample fragment operations (Section 4.1.3)." The OpenGL 4.4 specification, section 17.3.3 ("Multisample Fragment Operations") specifies: "No changes to the fragment alpha or coverage values are made at this step if MULTISAMPLE is disabled, or if the value of SAMPLE_BUFFERS is not one." oMask output alters coverage masks and can kill pixels. We need to disable it in the above case, which conveniently corresponds to key->multisample_fbo being false. Khronos bug #12188 also spells this out clearly: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=12188 Fixes two Piglit tests: tests/spec/arb_sample_shading/builtin-gl-sample-mask-simple 0 tests/spec/arb_sample_shading/builtin-gl-sample-mask 0 Fixes 21 ES3 conformance tests: ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_zero ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_0 ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_1 ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_2 ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_3 ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_7 ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_zero ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_3 ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_4 ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_5 ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_7 ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_zero ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_2 ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_3 ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_4 ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_6 ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_zero ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_0 ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_2 ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_5 ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_7 Fixes 9 dEQP-GLES31.functional.shaders.sample_variables tests: sample_mask.discard_half_per_pixel.default_framebuffer sample_mask.discard_half_per_pixel.singlesample_rbo sample_mask.discard_half_per_pixel.singlesample_texture sample_mask.discard_half_per_sample.default_framebuffer sample_mask.discard_half_per_sample.singlesample_rbo sample_mask.discard_half_per_sample.singlesample_texture sample_mask.discard_half_per_two_samples.default_framebuffer sample_mask.discard_half_per_two_samples.singlesample_rbo sample_mask.discard_half_per_two_samples.singlesample_texture Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Generalize wm_key->compute_sample_id to wm_key->multisample_fbo.Kenneth Graunke2016-04-204-8/+7
| | | | | | | | | | | | | | | | I'm going to need a key entry meaning "we have a multisample FBO, and multisampling is enabled" in an upcoming patch. This is basically wm_key->compute_sample_id, except that it also checks that the SAMPLE_ID system value is read. The only use of wm_key->compute_sample_id is in emit_sampleid_setup(), which is only called when handling the SAMPLE_ID system value. So we can just eliminate the check and generalize the field. v2: Also update the Vulkan driver. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Delete now dead persample_2x FS program key flag.Kenneth Graunke2016-04-203-8/+0
| | | | | | | | | | This was only used by the old gl_SampleID calculations. The new code doesn't need to handle 2x specially. v2: Delete it from the Vulkan driver, too. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Simplify gl_SampleID setup on Gen8+.Kenneth Graunke2016-04-201-5/+37
| | | | | | | | | | | | | | | | | | On Gen7+, the thread payload provides the sample ID - we can read it in two instructions, without any elaborate calculations. We don't even need a state dependency - this will properly produce zero in the non-MSAA case. Unfortunately, we need the state flag anyway, so we may as well continue to use it to produce a single MOV 0 instead of SHR/AND. For some reason, the sample ID field is always zero on Gen7/7.5, so we can't use this yet. However, it works fine on Gen8+. So, land the code and use it where it's working, and leave a TODO for later. v2: Fix register types in the comment (caught by Matt Turner!). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Flip key->compute_sample_id check.Kenneth Graunke2016-04-201-7/+7
| | | | | | | This just moves the simple case first. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* st/mesa: Use correct size for compute CAPs.Bas Nieuwenhuizen2016-04-211-2/+6
| | | | | | | | | | Some CAPs are stored as 64-bit value while Mesa stores the related constant as 32-bit value. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* i965: Properly handle integer types in opt_vector_float().Kenneth Graunke2016-04-201-4/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, opt_vector_float() always interpreted MOV sources as floating point, and always created a MOV with a F-type destination. This meant that we could mess up sequences of integer loads, such as: mov vgrf6.0.x:D, 0D mov vgrf6.0.y:D, 1D mov vgrf6.0.z:D, 2D mov vgrf6.0.w:D, 3D Here, integer 0/1/2/3 become approximately 0.0f, so we generated: mov vgrf6.0:F, [0F, 0F, 0F, 0F] which is clearly wrong. We can properly handle this by converting integer values to float (rather than bitcasting), and emitting a type converting MOV: mov vgrf6.0:D, [0F, 1F, 2F, 3F] To do this, see first see if the integer values (converted to float) are representable. If so, we use a D-type MOV. If not, we then try the floating point values and an F-type MOV. We make zero not impose type restrictions. This is important because 0D would imply a D-type MOV, but is often used in sequences such as MOV 0D, MOV 0x3f800000D, where we want to use an F-type MOV. Fixes about 54 dEQP-GLES2 failures with the vec4 VS backend. This recently became visible due to changes in opt_vector_float() which made it optimize more cases, but it was a pre-existing bug. Apparently it also manages to turn more integer loads into VFs, producing the following shader-db statistics on Haswell: total instructions in shared programs: 7084195 -> 7082191 (-0.03%) instructions in affected programs: 246027 -> 244023 (-0.81%) helped: 1937 total cycles in shared programs: 65669642 -> 65651968 (-0.03%) cycles in affected programs: 531064 -> 513390 (-3.33%) helped: 1177 v2: Handle the type of zero better. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Make opt_vector_float() only handle non-type-conversion MOVs.Kenneth Graunke2016-04-201-2/+5
| | | | | | | | | | | | | | | | We don't handle this properly - we'd have to perform the type conversion before trying to convert the value to a VF. While we could do that, it doesn't seem particularly useful - most vector loads should be consistently typed (all float or all integer). As a special case, we do allow type-converting MOVs of integer 0, as it's represented the same regardless of the type. I believe this case does actually come up. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Fold vectorize_mov() back into the one caller.Kenneth Graunke2016-04-202-28/+16
| | | | | | | | | | After the previous patch, this helper is only called in one place. So, just fold it back in - there are a lot of parameters here and not much code. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Rework opt_vector_float() control flow.Kenneth Graunke2016-04-201-27/+34
| | | | | | | | | | | | | This reworks opt_vector_float() so that there's only one place that flushes out any accumulated state and emits a VF. v2: Don't break the sequence for non-representable numbers - just skip recording their values. Only break it for non-MOVs or register changes. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* anv: s/anv_batch_emit_blk/anv_batch_emit/Jason Ekstrand2016-04-2011-138/+133
| | | | Acked-by: Kristian Høgsberg <[email protected]>
* anv: Remove the old emit macroJason Ekstrand2016-04-201-10/+0
| | | | Acked-by: Kristian Høgsberg <[email protected]>
* anv/gen7_pipeline: Use the new emit macroJason Ekstrand2016-04-201-113/+124
| | | | Acked-by: Kristian Høgsberg <[email protected]>
* anv/gen7_cmd_buffer: Use the new emit macroJason Ekstrand2016-04-201-52/+70
| | | | Acked-by: Kristian Høgsberg <[email protected]>
* anv/device: Use the new emit macroJason Ekstrand2016-04-202-12/+13
| | | | Acked-by: Kristian Høgsberg <[email protected]>
* anv/state: Use the new emit macroJason Ekstrand2016-04-201-77/+78
| | | | Acked-by: Kristian Høgsberg <[email protected]>
* anv/gen8_pipeline: Use the new emit macroJason Ekstrand2016-04-201-177/+191
| | | | Acked-by: Kristian Høgsberg <[email protected]>
* anv/genX_pipeline: Use the new emit macroJason Ekstrand2016-04-202-38/+45
| | | | Acked-by: Kristian Høgsberg <[email protected]>
* anv/gen8_cmd_buffer: Use the new emit macroJason Ekstrand2016-04-201-74/+87
| | | | Acked-by: Kristian Høgsberg <[email protected]>
* anv/cmd_buffer: Use the new emit macro for quariesJason Ekstrand2016-04-201-24/+32
| | | | Acked-by: Kristian Høgsberg <[email protected]>
* anv/cmd_buffer: Use the new emit macro for DRAWING_RECTANGLEJason Ekstrand2016-04-201-9/+10
| | | | Acked-by: Kristian Høgsberg <[email protected]>
* anv/cmd_buffer: Use the new emit macro for compute shader dispatchJason Ekstrand2016-04-201-52/+64
| | | | Acked-by: Kristian Høgsberg <[email protected]>
* anv/cmd_buffer: Use the new emit macro for 3DSTATE_CONSTANTJason Ekstrand2016-04-201-10/+11
| | | | Acked-by: Kristian Høgsberg <[email protected]>
* anv/cmd_buffer: Use the new emit macro for DEPTH/STENCIL_BUFFERJason Ekstrand2016-04-201-34/+42
| | | | Acked-by: Kristian Høgsberg <[email protected]>
* anv/cmd_buffer: Use the new emit macro for PIPE_CONTROL and STATE_BASE_ADDRESSJason Ekstrand2016-04-201-62/+76
| | | | Acked-by: Kristian Høgsberg <[email protected]>
* anv/cmd_buffer: Use the new emit macro for 3DPRIMITIVE commandsJason Ekstrand2016-04-201-24/+28
| | | | Acked-by: Kristian Høgsberg <[email protected]>
* anv: Add a new block-based batch emit macroJason Ekstrand2016-04-201-0/+9
| | | | | | | | | This new macro uses a for loop to create an actual code block in which to place the macro setup code. One advantage of this is that you syntatically use braces instead of parentheses. Another is that the code in the block doesn't even get executed if anv_batch_emit_dwords fails. Acked-by: Kristian Høgsberg <[email protected]>
* gk110/ir: make use of IMUL32I for all immediatesSamuel Pitoiset2016-04-201-1/+1
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.1 11.2" <[email protected]>
* gk110/ir: do not overwrite def value with zero for EXCH opsSamuel Pitoiset2016-04-201-6/+15
| | | | | | | | | | This is only valid for other atomic operations (including CAS). This fixes an invalid opcode error from dmesg. While we are it, make sure to initialize global addr to 0 for other atomic operations. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.1 11.2" <[email protected]>
* anv: fix build without Wayland platformMarcin Ślusarz2016-04-202-7/+5
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* anv: fix building on i686 with -mcpu=genericLaurent Carlier2016-04-201-1/+1
| | | | | | mcpu=generic doesn't enable sse2, and anvil definitly needs it Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: Trivially handle the NonWriteable decorationJason Ekstrand2016-04-202-0/+4
| | | | Signed-off-by: Jason Ekstrand <[email protected]>
* nir: rename nir_foreach_block*() to nir_foreach_block*_call()Connor Abbott2016-04-2059-89/+92
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nvc0: avoid tex read fault from compute shaders on GK110Samuel Pitoiset2016-04-201-0/+3
| | | | | | | | | | | | After some investigation, it seems like that disabling the UNK02C4 command avoid a read fault with texelFetch() from a compute shader. I have no clue on what this method actually does, but this avoid the GPU to hang with basic-texelFetch.shader_test without introducing any compute-related regressions. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* i965/vec4: Always split uniforms in array_access_to_pull_constantsJason Ekstrand2016-04-201-1/+3
| | | | | | | | | | | | | | | | Normally, we split uniforms at the end but in Vulkan, we bail because we don't want pull constants. However, we still need them split because pack_uniforms relies on it. I really don't like this patch not because it doesn't work (it does) but because now that we're using MOV_INDIRECT, uniform numbers and sizes don't really matter anymore. In the FS backend, uniform splitting and packing is handled all at once (actual re-assignment of locations happens later) and we really should do it that way in vec4 eventually as well. Reviewed-by: Iago Toral Quiroga <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
* i965/vec4: Use the correct offset for the swizzle shift in push constantsJason Ekstrand2016-04-201-1/+1
| | | | | | | | | This was actually caught by Ken in review the first time around but somehow didn't get fixed before the patches were pushed. :-( Reviewed-by: Iago Toral Quiroga <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
* i965/vec4: Use nir_intrinsic_base in the load_uniform implementationJason Ekstrand2016-04-201-1/+1
| | | | | | | | We shouldn't be reading the const_index directly Reviewed-by: Iago Toral Quiroga <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
* anv/apply_dynamic_offsets: Provide a range on the load_uniformJason Ekstrand2016-04-201-1/+3
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
* anv/lower_push_constants: Stop treating scalar speciallyJason Ekstrand2016-04-203-28/+4
| | | | | | | | | | All of the code that did something special based on vec4 vs. scalar is bogus. In the backend, everything is now in units of bytes and the vec4 backend can handle full std140 packing so we don't need to do anything special anymore. Signed-off-by: Jason Ekstrand <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
* swr: fix resource backed constant buffersTim Rowley2016-04-202-7/+7
| | | | | | | | | | Code was using an incorrect address for the base pointer. v2: use swr_resource_data() utility function. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94979 Reviewed-by: Bruce Cherniak <[email protected]> Tested-by: Markus Wick <[email protected]>
* nouveau: codegen: Add support for OpenCL global memory buffersHans de Goede2016-04-201-2/+10
| | | | | | | | | | | | | | | Add support for OpenCL global memory buffers, note this has only been tested with regular load and stores and likely needs more work for e.g. atomic ops. Tested with piglet on a gf119 and a gk107: ./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader [9/9] pass: 9 / ./piglit run -o shader -t '.*arb_compute_shader.*' results/shader [20/20] skip: 4, pass: 16 | Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nouveau: codegen: Use FILE_MEMORY_BUFFER for buffersHans de Goede2016-04-206-5/+13
| | | | | | | | | | | | | | | | | | | | | | Some of the lowering steps we currently do for FILE_MEMORY_GLOBAL only apply to buffers, making it impossible to use FILE_MEMORY_GLOBAL for OpenCL global buffers. This commits changes the buffer code to use FILE_MEMORY_BUFFER at the ir_from_tgsi and lowering steps, freeing use of FILE_MEMORY_GLOBAL for use with OpenCL global buffers. Note that after lowering buffer accesses use the FILE_MEMORY_GLOBAL register file. Tested with piglet on a gf119 and a gk107: ./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader [9/9] pass: 9 / ./piglit run -o shader -t '.*arb_compute_shader.*' results/shader [20/20] skip: 4, pass: 16 | Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* scons: Build dri_common_interop.c.Jose Fonseca2016-04-201-0/+1
|
* st/dri: implement the GL interop DRI extension (v2.2)Marek Olšák2016-04-201-0/+258
| | | | | | | | | | v2: - set interop_version - simplify the offset_after macro v2.1: - use version numbers, remove offset_after - set "out_driver_data_written" v2.2: - set buf_offset & buf_size for GL_ARRAY_BUFFER too - add whandle.offset to buf_offset - disable the minmax cache for GL_TEXTURE_BUFFER
* glx: implement GLX part of interop interface (v2)Marek Olšák2016-04-208-6/+192
| | | | v2: - use const
* egl: implement EGL part of interop interface (v2)Marek Olšák2016-04-204-0/+114
| | | | v2: - use const
* st/dri: Fix RGB565 EGLImage creationNicolas Dufresne2016-04-201-20/+24
| | | | | | | | | | When creating egl images we do a bytes to pixel conversion by deviding by 4 regardless of the pixel format. This does not work for RGB565. In this patch, we avoid useless conversion and use proper API when the conversion cannot be avoided. Signed-off-by: Nicolas Dufresne <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* st/dri: Factor out DRI2 to PIPE_FORMAT conversionNicolas Dufresne2016-04-201-34/+27
| | | | | | | | This code is already duplicated twice and will be useful again. This will also help when adding formats. Signed-off-by: Nicolas Dufresne <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* freedreno/a4xx: lower srgb in shader for astc texturesRob Clark2016-04-197-6/+62
| | | | | | | | | | | | | | This *seems* like a hw bug, and maybe only applies to certain a4xx variants/revisions. But setting the SRGB bit in sampler view state (texconst0) causes invalid alpha for ASTC textures. Work around this by doing the srgb->linear conversion in the shader instead. This fixes 392 dEQP tests: dEQP-GLES3.functional.texture.*astc*srgb* (The remaining fails seem to be a bug w/ ASTC + linear filtering, also possibly a420.0 specific.) Signed-off-by: Rob Clark <[email protected]>
* nir/lower-tex: add srgb->linear loweringRob Clark2016-04-192-0/+53
| | | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>