aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* st/mesa: create temporary textures with the same nr_samples as sourceIlia Mirkin2015-10-291-2/+6
| | | | | | | | Not sure if this is actually reachable in practice (to have a complex copy with MS textures). Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* glsl: add fragdata arrays to program resource listTapani Pälli2015-10-291-0/+29
| | | | | | | | | | | | | | | This makes sure that user is still able to query properties about variables that have gotten removed by opt_dead_builtin_varyings pass. Fixes following OpenGL ES 3.1 test: ES31-CTS.program_interface_query.output-layout No Piglit regressions. v2: cleanup, drop extra parenthesis (Topi) Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Marta Lofstedt <[email protected]>
* mesa: add fragdata_arrays list to gl_shaderTapani Pälli2015-10-292-16/+27
| | | | | | | | | | | | | This is required to store information about fragdata arrays, currently these variables get lost and cannot be retrieved later in sensible way for program interface queries. List will be utilized by next patch. Patch also modifies opt_dead_builtin_varyings pass to build list when lowering fragdata arrays. This is identical approach as taken with packed varyings pass. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Marta Lofstedt <[email protected]>
* glsl: fix GL_BUFFER_DATA_SIZE value for shader storage blocks with unsize arraysSamuel Iglesias Gonsalvez2015-10-291-3/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | From ARB_program_interface_query: "For the property of BUFFER_DATA_SIZE, then the implementation-dependent minimum total buffer object size, in basic machine units, required to hold all active variables associated with an active uniform block, shader storage block, or atomic counter buffer is written to <params>. If the final member of an active shader storage block is array with no declared size, the minimum buffer size is computed assuming the array was declared as an array with one element." Fixes the following dEQP-GLES31 tests: dEQP-GLES31.functional.program_interface_query.shader_storage_block.buffer_data_size.named_block dEQP-GLES31.functional.program_interface_query.shader_storage_block.buffer_data_size.unnamed_block dEQP-GLES31.functional.program_interface_query.shader_storage_block.buffer_data_size.block_array v2: - Fix comment's indentation and explain that the parser already checked that unsized array is in last element of a shader storage block (Iago). - Add assert (Iago). Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* docs: Mark GL_ARB_fragment_layer_viewport as done on i965.Kenneth Graunke2015-10-281-1/+1
|
* i965: Implement ARB_fragment_layer_viewport.Kenneth Graunke2015-10-284-1/+46
| | | | | | | | | | | | | | Normally, we could read gl_Layer from bits 26:16 of R0.0. However, the specification requires that bogus out-of-range 32-bit values written by previous stages need to appear in the fragment shader as-written. Instead, we pass in the full 32-bit value from the VUE header as an extra flat-shaded varying. We have the SF override the value to 0 when the previous stage didn't actually write a value (it's actually defined to return 0). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Make calculate_attr_overrides return the URB read offset.Kenneth Graunke2015-10-284-10/+17
| | | | | | | | | | | | Traditionally, we've hardcoded "URB Entry Read Offset" to 1 (which represents 2 vec4 varying slots) to skip over the 8 DWord VUE header. In order to support ARB_fragment_layer_viewport, we'll need to read from that header. This patch adds the basic plumbing necessary to calculate a value dynamically and hook it up in the SBE packets. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* glsl: Mark gl_ViewportIndex and gl_Layer varyings as flat.Kenneth Graunke2015-10-281-12/+25
| | | | | | | | | Integer varyings need to be flat qualified - all others were already. I think we just missed this. Presumably some hardware passes this via sideband and ignores attribute interpolation, so no one has noticed. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Properly check for PAD in fragment shaders with > 16 varyings.Kenneth Graunke2015-10-281-4/+1
| | | | | | | | | | | | | | Commit 268008f98c3810b9f276df985dc93efc0c49f33e changed unused VUE map slots to be initialized with BRW_VARYING_SLOT_PAD, not COUNT. I missed updating this. It also means that commit message was wrong, as some code *did* rely slots being initialized to COUNT. This may fix a bug with SSO programs with > 16 FS input varyings. I think we probably just emitted extra pointless code, but probably didn't break anything. We might also just have no tests for that. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Update stale comment about unused VUE map slots.Kenneth Graunke2015-10-281-3/+1
| | | | | | | I changed this from COUNT to PAD in commit 268008f98c3810b9f276df985dc93ef. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* nv50/ir: adapt to new method for passing in cull/clip distance masksIlia Mirkin2015-10-294-14/+14
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: share shaders between contexts and build immediatelyIlia Mirkin2015-10-293-1/+7
| | | | | | | Avoid deferring building shaders until draw time, should hopefully reduce any stuttering, as well as enable shader-db style analysis. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: do upload-time fixups for interpolation parametersIlia Mirkin2015-10-2915-19/+239
| | | | | | | | | | | | | | | | | Unfortunately flatshading is an all-or-nothing proposition on nvc0, while GL 3.0 calls for the ability to selectively specify explicit interpolation parameters on gl_Color/gl_SecondaryColor which would override the flatshading setting. This allows us to fix up the interpolation settings after shader generation based on rasterizer settings. While we're at it, we can add support for dynamically forcing all (non-flat) shader inputs to be interpolated per-sample, which allows st/mesa to not generate variants for these. Fixes the remaining failing glsl-1.30/execution/interpolation piglits. Signed-off-by: Ilia Mirkin <[email protected]>
* nir: Copy "patch" flag from ir_variable to nir_variable.Kenneth Graunke2015-10-283-2/+5
| | | | | | | This was introduced in GLSL IR after NIR development had branched. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Add intrinsics for tessellation shader system values.Kenneth Graunke2015-10-282-7/+14
| | | | | | | | | | | | | | | | nir_intrinsic_load_patch_vertices_in corresponds to gl_PatchVerticesIn, a special input in both the TCS and TES stages. nir_intrinsic_load_tess_coord corresponds to gl_TessCoord, a special tessellation evaluation shader input. nir_intrinsic_load_tess_level_outer/inner correspond to the gl_TessLevelOuter[] and gl_TessLevelInner[] evaluation shader inputs, which we treat as system values because they're stored specially. (These intrinsics are only for the TES - the TCS uses output variables.) Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Fix missing BRW_NEW_*_PROG_DATA flagging caused by cache reuse.Kenneth Graunke2015-10-282-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Consider the case of two nearly identical GLSL fragment shaders: out vec4 color; void main() { color = vec4(1); } and layout(early_fragment_tests) in; out vec4 color; void main() { color = vec4(1); } These shaders compile to the exact same assembly, but have distinct values for brw_wm_prog_data::early_fragment_tests. Since these are two independent GLSL shaders, they have different program keys - notably, brw_wm_prog_key::program_string_id differs. When uploading the second, brw_upload_cache will find an existing copy of the assembly in the cache BO, which means matching_data will be non-NULL. Although we create a second cache item (with the new key and prog_data), we set item->offset to the existing copy and avoid re-uploading duplicate assembly. However, brw_search_cache() would only flag BRW_NEW_*_PROG_DATA if item->offset differed from the supplied offset. With reuse, both programs have the same offset, but prog_data changed. We have to flag it, but failed to. To fix this, we simply need to check if the aux (prog_data) pointer changed. If either the assembly or the prog_data differs, flag it. This fixes a regression since 1bba29ed403e735ba0bf04ed8aa2e571884f, where Topi fixed brw_upload_cache() to actually reuse identical assembly. Prior to that, reuse basically never happened due to bugs. Unfortunately, this code apparently wasn't prepared to handle reuse! Fixes GPU hangs in Dolphin on Broadwell. Huge thanks to Pierre Bourdon and Ilia Mirkin for debugging this and helping track down the real issue. Cc: "11.0" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92623 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Tested-by: Pierre Bourdon <[email protected]>
* clover: fix building fix clang-3.8Laurent Carlier2015-10-291-1/+5
| | | | | | | | | https://bugs.freedesktop.org/show_bug.cgi?id=92705 v2.1: use Linker::Flags::None instead of 0 and emplace_back() Signed-off-by: Laurent Carlier <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* nv50: add ARB_copy_image supportIlia Mirkin2015-10-283-8/+12
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: add ARB_copy_image supportIlia Mirkin2015-10-283-8/+12
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: fix crash when nv50_miptree_from_handle failsJulien Isorce2015-10-281-1/+2
| | | | | Signed-off-by: Julien Isorce <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* vbo: replace assertion with conditional in vbo_compute_max_verts()Brian Paul2015-10-281-1/+2
| | | | | | | | | With just the right sequence of per-vertex commands and state changes, it's possible for this assertion to fail (such as with viewperf11's lightwave-06-1 test). Instead of asserting, return 0 so that the caller knows the VBO is full and needs to be flushed. Reviewed-by: Charmaine Lee <[email protected]>
* mesa: minor formatting fix in get_tex_rgba_compressed()Brian Paul2015-10-281-2/+1
|
* st/mesa: implement ARB_copy_imageMarek Olšák2015-10-288-52/+618
| | | | | | I wonder if the craziness was worth it. Reviewed-by: Brian Paul <[email protected]>
* gallium: add PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATSMarek Olšák2015-10-2815-1/+17
| | | | | | For ARB_copy_image. Reviewed-by: Brian Paul <[email protected]>
* radeonsi: allow copying between compatible compressed and uncompressed formatsMarek Olšák2015-10-281-1/+1
| | | | | | | | which is where a block in src maps to a pixel in dst and vice versa. e.g. DXT1 <-> R32G32_UINT DXT5 <-> R32G32B32A32_UINT Reviewed-by: Michel Dänzer <[email protected]>
* mesa: set TargetIndex in VDPAURegister*SurfaceNV (v2)Marek Olšák2015-10-281-2/+3
| | | | | | | | | | | We initialized Target, but not TargetIndex. This is required since 7d7dd1871174905dfdd3ca874a09d9. v2: do it in the right place. Noticed by Brian Paul. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92645 Reviewed-by: Brian Paul <[email protected]>
* i965: remove unneeded src_reg copy in emit_shader_time_writeEmil Velikov2015-10-281-1/+1
| | | | | | | | The variable is already of type src_reg. creating a new instance only to destroy it seems unnecessary. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: remove cache_aux_free_func arrayEmil Velikov2015-10-282-12/+5
| | | | | | | | | | | | There is only one function that can be called, which is well known at compilation time. The abstraction used here seems unnecessary, so let's use a direct call to brw_stage_prog_data_free() when appropriate, cut down the size of struct brw_cache. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* main: fix GL_MAX_NUM_ACTIVE_VARIABLES value for shader storage blocksSamuel Iglesias Gonsalvez2015-10-281-1/+20
| | | | | | | | | | | | | | | | The maximum number of active variables for shader storage blocks should take into account the specific rules for shader storage blocks, i.e. for an active shader storage block member declared as an array, an entry will be generated only for the first array element, regardless of its type. Fixes 3 dEQP-GLES31.functional.* tests: dEQP-GLES31.functional.program_interface_query.shader_storage_block.active_variables.named_block dEQP-GLES31.functional.program_interface_query.shader_storage_block.active_variables.unnamed_block dEQP-GLES31.functional.program_interface_query.shader_storage_block.active_variables.block_array Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* st/vdpau: disable RefPicList for Vdpau HEVCBoyuan Zhang2015-10-271-0/+1
| | | | | | Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* st/va: add VAAPI HEVC decode supportBoyuan Zhang2015-10-275-2/+209
| | | | | | Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* radeon/uvd: implement and add flag for VAAPI HEVC decodeBoyuan Zhang2015-10-272-0/+16
| | | | | | Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* vl: add RefPicList defines for VAAPI HEVC decodeBoyuan Zhang2015-10-271-0/+2
| | | | | | Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* mesa: Draw indirect is not allowed if the default VAO is bound.Marta Lofstedt2015-10-271-0/+12
| | | | | | | | | | | From OpenGL ES 3.1 specification, section 10.5: "DrawArraysIndirect requires that all data sourced for the command, including the DrawArraysIndirectCommand structure, be in buffer objects, and may not be called when the default vertex array object is bound." Signed-off-by: Marta Lofstedt <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* winsys/amdgpu: remove the dcc_enable surface flagMarek Olšák2015-10-273-10/+7
| | | | | | dcc_size is sufficient and doesn't need a further comment in my opinion. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add debug flags that disable DCC and DCC fast clearMarek Olšák2015-10-273-0/+10
| | | | | | | For debugging, bug reports, etc. This is not in the radeonsi directory, but it is about radeonsi. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: properly check if DCC is enabled and allocatedMarek Olšák2015-10-275-8/+8
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: simplify DCC handling in si_initialize_color_surfaceMarek Olšák2015-10-271-7/+3
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: Draw indirect is not allowed when xfb is active and unpausedMarta Lofstedt2015-10-271-0/+9
| | | | | | | | | | OpenGL ES 3.1 specification, section 10.5: "An INVALID_OPERATION error is generated if transform feedback is active and not paused." Signed-off-by: Marta Lofstedt <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* mesa: Draw Indirect return wrong error code on unalingedMarta Lofstedt2015-10-271-4/+6
| | | | | | | | | | | | | | | | | | | | From OpenGL 4.4 specification, section 10.4 and Open GL Es 3.1 section 10.5: "An INVALID_VALUE error is generated if indirect is not a multiple of the size, in basic machine units, of uint." However, the current code follow the ARB_draw_indirect: https://www.opengl.org/registry/specs/ARB/draw_indirect.txt "INVALID_OPERATION is generated by DrawArraysIndirect and DrawElementsIndirect if commands source data beyond the end of a buffer object or if <indirect> is not word aligned." V2: After discussions on the list, it was suggested to only keep the INVALID_VALUE error. Signed-off-by: Marta Lofstedt <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* main: Remove interface block array index for doing the name comparisonSamuel Iglesias Gonsalvez2015-10-271-1/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From ARB_program_query_interface spec: "uint GetProgramResourceIndex(uint program, enum programInterface, const char *name); [...] If <name> exactly matches the name string of one of the active resources for <programInterface>, the index of the matched resource is returned. Additionally, if <name> would exactly match the name string of an active resource if "[0]" were appended to <name>, the index of the matched resource is returned. [...]" "A string provided to GetProgramResourceLocation or GetProgramResourceLocationIndex is considered to match an active variable if: [...] * if the string identifies the base name of an active array, where the string would exactly match the name of the variable if the suffix "[0]" were appended to the string; [...] " Fixes the following two dEQP-GLES31 tests: dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array_single_element v2: - Add AoA support (Timothy) - Apply it too for GetUniformLocation(), GetUniformName() and others because ARB_program_interface_query says that they are equivalent to GetProgramResourceLocation() and GetProgramResourceName() (Tapani) Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* vc4: Add support for copy propagation with unpack flags present.Eric Anholt2015-10-262-36/+109
| | | | | total instructions in shared programs: 89251 -> 87862 (-1.56%) instructions in affected programs: 52971 -> 51582 (-2.62%)
* vc4: Rewrite the pack instructions as a MOV with a dst pack flagEric Anholt2015-10-263-37/+18
| | | | Another step in reducing the special-casing of instructions.
* vc4: Move dst pack setup out to a helper function with more asserts.Eric Anholt2015-10-261-10/+22
|
* vc4: Switch the unpack ops to being unpack flags on a mov.Eric Anholt2015-10-266-123/+42
| | | | | | | | | | | | This paves the way for copy propagating our unpacks. We end up with a small change on shader-db: total instructions in shared programs: 89390 -> 89251 (-0.16%) instructions in affected programs: 19041 -> 18902 (-0.73%) which appears to be because we no longer convert MOVs for an FMAX dst, r4.unpack, r4.unpack (instead of the previous MOV dst, r4.unpack), and this ends up with a slightly better schedule.
* vc4: Drop some confused code about pack/unpack handling.Eric Anholt2015-10-261-23/+4
| | | | | | | | | At one point I thought packs and unpacks were in the same field of the instruction. They aren't. These instructions therefore never cause a pack. total instructions in shared programs: 89472 -> 89390 (-0.09%) instructions in affected programs: 15261 -> 15179 (-0.54%)
* vc4: Reduce MOV special-casing in QIR-to-QPU.Eric Anholt2015-10-261-8/+11
| | | | | I'm going to introduce some more types of MOV, which also want the elision of raw MOVs.
* vc4: Fix up the test for whether the unpack can be from r4.Eric Anholt2015-10-263-8/+27
| | | | We can do 16a/16b from float as well. No difference on shader-db.
* vc4: Don't try to follow MOVs across a pack.Eric Anholt2015-10-261-1/+2
|
* vc4: Only copy propagate raw MOVs.Eric Anholt2015-10-261-6/+1
| | | | No problems being fixed, but needed for the new unpack changes.