Commit message | Author | Age | Files | Lines
* nir/intrinsics: Add more atomic_counter opsIan Romanick2016-10-043-5/+110
  v2: Delete some stray debug code noticed by Iago.

  v3: Massive rebase on new ir_function_signature::intrinsic_id mechanism.

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]> [v1]
  Acked-by: Ilia Mirkin <[email protected]>
* nir/intrinsics: Include atomic_counter_ in the names used in macro invocationsIan Romanick2016-10-041-5/+5
| | | | | | | | | Otherwise grepping for where atomic_counter_inc and friends are defined is a very frustrating experience. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* glsl: Kill __intrinsic_atomic_subIan Romanick2016-10-043-19/+46
  Just generate an __intrinsic_atomic_add with a negated parameter.

  Some background on the non-obvious reasons for the big change to builtin_builder::call()... this is cribbed from some discussion with Ilia on mesa-dev.

  Why change builtin_builder::call() to allow taking dereferences and create them here rather than just feeding in the ir_variables directly?

  The problem is that the neg_data ir_variable node would have to be in two lists at the same time: the instruction stream and parameters. The ir_variable node is automatically added to the instruction stream by the call to make_temp. Restructuring the code so that the ir_variables could be in parameters and then moved to the instruction stream would have been pretty terrible.

  ir_call in the instruction stream has an exec_list that contains ir_dereference_variable nodes.

  The builtin_builder::call method previously took an exec_list of ir_variables and created a list of ir_dereference_variable. All of the original users of that method wanted to make a function call using exactly the set of parameters passed to the built-in function (i.e., call __intrinsic_atomic_add using the parameters to atomicAdd). For these users, the list of ir_variables already existed: the list of parameters in the built-in function signature.

  This new caller doesn't do that. It wants to call a function with a parameter from the function and a value calculated in the function. So, I changed builtin_builder::call to take a list that could either be a list of ir_variable or a list of ir_dereference_variable. In the former case it behaves just as it previously did. In the latter case, it uses (and removes from the input list) the ir_dereference_variable nodes instead of creating new ones.

     text    data     bss     dec     hex filename
  6036395  283160   28608 6348163  60dd83 lib64/i965_dri.so before
  6036923  283160   28608 6348691  60df93 lib64/i965_dri.so after

  Signed-off-by: Ian Romanick <[email protected]>
  Acked-by: Ilia Mirkin <[email protected]>
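The "add with a negated parameter" idea from the commit above can be sketched in plain C. This is only an illustration using C11 atomics, not the GLSL IR code the commit touches; `counter_sub` is a hypothetical name, and the sketch assumes unsigned 32-bit counters where negation wraps modulo 2^32.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: subtract `data` from an atomic counter using only
 * an atomic add. Returns the value the counter held before the update. */
uint32_t counter_sub(_Atomic uint32_t *counter, uint32_t data)
{
    /* Unsigned negation is well defined: 0 - data wraps modulo 2^32. */
    const uint32_t neg_data = 0u - data;

    /* Adding the negated value has the same effect as subtracting. */
    return atomic_fetch_add(counter, neg_data);
}
```

Because unsigned arithmetic wraps, the same add covers the underflow case: subtracting 8 from 7 leaves 0xFFFFFFFF, exactly as a native subtract would.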
* glsl: Remove ir_function_signature::_is_intrinsic fieldIan Romanick2016-10-046-17/+5
     text    data     bss     dec     hex filename
  6036491  283160   28608 6348259  60dde3 lib64/i965_dri.so before
  6036395  283160   28608 6348163  60dd83 lib64/i965_dri.so after

  Signed-off-by: Ian Romanick <[email protected]>
  Acked-by: Ilia Mirkin <[email protected]>
* glsl: Add ir_function_signature::is_intrinsic() methodIan Romanick2016-10-047-16/+22
  This necessitated renaming the is_intrinsic field to _is_intrinsic. The next commit will remove the field.

     text    data     bss     dec     hex filename
  6036507  283160   28608 6348275  60ddf3 lib64/i965_dri.so before
  6036491  283160   28608 6348259  60dde3 lib64/i965_dri.so after

  Signed-off-by: Ian Romanick <[email protected]>
  Acked-by: Ilia Mirkin <[email protected]>
* glsl: Use the ir_intrinsic_* enums instead of the __intrinsic_* name stringsIan Romanick2016-10-044-219/+276
     text    data     bss     dec     hex filename
  6038043  283160   28608 6349811  60e3f3 lib64/i965_dri.so before
  6036507  283160   28608 6348275  60ddf3 lib64/i965_dri.so after

  v2: s/ir_intrinsic_atomic_sub/ir_intrinsic_atomic_counter_sub/. Noticed by Ilia.

  v3: Silence unhandled enum in switch warnings in st_glsl_to_tgsi.

  Signed-off-by: Ian Romanick <[email protected]>
  Acked-by: Ilia Mirkin <[email protected]>
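The move from name strings to enum IDs can be pictured with a small C sketch. Everything here is hypothetical (the enum values, the struct, and the helper names are illustrative stand-ins, not the ir_function_signature class), and using a zero "invalid" value as the non-intrinsic marker is an assumption about one reasonable design.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical IDs; a real enum would list every intrinsic. */
enum intrinsic_id {
    INTRINSIC_INVALID = 0,          /* ordinary function, not an intrinsic */
    INTRINSIC_ATOMIC_COUNTER_ADD,
    INTRINSIC_ATOMIC_COUNTER_SUB
};

/* Hypothetical signature record carrying both the name and the ID. */
struct function_signature {
    const char *name;
    enum intrinsic_id id;
};

/* String-based check: one strcmp per candidate name. */
bool is_atomic_add_by_name(const struct function_signature *sig)
{
    return strcmp(sig->name, "__intrinsic_atomic_add") == 0;
}

/* Enum-based checks: a single integer comparison each. */
bool is_atomic_add_by_id(const struct function_signature *sig)
{
    return sig->id == INTRINSIC_ATOMIC_COUNTER_ADD;
}

bool is_intrinsic(const struct function_signature *sig)
{
    /* No separate boolean field is needed once the ID exists. */
    return sig->id != INTRINSIC_INVALID;
}
```

With an ID on every signature, a consumer can dispatch with a `switch` and let the compiler warn about unhandled enum values, which matches the kind of warning the v3 note mentions silencing.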
* glsl: Track a unique intrinsic ID with each intrinsic functionIan Romanick2016-10-047-73/+280
     text    data     bss     dec     hex filename
  6037483  283160   28608 6349251  60e1c3 lib64/i965_dri.so before
  6038043  283160   28608 6349811  60e3f3 lib64/i965_dri.so after

  Signed-off-by: Ian Romanick <[email protected]>
  Acked-by: Ilia Mirkin <[email protected]>
* glsl: Don't emit ir_binop_carry during ir_binop_imul_high loweringIan Romanick2016-10-041-5/+17
  st_glsl_to_tgsi only calls lower_instructions once (instead of in a loop), so the ir_binop_carry generated would not get lowered.

  Fixes assertion failure

  state_tracker/st_glsl_to_tgsi.cpp:2265: void glsl_to_tgsi_visitor::visit_expression(ir_expression*, st_src_reg*): Assertion `!"Invalid ir opcode in glsl_to_tgsi_visitor::visit()"' failed.

  on softpipe in 16 piglit tests:

  mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended-nonuniform.shader_test
  mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended-only-msb-nonuniform.shader_test
  mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended-only-msb.shader_test
  mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended.shader_test
  mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended-nonuniform.shader_test
  mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended-only-msb-nonuniform.shader_test
  mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended-only-msb.shader_test
  mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended.shader_test
  mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended-nonuniform.shader_test
  mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended-only-msb-nonuniform.shader_test
  mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended-only-msb.shader_test
  mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended.shader_test
  mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended-nonuniform.shader_test
  mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended-only-msb-nonuniform.shader_test
  mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended-only-msb.shader_test
  mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended.shader_test

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
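To see how a high-word multiply can avoid a dedicated carry operation, here is a rough C sketch. It is not the lowering pass from the commit: it handles only the unsigned case, and it assumes the carry-out of an add is recovered with the comparison `(x + y) < x`. The function names are hypothetical.

```c
#include <stdint.h>

/* Carry-out of a 32-bit unsigned add, written as a comparison instead of
 * a dedicated carry operation: the sum wraps exactly when it is < x. */
uint32_t carry_out(uint32_t x, uint32_t y)
{
    return (uint32_t)((uint32_t)(x + y) < x);
}

/* High 32 bits of the 64-bit product a * b, built from 16-bit halves. */
uint32_t umul_high(uint32_t a, uint32_t b)
{
    const uint32_t a_lo = a & 0xffffu, a_hi = a >> 16;
    const uint32_t b_lo = b & 0xffffu, b_hi = b >> 16;

    uint32_t lo = a_lo * b_lo;          /* contributes to bits 0..31   */
    const uint32_t t1 = a_lo * b_hi;    /* contributes to bits 16..47  */
    const uint32_t t2 = a_hi * b_lo;    /* contributes to bits 16..47  */
    uint32_t hi = a_hi * b_hi;          /* contributes to bits 32..63  */

    /* Fold each cross term into the (hi:lo) pair, tracking the carry. */
    const uint32_t s1 = t1 << 16;
    hi += (t1 >> 16) + carry_out(lo, s1);
    lo += s1;

    const uint32_t s2 = t2 << 16;
    hi += (t2 >> 16) + carry_out(lo, s2);

    return hi;
}
```

The comparison form only needs an add and a less-than, both of which a backend is far more likely to handle directly than a separate carry opcode.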
* i965: fix unused variable warning in brw_emit_gpgpu_walker()Timothy Arceri2016-10-051-2/+1
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: add MAYBE_UNUSED to assert paramTimothy Arceri2016-10-051-1/+1
| | | | | | Fixes unused variable warning in release build. Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: wrap unused function in #ifndef NDEBUGTimothy Arceri2016-10-051-0/+2
  This function is only ever used by an assert(). Wrapping it fixes an unused function warning in release builds.

  Reviewed-by: Iago Toral Quiroga <[email protected]>
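The pattern described above looks roughly like this in C. The helper and caller names are hypothetical; the point is only that a function referenced solely from assert() is compiled under the same condition as the assertion itself.

```c
#include <assert.h>
#include <stddef.h>

#ifndef NDEBUG
/* Only referenced from assert(), so it is compiled only when assertions
 * are enabled. In a release build (NDEBUG defined) it does not exist and
 * cannot trigger an unused function warning. */
static int is_sorted(const int *values, size_t count)
{
    for (size_t i = 1; i < count; i++) {
        if (values[i - 1] > values[i])
            return 0;
    }
    return 1;
}
#endif

/* Returns the smallest element of a non-empty, ascending array. */
int smallest(const int *values, size_t count)
{
    assert(count > 0 && is_sorted(values, count));
    return values[0];
}
```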
* i965: fix unused variable warning in gen7_block_read_scratch()Timothy Arceri2016-10-051-2/+1
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: add MAYBE_UNUSED to assert paramTimothy Arceri2016-10-051-1/+1
| | | | | | This fixes an unused variable warning on release builds. Reviewed-by: Iago Toral Quiroga <[email protected]>
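A variable that is read only inside assert() can be annotated instead of removed. The sketch below defines its own stand-in `MAYBE_UNUSED` macro on the assumption that it maps to the GCC/Clang `unused` attribute; the function name is hypothetical.

```c
#include <assert.h>

/* Local stand-in for the macro, assuming a GCC-compatible compiler. */
#if defined(__GNUC__)
#define MAYBE_UNUSED __attribute__((unused))
#else
#define MAYBE_UNUSED
#endif

/* Halves an even number. `remainder` is read only by the assert, so in a
 * release build (NDEBUG defined) it would otherwise be reported as unused. */
int checked_halve(int n)
{
    MAYBE_UNUSED const int remainder = n % 2;

    assert(remainder == 0);
    return n / 2;
}
```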
* gallivm: Use AVX2 gather intrinsics.Jose Fonseca2016-10-041-0/+95
| | | | | | v2: Use AVX2 gather for non aligned loads too. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Use 8 wide AoS sampling on AVX2.Roland Scheidegger2016-10-041-5/+6
| | | | | | | | | | v2: Make sure that with num_lods > 1 and min_filter != mag_filter we still enter the splitting path. So this case would still use 4-wide aos path (as a side note, the 4-wide aos sampling path could actually be improved quite a bit if we have avx2, by just doing the filtering with 256bit vectors). Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Basic AVX2 support.José Fonseca2016-10-044-28/+98
| | | | | | v2: pblendb -> pblendvb Reviewed-by: Roland Scheidegger <[email protected]>
* egl: Drop duplicate check on EGLSync typeChad Versace2016-10-041-6/+0
| | | | | | | | | | _eglInitSync checked that the display supported the sync type (such as EGL_SYNC_FENCE), and did it wrong. When the check failed it emitted EGL_BAD_ATTRIBUTE, but sometimes EGL_BAD_PARAMETER is needed. _eglCreateSync already does the error checking, and it does it right. Reviewed-by: Emil Velikov <[email protected]>
* egl: Cleanup control flow in _eglParseSyncAttribListChad Versace2016-10-041-6/+8
  When the function encountered an error, it effectively returned immediately. However, it did so indirectly by breaking out of a loop. Replace the loop breakout with an explicit 'return'.

  Do the same for _eglParseSyncAttribList64 too.

  Reviewed-by: Emil Velikov <[email protected]>
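The control-flow cleanup can be illustrated with a small attribute-list parser. The constants and the function below are hypothetical stand-ins rather than the EGL definitions; the sketch only shows the error case returning directly from inside the loop.

```c
#include <stddef.h>

/* Hypothetical attribute names and result codes for illustration. */
enum { ATTR_NONE = 0, ATTR_TYPE = 1 };
enum { PARSE_OK = 0, PARSE_BAD_ATTRIBUTE = 1 };

/* Parses a list of (name, value) pairs terminated by ATTR_NONE. */
int parse_attrib_list(const int *list, int *type_out)
{
    if (list == NULL)
        return PARSE_OK;

    for (size_t i = 0; list[i] != ATTR_NONE; i += 2) {
        switch (list[i]) {
        case ATTR_TYPE:
            *type_out = list[i + 1];
            break;
        default:
            /* Return right here instead of recording an error code and
             * breaking out of the loop to return it afterwards. */
            return PARSE_BAD_ATTRIBUTE;
        }
    }

    return PARSE_OK;
}
```

The explicit return makes it obvious that nothing after the failing attribute is examined.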
* egl: Add _eglConvertIntsToAttribs()Chad Versace2016-10-042-0/+43
  This function converts an attribute list from EGLint[] to EGLAttrib[]. It will be used in the following patches to clean up EGLSync attribute parsing.

  Reviewed-by: Emil Velikov <[email protected]>
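A conversion of this kind can be sketched as follows. This is not the function added by the commit: the typedefs, the terminator constant, and `widen_attribs` are hypothetical, with `int32_t` standing in for EGLint and `intptr_t` for EGLAttrib.

```c
#include <stdint.h>
#include <stdlib.h>

#define LIST_NONE 0             /* hypothetical list terminator */

typedef int32_t  narrow_attr;   /* stands in for EGLint    */
typedef intptr_t wide_attr;     /* stands in for EGLAttrib */

/* Returns a malloc'd widened copy of a (name, value) pair list, including
 * the terminator. Returns NULL for a NULL input or on allocation failure.
 * The caller frees the result. */
wide_attr *widen_attribs(const narrow_attr *in)
{
    size_t pairs = 0;

    if (in == NULL)
        return NULL;

    /* Only names (even indices) can terminate the list. */
    while (in[2 * pairs] != LIST_NONE)
        pairs++;

    const size_t count = 2 * pairs + 1;
    wide_attr *out = malloc(count * sizeof(*out));
    if (out == NULL)
        return NULL;

    for (size_t i = 0; i + 1 < count; i++)
        out[i] = in[i];         /* widening keeps the sign */
    out[count - 1] = LIST_NONE;

    return out;
}
```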
* egl: Fix an error path in eglCreateSync*Chad Versace2016-10-041-2/+12
| | | | | | | | | | When the user called eglCreateSync64KHR on a display without EGL_KHR_cl_event2 (the only extension that exposes it), we returned EGL_NO_SYNC but did not update the error code. We also did the same for eglCreateSync on a display without EGL 1.5. Reviewed-by: Emil Velikov <[email protected]>
* egl: Fix truncation error in _eglParseSyncAttribList64Chad Versace2016-10-041-3/+4
| | | | | | | | The function stores EGLAttrib values in EGLint variables. On 64-bit systems, this truncated the values. Cc: [email protected] Reviewed-by: Emil Velikov <[email protected]>
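Which values are at risk from this kind of truncation can be checked with a range test. The sketch assumes a 64-bit attribute type and a 32-bit integer type, modelled here with `int64_t` and `int32_t`; the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when a 64-bit attribute value survives being stored in a 32-bit
 * variable. Values outside this range would be damaged by the store. */
bool fits_in_int32(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}
```

Small enumerant-style values pass the check, while pointer-sized values such as handles can easily fall outside it, which is why the wider type has to be used end to end.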
* egl: Fix missing unlock in eglGetSyncAttribKHRChad Versace2016-10-041-1/+1
| | | | | | | | | | | On the error path, eglGetSyncAttribKHR neglected to unlock the EGLDisplay before returning. Fixes deadlock in dEQP-EGL.functional.fence_sync.invalid.get_invalid_value. Cc: [email protected] Cc: Mark Janes <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
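One common way to rule out this class of bug is to route every exit through a single unlock. The sketch below uses a hypothetical display type with a counter in place of a real mutex, so the lock balance is easy to observe; it does not reproduce how the commit itself fixed the error path.

```c
#include <stddef.h>

/* Hypothetical display whose "lock" is just a depth counter. */
struct display { int lock_depth; };

void display_lock(struct display *d)   { d->lock_depth++; }
void display_unlock(struct display *d) { d->lock_depth--; }

/* Returns 1 on success and 0 on error. Both outcomes leave through the
 * `out` label, so the display is unlocked exactly once either way. */
int get_sync_value(struct display *d, int attribute, int *value)
{
    int ok = 0;

    display_lock(d);

    if (value == NULL || attribute != 1)
        goto out;               /* error path still reaches the unlock */

    *value = 42;
    ok = 1;

out:
    display_unlock(d);
    return ok;
}
```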
* anv/gen7_pipeline: Fix typo in semicolonAnuj Phogat2016-10-041-1/+1
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/gen7_pipeline: Set sample mask field in 3DSTATE_PSAnuj Phogat2016-10-041-0/+3
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/gen7_pipeline: Move ksp{1,2} state setting next to ksp0Anuj Phogat2016-10-041-3/+2
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/gen7: Make use of local variable prog_dataAnuj Phogat2016-10-041-2/+2
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/gen8_pipeline: Add an assert to ensure use_alt_mode is not set in prog_dataAnuj Phogat2016-10-041-0/+1
| | | | Signed-off-by: Anuj Phogat <[email protected]>
* anv/gen8_pipeline: Fix typo in semicolonAnuj Phogat2016-10-041-1/+1
| | | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Keep the value name 'Alternate' uniform across gen75.xmlAnuj Phogat2016-10-041-3/+3
| | | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Fix typo in gen75.xmlAnuj Phogat2016-10-041-1/+1
| | | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/gen8+: Enable GL_OES_viewport_arrayAnuj Phogat2016-10-042-2/+4
  This patch causes 2 regressions in Khronos' GLES CTS tests on various Intel platforms.

  Failing tests:
  ES3-CTS.functional.state_query.integers.viewport_getinteger
  ES3-CTS.functional.state_query.integers.viewport_getfloat

  Here is an explanation of what's causing the failures: the CTS tests are not clamping the x, y location of the viewport's bottom-left corner as recommended by ARB_viewport_array and OES_viewport_array:

  "The location of the viewport's bottom-left corner, given by (x,y), are clamped to be within the implementation-dependent viewport bounds range. The viewport bounds range [min, max] tuple may be determined by calling GetFloatv with the symbolic constant VIEWPORT_BOUNDS_RANGE_OES"

  Khronos CTS merge request to fix the test case:
  https://gitlab.khronos.org/opengl/cts/merge_requests/399

  V2: Initialize the relevant variables for GL_OES_viewport_array on gen8+

  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
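The clamping that the quoted specification text describes amounts to the following. The struct and function names are hypothetical, and the bounds used in any test of this sketch are made-up numbers; a real implementation reports its own range.

```c
/* Hypothetical holder for an implementation's viewport bounds range. */
struct bounds_range {
    float min;
    float max;
};

/* Clamps one viewport origin coordinate (x or y) into [min, max]. */
float clamp_viewport_coord(float value, struct bounds_range range)
{
    if (value < range.min)
        return range.min;
    if (value > range.max)
        return range.max;
    return value;
}
```

A test that sets a viewport origin outside the range and then queries it back has to apply the same clamp to its expected value, which is the adjustment the referenced CTS merge request is said to make.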
* mesa: Add a check for OES_viewport_arrayAnuj Phogat2016-10-041-1/+3
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: Enable enums for OES_viewport_arrayAnuj Phogat2016-10-042-4/+10
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* anv/gen7_pipeline: Use MSDISPMODE_PERSAMPLE for non-multisampled fboAnuj Phogat2016-10-041-1/+2
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/blorp: Handle zero width/height blits in blorp_copy()Anuj Phogat2016-10-041-1/+4
| | | | | | | V2: Move the check from copy_buffer_to_image() to blorp_copy(). (Nanley) Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Add an assert to check zero width/height surfaceAnuj Phogat2016-10-041-0/+3
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* st/omx/dec/h265: add scaling list dataLeo Liu2016-10-041-5/+97
  Specified by subclause 7.3.4

  v2: get the loop optimized

  Signed-off-by: Leo Liu <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* st/omx/dec/h265: fix the skip for before and after listLeo Liu2016-10-041-3/+4
  A reference picture set (rps) is not always used. Once the unused flag is detected in the encoded bitstream, the rps should not be added to any list; otherwise we pass an incorrect reference and skip the correct rps.

  Signed-off-by: Leo Liu <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* st/omx/dec/h265: set the default reference picture set for referenceLeo Liu2016-10-041-2/+4
  This fixes corruption for frames that have only one short term reference picture set. Previously we set a NULL rps for this case, which caused an incorrect reference to be taken. Instead, we should take that single short term set as the reference.

  Signed-off-by: Leo Liu <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* st/omx/dec/h265: decoder size should follow from spsLeo Liu2016-10-042-7/+8
  The video size from the format container is not always compatible with the size from the codec bitstream. The HW decoder should take the size information from the bitstream; otherwise corruption appears with clips whose size info differs between the bitstream and the format container.

  So we are passing width(height)_in_samples from the sequence parameter set to the video decoder.

  Signed-off-by: Leo Liu <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* st/omx/dec/h265: increase dpb max size to 32Leo Liu2016-10-041-1/+1
  For clips with a frame delta poc over 16.

  Signed-off-by: Leo Liu <[email protected]>
* nir/spirv: Remove a duplicate spirv2nir from .gitignoreEric Engestrom2016-10-041-1/+0
| | | | | | | | | | This reverts commit fc03ecfeaf5a10a8b84d366f24f02e74ab03b145. Chad had already pushed the same change between me posting the patch and Jason pushing it: 44bcf1ffcced04fd7f2b (".gitignore: Ignore src/compiler/spirv2nir") Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* radeonsi: optionally run the LLVM IR verifier passNicolai Hähnle2016-10-045-9/+38
  This is enabled automatically if shader printing is enabled, or separately by R600_DEBUG=checkir.

  Catch malformed IR before it crashes in a later pass.

  Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: fix argument type of llvm.{cttz,ctlz}.i32 intrinsicsNicolai Hähnle2016-10-041-2/+2
| | | | | | Caught by R600_DEBUG=checkir (next commit). Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: unify the creation of basic blocksNicolai Hähnle2016-10-041-10/+24
| | | | | | | This changes the order of basic blocks to be equal to the order of code in the original TGSI, which is nice for making sense of shader dumps. Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: merge branch and loop flow control stacksNicolai Hähnle2016-10-042-82/+78
| | | | Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: simplify if/else/endif blocksNicolai Hähnle2016-10-042-25/+18
| | | | | | | In particular, we no longer emit an else block when there is no ELSE instruction. Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: label basic blocks by the corresponding TGSI pcNicolai Hähnle2016-10-041-0/+17
| | | | Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: cleanup and fix branch emitsNicolai Hähnle2016-10-041-37/+14
| | | | | | | | | | | | | Some of the existing code is needlessly complicated. The basic principle should be: control-flow opcodes emit branches to properly terminate the current block, _unless_ the current block already has a terminator (which happens if and only if there was a BRK or CONT). This also fixes a bug where multiple terminators were created in a block. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97887 Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>
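The principle stated above (emit a terminating branch unless the block already has a terminator) can be modelled with a toy block structure. The types and functions are hypothetical and do not use the LLVM API; they only demonstrate that following the rule can never leave a block with two terminators.

```c
#include <stdbool.h>

/* Toy basic block: remembers how many terminators were emitted into it. */
struct block {
    int terminator_count;
    int branch_target;      /* label of the block branched to */
};

bool block_has_terminator(const struct block *b)
{
    return b->terminator_count > 0;
}

/* Models a BRK or CONT: always terminates the current block. */
void emit_break(struct block *b, int loop_exit)
{
    b->branch_target = loop_exit;
    b->terminator_count++;
}

/* Models the branch a control-flow opcode adds to close the current
 * block. It is skipped when an earlier BRK or CONT already closed it. */
void emit_default_branch(struct block *b, int target)
{
    if (block_has_terminator(b))
        return;

    b->branch_target = target;
    b->terminator_count++;
}
```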
* winsys/radeon: add buffer_get_reloc_offsetNicolai Hähnle2016-10-044-2/+25
  Really fix the bug that was supposed to be fixed by commits 3e7cced4b and a48bf02d: even when virtual addresses are used, the legacy relocation-based method with offsets relative to the kernel's buffer object is used for video submissions.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97969
  Reviewed-by: Christian König <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>