summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* swr: [rasterizer common] icc declspec definitionsTim Rowley2016-07-201-1/+17
| | | | Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer jitter] rework vertex/instance ID storage in fetchTim Rowley2016-07-202-64/+36
| | | | | | | | Moved the setting into the existing component control code. Fixes bad interaction between attribute/component setting for vertex/instance ID and component packing. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] avx512 simd utility workTim Rowley2016-07-204-10/+1026
| | | | | | Enabling KNOB_SIMD_WIDTH = 16 for AVX512 pre-work and low level simd utils Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] viewport rounding for disabled scissorTim Rowley2016-07-201-2/+4
| | | | | | | Adjust viewport rounding when scissor rect is disabled during macro tile scissor setup. Signed-off-by: Tim Rowley <[email protected]>
* i965: Stop muging cube array lengths by 6Jason Ekstrand2016-07-205-38/+11
| | | | | | | | | | | | | | | | | | | | | | | | From the Sky Lake PRM: "For SURFTYPE_CUBE: For Sampling Engine Surfaces and Typed Data Port Surfaces, the range of this field is [0,340], indicating the number of cube array elements (equal to the number of underlying 2D array elements divided by 6). For other surfaces, this field must be zero." In other words, the depth field for cube maps is in number of cubes not number of 2-D slices so we need to divide by 6. ISL will do this correctly for us assuming that we provide it with the correct array bounds which it expects to be in 2-D slices. It appears as if we've been doing this wrong ever since we first added cube map arrays for Sandy Bridge and the change to ISL made things slightly worse. While we're at it, we now need to remoe the shader hacks we've always done since they were only needed because we were setting the depth field six times too large. v2: Fix the vec4 backend as well (not sure how I missed this). Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/miptree: Set logical_depth0 == 6 for cube mapsJason Ekstrand2016-07-201-4/+11
| | | | | | | | | | | | | This matches what we do for cube maps where logical_depth0 is in number of face-layers rather than number of cubes. This does mean that we will temporarily be setting the surface bounds too loose for cube map textures but we are already setting them too loose for cube arrays and we will be fixing that in the next commit anyway. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Cc: "12.0 11.2 11.1" <[email protected]>
* i965/miptree: Enforce that height == 1 for 1-D array texturesJason Ekstrand2016-07-202-19/+5
| | | | | | | | | | | | | | | | | | | The GL API and mesa internals do this differently than we do. In GL, there is no depth parameter for 1-D arrays and height is used. In the i965 miptree code we do the sane thing and make height == 1 and use depth for number of slices. This makes for a mismatch every time we create a 1-D array texture from GL. Instead of actually solving this problem, we just said "1-D is hard, let's make sure it works no matter which way we pass the parameters" and called it a day. This commit fixes the one GL -> i965 transition point where we weren't already handling 1-D array textures to do the right thing and then replaces the magic fixup code with an assert that you're doing the right thing. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Cc: "12.0 11.2 11.1" <[email protected]>
* Avoid overflow in 'last' variable of FindGLXFunction(...)Stefan Dirsch2016-07-201-3/+3
| | | | | | | | | | | | | | This 'last' variable used in FindGLXFunction(...) may become negative, but has been defined as unsigned int resulting in an overflow, finally resulting in a segfault when accessing _glXDispatchTableStrings[...]. Fixed this by definining it as signed int. 'first' variable also needs to be defined as signed int. Otherwise condition for while loop fails due to C implicitly converting signed to unsigned values before comparison. Cc: <[email protected]> Signed-off-by: Stefan Dirsch <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* egl/android: Stop leaking DRI imagesTomasz Figa2016-07-201-0/+11
| | | | | | | | | | | | | | | | Current implementation of the DRI image loader does not free the images created in get_back_bo() and so leaks memory. Moreover, it creates a new image every time the DRI driver queries for buffers, even if the backing native buffer has not changed. leaking memory again. This patch adds missing call to destroyImage() in droid_enqueue_buffer() and a check if image is already created to get_back_bo() to fix the above. Cc: "11.2 12.0" <[email protected]> Signed-off-by: Tomasz Figa <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* egl/android: Add some useful error messagesTomasz Figa2016-07-201-3/+10
| | | | | | | | | | It is much easier to debug issues when the application gives some meaningful error messages. This patch adds few to the EGL Android platform backend. Signed-off-by: Tomasz Figa <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* egl/android: Check return value of dri2_get_dri_config()Tomasz Figa2016-07-201-0/+2
| | | | | | | | | It might return NULL if specific config variant is unsupported. Cc: "11.2 12.0" <[email protected]> Signed-off-by: Tomasz Figa <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* i965: store reference to the context within struct brw_fence (v2)Emil Velikov2016-07-201-11/+44
| | | | | | | | | | | | | | As the spec allows for {server,client}_wait_sync to be called without currently bound context, while our implementation requires context pointer. v2: Add a mutex and acquire it for the duration of brw_fence_client_wait() and brw_fence_is_completed() as suggested by Chad. Cc: "11.2 12.0" <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Signed-off-by: Tomasz Figa <[email protected]>
* egl/dri2: dri2_make_current: Set EGL error if bindContext failsNicolas Boichat2016-07-201-2/+8
| | | | | | | | | | | | | Without this, if a configuration is, say, available only on GLES2/3, but not on GLES1, and is rejected by the dri module's bindContext call, eglMakeCurrent fails with error "EGL_SUCCESS". In this patch, we set error to EGL_BAD_MATCH, which is what CTS/dEQP dEQP-EGL.functional.surfaceless_context expect. Cc: "11.2 12.0" <[email protected]> Signed-off-by: Nicolas Boichat <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* egl/android: Remove unused variablesTomasz Figa2016-07-201-2/+0
| | | | | | | | | There are some unused variables left after previous clean-ups triggering compiler warnings. Let's remove them. Cc: "12.0" <[email protected]> Signed-off-by: Tomasz Figa <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium/dri: Add shared glapi to LIBADD on AndroidTomasz Figa2016-07-201-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | An earlier patch fixed the problem for classic drivers, however Gallium was still left broken. This patch applies the same workaround to Gallium, when compiled for Android. Following is a quote from the original patch: 0cbc90c57cfc mesa: dri: Add shared glapi to LIBADD on Android /system/vendor/lib/dri/*_dri.so actually depend on libglapi: without this, loading the so file fails with: cannot locate symbol "__emutls_v._glapi_tls_Context" On non-Android (non-bionic) platform, EGL uses the following workflow, which works fine: dlopen("libglapi.so", RTLD_LAZY | RTLD_GLOBAL); dlopen("dri/<driver>_dri.so", RTLD_NOW | RTLD_GLOBAL); However, bionic does not respect the RTLD_GLOBAL flag, and the dri library cannot find symbols in libglapi.so, so we need to link to libglapi.so explicitly. Android.mk already does this. Cc: "12.0" <[email protected]> Signed-off-by: Tomasz Figa <[email protected]> Signed-off-by: Nicolas Boichat <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* mesa: scons: remove left over src/glsl includeEmil Velikov2016-07-201-1/+0
| | | | | | The path no longer exists. Signed-off-by: Emil Velikov <[email protected]>
* mesa: scons: list builddir before srcdirEmil Velikov2016-07-201-4/+4
| | | | | | | | | | | Analogous to previous commit. Note: scons always uses OOT builds, while the in-tree generated files could be created either manually or by the autoconf build. Cc: "11.2 12.0" <[email protected]> Cc: Alexander von Gluck IV <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* mesa: automake: list builddir before srcdirEmil Velikov2016-07-201-3/+3
| | | | | | | | | | In the case of building in out-of-tree fashion, while having generated in-tree sources, the latter [likely stale] files will be used. Flip the order to prevent that. Cc: "11.2 12.0" <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* radeonsi: advertise 8 bits subpixel precision for viewport boundsJózef Kucia2016-07-201-1/+2
| | | | | Signed-off-by: Józef Kucia <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* r600: advertise 8 bits subpixel precision for viewport boundsJózef Kucia2016-07-201-1/+2
| | | | | Signed-off-by: Józef Kucia <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallium: add a cap for VIEWPORT_SUBPIXEL_BITS (v2)Józef Kucia2016-07-2018-0/+21
| | | | | | | | | | | | This allows Gallium drivers to advertise the subpixel precision for floating point viewports bounds. v2: - Set ViewportSubpixelBits in st_init_limits. Signed-off-by: Józef Kucia <[email protected]> Signed-off-by: Marek Olšák <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: disable MS images on GM107+Samuel Pitoiset2016-07-201-0/+7
| | | | | | | | MS images have to be handled explicitly and I don't plan to implement them for now. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: print OP_SUREDB subops in debug modeSamuel Pitoiset2016-07-201-0/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gm107/ir: add emission for SUREDxSamuel Pitoiset2016-07-201-0/+50
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gm107/ir: add emission for SUSTx and SULDxSamuel Pitoiset2016-07-201-0/+104
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gm107/ra: fix constraints for surface operationsSamuel Pitoiset2016-07-201-2/+23
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gm107/ir: lower surface operationsSamuel Pitoiset2016-07-202-1/+77
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bind images for 3d/cp shaders on GM107+Samuel Pitoiset2016-07-205-18/+207
| | | | | | | | | On Maxwell, images binding is slightly different (and much better) regarding Fermi and Kepler because a texture view needs to be uploaded for each image and this is going to simplify the thing a lot. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: increase the tex handles area size in the driver cbSamuel Pitoiset2016-07-201-11/+11
| | | | | | | | | | | | Currently, we can store 32 tex handles of 32-bits integer each and that fits perfectly with the underlying hardware except on GM107+ which requires to upload a texture view for each images. This patch increases the number of storable texture handles in the driver constant buffer from 32 to 40 because we expose 8 images. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nir: Fix uninitialized use of 'replacement'.Kenneth Graunke2016-07-191-1/+1
| | | | | | | | | | | For intrinsics we don't care about, just skip to the next loop iteration and process the next instruction. We don't want to execute the rest of the code. This was a bug in commit cdfc05ea6e8c87876cdbf588aa8e03d70f3da4bb. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965: Use tex_mocs instead of rb_mocs for GL images.Kenneth Graunke2016-07-191-1/+1
| | | | | | | | | | | | | | Fixes a 10-20% performance regression in OglCSDof caused by commit 5a8c89038abab0184ea72664ab390ec6ca58b4d6, which made images (in the image load/store sense) use BDW_MOCS_PTE instead of BDW_MOCS_WB. This seems sketchy, as the default PTE value is supposed to be WB LLC eLLC, which is the same as our MOCS WB setting. It's only supposed to change when using a surface for display, which won't ever happen for images. Something may be wrong in the kernel... Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* winsys/amdgpu: use pb_cache buckets for fewer pb_cache missesMarek Olšák2016-07-191-6/+21
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/radeon: use pb_cache buckets for fewer pb_cache missesMarek Olšák2016-07-191-7/+22
| | | | | | This makes Bioshock Infinite with deferred flushing 2.2% faster. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/pb_cache: reduce the number of pointer dereferencesMarek Olšák2016-07-191-7/+9
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/pb_cache: divide the cache into buckets for reducing cache missesMarek Olšák2016-07-195-26/+47
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/pb_cache: check parameters that are more likely to fail firstMarek Olšák2016-07-191-8/+7
| | | | | | This makes Bioshock Infinite with deferred flushing 2% faster. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: emit PS exports lastMarek Olšák2016-07-191-13/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This effectively removes s_waitcnt instructions after FP16 exports. Before: v_cvt_pkrtz_f16_f32_e32 v0, v0, v1 ; 5E000300 v_cvt_pkrtz_f16_f32_e32 v1, v2, v3 ; 5E020702 exp 15, 0, 1, 0, 0, v0, v1, v0, v0 ; F800040F 00000100 s_waitcnt expcnt(0) ; BF8C0F0F v_cvt_pkrtz_f16_f32_e32 v0, v4, v5 ; 5E000B04 v_cvt_pkrtz_f16_f32_e32 v1, v6, v7 ; 5E020F06 exp 15, 1, 1, 0, 0, v0, v1, v0, v0 ; F800041F 00000100 s_waitcnt expcnt(0) ; BF8C0F0F v_cvt_pkrtz_f16_f32_e32 v0, v8, v9 ; 5E001308 v_cvt_pkrtz_f16_f32_e32 v1, v10, v11 ; 5E02170A exp 15, 2, 1, 0, 0, v0, v1, v0, v0 ; F800042F 00000100 s_waitcnt expcnt(0) ; BF8C0F0F v_cvt_pkrtz_f16_f32_e32 v0, v12, v13 ; 5E001B0C v_cvt_pkrtz_f16_f32_e32 v1, v14, v15 ; 5E021F0E exp 15, 3, 1, 1, 1, v0, v1, v0, v0 ; F8001C3F 00000100 s_endpgm ; BF810000 After: v_cvt_pkrtz_f16_f32_e32 v0, v0, v1 ; 5E000300 v_cvt_pkrtz_f16_f32_e32 v1, v2, v3 ; 5E020702 v_cvt_pkrtz_f16_f32_e32 v2, v4, v5 ; 5E040B04 v_cvt_pkrtz_f16_f32_e32 v3, v6, v7 ; 5E060F06 exp 15, 0, 1, 0, 0, v0, v1, v0, v0 ; F800040F 00000100 v_cvt_pkrtz_f16_f32_e32 v4, v8, v9 ; 5E081308 v_cvt_pkrtz_f16_f32_e32 v5, v10, v11 ; 5E0A170A exp 15, 1, 1, 0, 0, v2, v3, v0, v0 ; F800041F 00000302 v_cvt_pkrtz_f16_f32_e32 v6, v12, v13 ; 5E0C1B0C v_cvt_pkrtz_f16_f32_e32 v7, v14, v15 ; 5E0E1F0E exp 15, 2, 1, 0, 0, v4, v5, v0, v0 ; F800042F 00000504 exp 15, 3, 1, 1, 1, v6, v7, v0, v0 ; F8001C3F 00000706 s_endpgm ; BF810000 Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set optimal settings in COMPUTE_RESOURCE_LIMITSMarek Olšák2016-07-191-2/+6
| | | | | | ported from Vulkan Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: really wait for the second EOP event and not the first oneMarek Olšák2016-07-191-1/+5
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove RADEON_FLUSH_KEEP_TILING_FLAGS flagMarek Olšák2016-07-195-16/+4
| | | | | | always set Reviewed-by: Nicolai Hähnle <[email protected]>
* nir/algebraic: Optimize fabs(u2f(x))Ian Romanick2016-07-191-0/+1
| | | | | | | | | | I noticed this when I tried to do frexp(float(some_unsigned)) in the ir_unop_find_lsb lowering pass. The code generated for frexp() uses fabs, and this resulted in an extra instruction. Ultimately I ended up not using frexp. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* st/mesa: Enable MESA_shader_integer_functions on all GLSL 1.30 platformsIan Romanick2016-07-192-1/+16
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Enable MESA_shader_integer_functions on all GLSL 1.30 platformsIan Romanick2016-07-192-6/+15
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Don't lower uaddCarry and usubBorrow in both GLSL IR and NIRIan Romanick2016-07-191-3/+1
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Update assertion to account for Gen < 7Ian Romanick2016-07-191-1/+4
| | | | | | | | | | | | Previously SHADER_OPCODE_MULH could only exist on Gen7+, so the assertion assumed the Gen7+ accumulator rules. A future patch will allow this instruction on at least Gen6, so update the assertion. v2: Use get_lowered_simd_width instead of open coding it. Suggested by Curro. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]> [v1]
* i965: Use LZD to implement nir_op_find_lsb on Gen < 7Ian Romanick2016-07-192-3/+45
| | | | | | | v2: Rebase on changes to previous two patches. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Use LZD to implement nir_op_ifind_msb on Gen < 7Ian Romanick2016-07-192-21/+90
| | | | | | | | | v2: Retype LZD source as UD to avoid potential problems with 0x80000000. Suggested by Matt. Also update comment about problem values with LZD(abs(x)). Suggested by Curro. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Use LZD to implement nir_op_ufind_msbIan Romanick2016-07-194-1/+54
| | | | | | | | | | This uses one less instruction. v2: Move emit_find_msb_using_lzd out of the visitor classes. Suggested by Curro. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Always enable GL_ARB_shading_language_packingIan Romanick2016-07-191-1/+1
| | | | | | | | | With the existing lowering passes, the functions from this extension become a bunch of bit twiddling operations that have always been supported. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Move enable of EXT_shader_integer_mixIan Romanick2016-07-191-1/+2
| | | | | | | | This extension does not depend on the Gen. It only depends on the availability of GLSL 1.30. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>