path: root/src
Commit message (Author, Age, Files, Lines)
* mesa: whitespace fixes in enable.c (Brian Paul, 2017-06-16, 1 file, -68/+68)
  Remove trailing whitespace, replace tabs w/ spaces, etc. Trivial.
* i965: Convert SF_STATE to genxml. (Rafael Antognolli, 2017-06-16, 5 files, -288/+83)
  This patch finishes the work done by Ken of converting SF_STATE to genxml, and merges it with gen6+ code for emitting that state.
  Signed-off-by: Rafael Antognolli <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* genxml: The viewport state offset is actually an address. (Rafael Antognolli, 2017-06-16, 1 file, -1/+1)
  This fixes code generation on gen45.
  Signed-off-by: Rafael Antognolli <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* genxml: Rename fields to match gen6+. (Rafael Antognolli, 2017-06-16, 3 files, -3/+3)
  Rename "Anti-aliasing Enable" to "Anti-Aliasing Enable".
  Signed-off-by: Rafael Antognolli <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* genxml: Rename SF_STATE field to match gen6+. (Rafael Antognolli, 2017-06-16, 3 files, -9/+9)
  Rename "Use Point Width State" to "Point Width Source". It accepts the same values and has the same meaning as on gen6+, so let's keep the same name to simplify the code.
  Signed-off-by: Rafael Antognolli <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: aa_line_distance_mode should be before the padding. (Rafael Antognolli, 2017-06-16, 1 file, -1/+1)
  It seems that it was never set correctly.
  Signed-off-by: Rafael Antognolli <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* swr/rast: Fix read-back of viewport array index (Tim Rowley, 2017-06-16, 10 files, -117/+182)
  Binner/clipper read viewport array index from the vertex header as needed. Move viewport state to BACKEND_STATE.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Refactor includes to limit simdintrin.h usage (Tim Rowley, 2017-06-16, 16 files, -1079/+1147)
  Reduces the files rebuilt after modifying simdintrin.h from 84 to 64.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix read-back of render target array index (Tim Rowley, 2017-06-16, 5 files, -13/+18)
  The last FE stage can emit render target array index. Currently we only check to see if GS is emitting it. Moved the state to BACKEND_STATE and plumbed the driver to set it.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Adjust cast for gcc warning (Tim Rowley, 2017-06-16, 1 file, -1/+1)
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Don't transition hottile resolved->dirty during store tiles (Tim Rowley, 2017-06-16, 1 file, -1/+4)
  Fixes crash when dumping render targets and RT surface has been deleted.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: gen_llvm_types.py support for SIMD256/SIMD512 (Tim Rowley, 2017-06-16, 1 file, -6/+6)
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Properly size GS stage scratch space (Tim Rowley, 2017-06-16, 1 file, -1/+1)
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix early z / query interaction (Tim Rowley, 2017-06-16, 1 file, -0/+4)
  For certain cases, we perform early z as an optimization. The GL_SAMPLES_PASSED query was providing erroneous results because we were counting the number of samples passed before the fragment shader, which did not work if the fragment shader contained a discard.
  Account properly for discard and early z by ANDing the zpass mask with the post-fragment-shader active mask, after the fragment shader has run.
  Fixes the following piglit tests:
  - occlusion-query-discard
  - occlusion_query_meta_fragments
  Reviewed-by: Bruce Cherniak <[email protected]>
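The mask arithmetic described in the early z fix can be sketched as follows. This is a minimal illustration, not the swr rasterizer's code: the helper names `count_bits` and `samples_passed` and the 32-bit mask width are assumptions made for the example.

```c
#include <stdint.h>

/* Count the set bits in a per-sample mask (portable popcount). */
static unsigned count_bits(uint32_t mask)
{
    unsigned n = 0;
    while (mask) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical helper: samples that should count toward GL_SAMPLES_PASSED.
 * zpass_mask  - samples that passed the (early) depth test
 * active_mask - samples still alive after the fragment shader;
 *               a discard clears the corresponding bits
 */
static unsigned samples_passed(uint32_t zpass_mask, uint32_t active_mask)
{
    return count_bits(zpass_mask & active_mask);
}
```

Counting `zpass_mask` alone would report four samples for a quad whose shader discards two of them; ANDing with the active mask first reports two.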
* swr/rast: Share vertex memory between VS input/output (Tim Rowley, 2017-06-16, 1 file, -5/+2)
  Removes large simdvertex stack allocation. Vertex shader must ensure reads happen before writes.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add support for dynamic vertex size for VS output (Tim Rowley, 2017-06-16, 3 files, -15/+23)
  Add support for dynamic vertex size for the vertex shader output. Add new state in SWR_FRONTEND_STATE to specify the size.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: SIMD16 FE - improve calcDeterminantIntVertical (Tim Rowley, 2017-06-16, 1 file, -12/+20)
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add support to PA for variable sized vertices (Tim Rowley, 2017-06-16, 4 files, -26/+38)
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Rework attribute layout (Tim Rowley, 2017-06-16, 4 files, -66/+103)
  Move fixed attributes to the top and pack single component SGVs. WIP to support dynamically allocated vertex size.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Remove explicit primitive id slot in the vertex layout (Tim Rowley, 2017-06-16, 7 files, -58/+33)
  - Remove any special casing in the PS stage when primitive ID is input. Treat as a normal attribute that must be set up properly in the FE linkage.
  - Remove primitive id from the PS_CONTEXT and TRI_FLAGS
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix invalid 16-bit format traits for A1R5G5B5 (Tim Rowley, 2017-06-16, 1 file, -100/+48)
  Correctly handle formats of <= 16 bits where the component bits don't add up to the pixel size.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Implement JIT shader caching to disk (Tim Rowley, 2017-06-16, 10 files, -18/+358)
  Disabled by default; currently doesn't cache shaders (fs,gs,vs).
  Reviewed-by: Bruce Cherniak <[email protected]>
* gallium/docs: improve docs for SAMPLE_POS, SAMPLE_INFO, TXQS, MSAA semantics (Brian Paul, 2017-06-16, 1 file, -11/+47)
  For the SAMPLE_POS and SAMPLE_INFO opcodes, clarify resource vs. render target queries, range of position values, swizzling, etc. We basically follow the DX10.1 conventions.
  For the TXQS opcode and TGSI_SEMANTIC_SAMPLEID, clarify return value and type.
  For the TGSI_SEMANTIC_SAMPLEPOS system value, clarify the range of positions returned.
  v2: use 'undef' for unused vector components. Use (0.5, 0.5, undef, undef) for sample pos when MSAA not applicable.
  v3: Add note that OPCODE_SAMPLE_INFO, OPCODE_SAMPLE_POS are not used yet and the information is subject to change.
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* svga: add some missing SVGA_STATS_* enum values, prefix strings (Brian Paul, 2017-06-16, 1 file, -2/+15)
  To fix the build when VMX86_STATS is defined. Also, some minor whitespace changes to match upstream code.
  Reviewed-by: Charmaine Lee <[email protected]>
* swr: Don't crash when encountering a VBO with stride = 0. (Bruce Cherniak, 2017-06-16, 1 file, -7/+18)
  The swr driver uses vertex_buffer->stride to determine the number of elements in a VBO. A recent change to the state-tracker made it possible for VBOs to have stride=0. This resulted in a divide by zero crash in the driver. The solution is to use the pre-calculated vertex element stream_pitch in this case.
  This patch fixes the crash in a number of piglit and VTK tests introduced by 17f776c27be266f2.
  There are several VTK tests that still crash and need proper handling of vertex_buffer_index. This will come in a follow-on patch.
  v2: Correctly update all parameters for VBO constants (stride = 0). Also fixes the remaining crashes/regressions that v1 did not address, without touching vertex_buffer_index.
  Reviewed-by: Tim Rowley <[email protected]>
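The divide-by-zero guard described in the stride fix can be illustrated with a small sketch. It is not the swr driver's code: `vbo_element_count` and its parameters are hypothetical, and returning 0 when both pitches are zero is an assumption made to keep the example total.

```c
#include <stdint.h>

/* Hypothetical helper: number of elements in a vertex buffer of `size` bytes.
 * When stride is 0 (a constant attribute), fall back to the pre-calculated
 * per-element stream pitch instead of dividing by zero. */
static uint32_t vbo_element_count(uint32_t size, uint32_t stride,
                                  uint32_t stream_pitch)
{
    uint32_t pitch = stride ? stride : stream_pitch;

    if (pitch == 0)
        return 0;   /* assumption: nothing sensible to compute */

    return size / pitch;
}
```

With a 64-byte buffer and a 16-byte stream pitch, both `stride = 16` and `stride = 0` yield four elements, and no input can trigger a division by zero.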
* intel/isl: Add the maximum surface size limit (Anuj Phogat, 2017-06-16, 1 file, -0/+22)
  V2: Use 2^31 bytes (2GB) surface size limit on pre-gen9 and 2^38 bytes for gen9+.
  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Nanley Chery <[email protected]>
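A size check along the lines of the limits quoted above might look like the sketch below. The function name `surface_size_ok`, the integer `gen` parameter, and the inclusive upper bound are assumptions for illustration, not isl's actual interface. The limit is computed in 64 bits because 2^38 does not fit in a 32-bit integer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: is a surface of `size_bytes` within the limit
 * for hardware generation `gen`?
 *   pre-gen9: 2^31 bytes (2GB)
 *   gen9+:    2^38 bytes
 * Treating the limit as inclusive is an assumption of this sketch. */
static bool surface_size_ok(int gen, uint64_t size_bytes)
{
    const uint64_t max_size =
        (gen >= 9) ? (UINT64_C(1) << 38) : (UINT64_C(1) << 31);

    return size_bytes <= max_size;
}
```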
* intel/isl: Use uint64_t to store total surface size (Anuj Phogat, 2017-06-16, 2 files, -2/+3)
  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Nanley Chery <[email protected]>
* i965: Mark freshly allocate bo as idle (Chris Wilson, 2017-06-16, 1 file, -0/+1)
  When created, buffers are idle, so mark them as such to save an early ioctl and to avoid mistakenly assuming the fresh buffer is busy.
  Signed-off-by: Chris Wilson <[email protected]>
  Cc: Kenneth Graunke <[email protected]>
  Cc: Matt Turner <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* etnaviv: add rs-operations sw query (Christian Gmeiner, 2017-06-16, 5 files, -0/+8)
  It could be useful to get the number of emitted resolve operations when doing driver optimizations.
  Signed-off-by: Christian Gmeiner <[email protected]>
  Reviewed-by: Lucas Stach <[email protected]>
* etnaviv: advertise correct max LOD bias (Lucas Stach, 2017-06-16, 1 file, -1/+3)
  The maximum LOD bias supported is the same as the max texture level supported.
  Fixes piglit: ext_texture_lod_bias
  Fixes: c9e8b49b ("etnaviv: gallium driver for Vivante GPUs")
  Cc: [email protected]
  Signed-off-by: Lucas Stach <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: mask correct channel for RB swapped rendertargets (Lucas Stach, 2017-06-16, 3 files, -13/+46)
  Now that we support RB swapped targets by using a shader variant, we must derive the color mask from both the blend state and the bound framebuffer.
  Fixes piglit: fbo-colormask-formats
  Fixes: 7f62ffb68ad ("etnaviv: add support for rb swap")
  Cc: [email protected]
  Signed-off-by: Lucas Stach <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
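Deriving the mask from both the blend state and the framebuffer can be sketched as below. This is only an illustration of the idea: the `MASK_*` bit values and the `effective_colormask` helper are hypothetical and do not reproduce the etnaviv driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical channel bits for the sketch. */
enum { MASK_R = 1, MASK_G = 2, MASK_B = 4, MASK_A = 8 };

/* Combine the blend state's write mask with the framebuffer's channel order.
 * If the bound render target stores red and blue swapped, the red and blue
 * bits of the mask have to trade places as well. */
static uint8_t effective_colormask(uint8_t blend_mask, bool rb_swap)
{
    if (!rb_swap)
        return blend_mask;

    uint8_t out = blend_mask & (MASK_G | MASK_A);   /* G and A are unaffected */
    if (blend_mask & MASK_R)
        out |= MASK_B;
    if (blend_mask & MASK_B)
        out |= MASK_R;
    return out;
}
```

Without the swap, a mask that enables only red would write the hardware channel that actually holds blue on an RB swapped target.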
* etnaviv: replace translate_clear_color with util_pack_color (Lucas Stach, 2017-06-16, 2 files, -48/+12)
  This replaces the open coded etnaviv version of the color pack with the common util_pack_color.
  Fixes piglits:
  - arb_color_buffer_float-clear
  - fcc-front-buffer-distraction
  - fbo-clearmipmap
  Fixes: c9e8b49b ("etnaviv: gallium driver for Vivante GPUs")
  Cc: [email protected]
  Signed-off-by: Lucas Stach <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: remove bogus assert (Lucas Stach, 2017-06-16, 1 file, -2/+0)
  etna_resource_copy_region handles resources with multiple samples by falling back to the software path. There is no need to kill the application there.
  Fixes: c9e8b49b ("etnaviv: gallium driver for Vivante GPUs")
  Cc: [email protected]
  Signed-off-by: Lucas Stach <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: use padded width/height for resource copies (Lucas Stach, 2017-06-16, 1 file, -2/+2)
  When copying a resource fully we can just blit the whole level. This allows using the RS even for level sizes not aligned to the RS min alignment. This is especially useful because etna_copy_resource is part of the software fallback paths (used in etna_transfer) that handle unaligned copies.
  Fixes: c9e8b49b ("etnaviv: gallium driver for Vivante GPUs")
  Cc: [email protected]
  Signed-off-by: Lucas Stach <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: don't try RS blit if blit region is unaligned (Lucas Stach, 2017-06-16, 1 file, -1/+2)
  If the blit region is not aligned to the RS min alignment, don't try to execute the blit, but fall back to the software path.
  Fixes: c9e8b49b ("etnaviv: gallium driver for Vivante GPUs")
  Cc: [email protected]
  Signed-off-by: Lucas Stach <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
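An alignment test of the kind this commit describes could be sketched like this. The helper `blit_region_aligned` is hypothetical, and the alignment values are passed in as parameters because the commit message does not state the RS minimum alignment.

```c
#include <stdbool.h>

/* Hypothetical check: can a blit region go to the hardware resolve path?
 * The origin and the size must all be multiples of the minimum alignment;
 * otherwise the caller is expected to use the software fallback. */
static bool blit_region_aligned(unsigned x, unsigned y,
                                unsigned width, unsigned height,
                                unsigned align_w, unsigned align_h)
{
    if (align_w == 0 || align_h == 0)
        return false;   /* invalid alignment: be conservative */

    return x % align_w == 0 && width % align_w == 0 &&
           y % align_h == 0 && height % align_h == 0;
}
```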
* Revert "amd/common: add missing libdrm include path" (Emil Velikov, 2017-06-16, 1 file, -1/+0)
  This reverts commit 44b29dd7b6cdc1a3fde58c367b9de8081ac4167b.
  Should no longer be required as of last patch.
  Cc: Eric Engestrom <[email protected]>
* ac: remove amdgpu.h dependency (Emil Velikov, 2017-06-16, 2 files, -2/+6)
  Add a couple of forward declarations and drop the amdgpu.h requirement. With this we can build the r300 and r600 drivers without the need for amdgpu.
  v2:
  - Add amdgpu.h include in the C file (Marek)
  - Add a comment about pre C11 typedef redeclaration warning (Eric)
  Cc: Nicolai Hähnle <[email protected]>
  Cc: Marek Olšák <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101189
  Signed-off-by: Emil Velikov <[email protected]>
* r600g,compute: provide local copy of functions from ac_binary.c (Jan Vesely, 2017-06-16, 7 files, -46/+199)
  This is a verbatim copy of the code. The functions can be cleaned up since r600 does not use all the stuff that gcn does. The symbol names have been changed since we still use ac_binary.h header (for struct definition).
  v2:
  - Add ifdef guard around r600_binary_clean call (Aaron)
  - Remove stray comment
  Signed-off-by: Jan Vesely <[email protected]>
  Tested-By: Aaron Watry <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* r600: android: amdgpu_common is only required when building OpenCL (Jan Vesely, 2017-06-16, 1 file, -5/+0)
  v2: split off Android changes
  Signed-off-by: Jan Vesely <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* egl/display: make platform detection thread-safe (Eric Engestrom, 2017-06-16, 1 file, -7/+12)
  Imagine there are 2 threads that both call _eglGetNativePlatform() simultaneously:
  - thread 1 completes the first "if (native_platform == _EGL_INVALID_PLATFORM)" check and is preempted to do something else
  - thread 2 executes the whole function, does "native_platform = _EGL_NATIVE_PLATFORM" and just before returning it's preempted
  - thread 1 wakes up and calls _eglGetNativePlatformFromEnv() which returns _EGL_INVALID_PLATFORM because no env vars are set, updates native_platform and then gets preempted again
  - thread 2 wakes up and returns wrong _EGL_INVALID_PLATFORM
  Solve this by doing the detection in a local var and only overwriting the global one at the end, if no other thread has updated it since.
  This means the platform detected in the thread might not be the platform returned by the function, but this is a different issue that will need to be discussed when this becomes possible.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101252
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Grazvydas Ignotas <[email protected]>
  Acked-by: Emil Velikov <[email protected]>
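The "detect into a local, publish only if nobody else has" pattern from this commit can be sketched with C11 atomics. This is an illustration under assumptions, not Mesa's implementation: the `PLATFORM_*` values, `detect_platform`, and `get_native_platform` are hypothetical stand-ins, and the use of `<stdatomic.h>` is a choice made for this example.

```c
#include <stdatomic.h>

/* Hypothetical platform identifiers for the sketch. */
enum { PLATFORM_INVALID = 0, PLATFORM_X11, PLATFORM_WAYLAND };

/* Shared cache of the detected platform. */
static _Atomic int native_platform = PLATFORM_INVALID;

/* Stand-in for the real detection logic (env vars, build defaults, ...). */
static int detect_platform(void)
{
    return PLATFORM_X11;
}

static int get_native_platform(void)
{
    int cached = atomic_load(&native_platform);
    if (cached != PLATFORM_INVALID)
        return cached;              /* detection already done */

    /* Work on a local copy so a half-finished result is never visible. */
    int detected = detect_platform();

    /* Publish only if the global is still unset. On failure, `expected`
     * is overwritten with the value another thread stored first. */
    int expected = PLATFORM_INVALID;
    if (atomic_compare_exchange_strong(&native_platform, &expected, detected))
        return detected;

    return expected;
}
```

As the commit message notes, a thread that loses the race returns the value published by the winner, which may differ from what it detected itself.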
* egl/display: only detect the platform once (Eric Engestrom, 2017-06-16, 1 file, -14/+17)
  My refactor missed the fact that `native_platform` is static. Add the proper guard around the detection code, as it might not be necessary, and only print the debug message when a detection was actually performed.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101252
  Fixes: 7adb9b094894a512c019 ("egl/display: remove unnecessary code and make it easier to read")
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Grazvydas Ignotas <[email protected]>
  Acked-by: Emil Velikov <[email protected]>
* svga: Relax the format checks for copy_region_vgpu10 somewhat (Thomas Hellstrom, 2017-06-16, 1 file, -2/+26)
  The new generic checks were actually more restrictive than the previous svga-specific tests and not vice versa. So bypass the common format checks for copy_region_vgpu10.
  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Neha Bhende <[email protected]>
* svga: Fix incorrect format conversion blit destination (Thomas Hellstrom, 2017-06-16, 1 file, -1/+3)
  The blit.dst.resource member that was used as destination was modified earlier in the function, effectively making us try to blit the content onto itself. Fix this and also add a debug printout when the format conversion blits fail.
  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Neha Bhende <[email protected]>
* svga: Fix srgb copy_region regression (Thomas Hellstrom, 2017-06-16, 1 file, -1/+4)
  This fixes a tf2 srgb copy_region regression from "svga: Rework the blit and resource_copy_region functionality v3"
  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* svga: Prefer accelerated blits over cpu copy region (Thomas Hellstrom, 2017-06-16, 1 file, -5/+3)
  This reduces the number of cpu copy_region fallbacks on an Nvidia system running the piglit command
  ./publish/bin/piglit run -1 -t copy -t blit tests/quick
  from 64789 to 780.
  Previously this has caused a regression in piglit test spec@!opengl [email protected], but I'm currently not able to reproduce that regression.
  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
* svga: Support accelerated conditional blitting (Thomas Hellstrom, 2017-06-16, 4 files, -43/+62)
  The blitter has functions to save and restore the conditional rendering state, but we currently don't save the needed info. Since the copy_region_vgpu10 path also supports conditional blitting, we instead use the same function as the clearing routines and move that function to svga_pipe_query.c.
  Note that we still haven't implemented conditional blitting with the software fallbacks.
  Fixes piglit nv_conditional_render::copyteximage
  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
* svga: Use utility functions to help determine whether we can use copy_region (Thomas Hellstrom, 2017-06-16, 1 file, -6/+3)
  It seems like the SVGA tests are in general more stringent than the utility tests, but they also miss some blitter features like filters and window rectangles, and if new blitter features are added in the future, we might forget to add tests for those. So in addition to the SVGA tests, use the utility tests to restrict the situations where we can use copy_region.
  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
* svga: Rework the blit and resource_copy_region functionality v3 (Thomas Hellstrom, 2017-06-16, 1 file, -201/+445)
  This work was initially triggered by the fact that imported surfaces may be backed by other SVGA3D formats than the default. Therefore some fixes were needed to avoid using the copy_region_vgpu10() functionality for incompatible SVGA3D formats where the pipe formats were OK. This situation happens when using dri3.
  Also in some situations, for example where a R8G8_UNORM surface is backed by an SVGA3D_NV12 format, we can't use the copy_region functionality at all and thus need to fall back to the quad blitter also for the resource_copy_region function. This situation doesn't happen currently, but will if we start using video textures.
  The patch makes the blit and copy_region paths similar, and the decision whether to use a certain gpu command should now be easy to locate. The resource_copy_region path will probably suffer a minor additional cpu overhead, but on the other hand there are more cases now that we accelerate, since we try harder before falling back to cpu copies / blits.
  v2: Addressed review comments and fixed up piglit failures by sometimes preferring cpu_copy_region() over blit().
  v3: Removed a stray test statement. Updated commit message.
  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
* i965: Improve conditional rendering in fallback paths. (Kenneth Graunke, 2017-06-15, 3 files, -47/+48)
  We need to fall back in a couple of cases:
  - Sandybridge (it just doesn't do this in hardware)
  - Occlusion queries on Gen7-7.5 with command parser version < 2
  - Transform feedback overflow queries on Gen7, or on Gen7.5 with command parser version < 7
  In these cases, we printed a perf_debug message and fell back to _mesa_check_conditional_render(), which stalls until the full query result is available. Additionally, the code to handle this was a bit of a mess.
  We can do better by using our normal conditional rendering code, and setting a new state, BRW_PREDICATE_STATE_STALL_FOR_QUERY, when we would have set BRW_PREDICATE_STATE_USE_BIT. Only if that state is set do we perf_debug and potentially stall. This means we avoid stalls when we have a partial query result (i.e. we know it's > 0, but don't have the full value). The perf_debug should trigger less often as well.
  Still, this is primarily intended as a cleanup.
  Reviewed-by: Jason Ekstrand <[email protected]>
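The state selection described in this commit can be sketched roughly as follows. The enum and the `pick_predicate_state` helper are hypothetical simplifications (they ignore inverted conditions, for instance) and are not the i965 driver's actual definitions; only the idea of a separate "stall" state replacing the hardware-bit state on fallback paths comes from the message above.

```c
#include <stdbool.h>

/* Hypothetical, simplified predicate states. */
enum predicate_state {
    PRED_RENDER,            /* result known: draw */
    PRED_DONT_RENDER,       /* result known: skip the draw */
    PRED_USE_BIT,           /* result unknown: let the hardware predicate */
    PRED_STALL_FOR_QUERY    /* result unknown, no hardware support: stall */
};

/* result_known   - the condition is already decided, e.g. a partial
 *                  result shows the query value is > 0
 * result_nonzero - the decided outcome, valid only if result_known
 * hw_predication - hardware (and command parser) can predicate draws
 */
static enum predicate_state pick_predicate_state(bool result_known,
                                                 bool result_nonzero,
                                                 bool hw_predication)
{
    if (result_known)
        return result_nonzero ? PRED_RENDER : PRED_DONT_RENDER;

    return hw_predication ? PRED_USE_BIT : PRED_STALL_FOR_QUERY;
}
```

In this sketch, only a draw that sees `PRED_STALL_FOR_QUERY` would need to wait for the full query result; a known outcome never stalls, even on hardware without predication.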
* mesa: stop assigning unused storage for non-bindless opaque types (Timothy Arceri, 2017-06-16, 1 file, -36/+6)
  The storage was once used by get_sampler_uniform_value() but that was fixed long ago to use the uniform storage assigned by the linker.
  By not assigning storage for images/samplers, the constant buffer for gallium drivers will be reduced, which could result in small perf improvements.
  V2: rebase on ARB_bindless_texture
  Reviewed-by: Samuel Pitoiset <[email protected]>