path: root/src
Commit message | Author | Age | Files | Lines
* mesa/glspirv: pick off the only entry point we need | Iago Toral Quiroga | 2018-08-13 | 1 | -0/+15
    This is the same as we do for Vulkan drivers.

    This is needed to pass the following CTS test:
    KHR-GL45.gl_spirv.spirv_modules_shader_binary_multiple_shader_objects_test

    Reviewed-by: Timothy Arceri <[email protected]>
* mesa/glspirv: compute double inputs and remap attributes | Alejandro Piñeiro | 2018-08-13 | 1 | -0/+19
    Input locations used by input attributes are not handled the same way in OpenGL and Vulkan. There is a detailed explanation of the differences in the following commit:
    c2acf97fcc9b32eaa9778771282758e5652a8ad4

    So with this commit, the same adjustment that is done after glsl_to_nir is also done after spirv_to_nir when it is used on OpenGL (ARB_gl_spirv).

    Reviewed-by: Timothy Arceri <[email protected]>
* nir/glsl: make nir_remap_attributes public | Alejandro Piñeiro | 2018-08-13 | 3 | -17/+27
    As we plan to reuse it for the ARB_gl_spirv implementation.

    Reviewed-by: Timothy Arceri <[email protected]>
* nir/lower_samplers: don't assume a deref for both texture and sampler srcs | Alejandro Piñeiro | 2018-08-13 | 1 | -53/+58
    Commit "nir: Use derefs in nir_lower_samplers" (75286c2d083cdbdfb202a93349e567df0441d5f7) assumes one deref for both the texture and the sampler. However there are cases (on OpenGL, using ARB_gl_spirv) where SPIR-V does not provide a sampler, like for texture query levels ops. Although we could make spirv_to_nir provide a sampler deref for those cases, it is not really needed, and it would be wrong from the Vulkan point of view.

    This patch fixes the following (borrowed) tests run in SPIR-V mode:
        arb_compute_shader/execution/basic-texelFetch.shader_test
        arb_gpu_shader5/execution/sampler_array_indexing/fs-simple-texture-size.shader_test
        arb_texture_query_levels/execution/fs-baselevel.shader_test
        arb_texture_query_levels/execution/fs-maxlevel.shader_test
        arb_texture_query_levels/execution/fs-miptree.shader_test
        arb_texture_query_levels/execution/fs-nomips.shader_test
        arb_texture_query_levels/execution/vs-baselevel.shader_test
        arb_texture_query_levels/execution/vs-maxlevel.shader_test
        arb_texture_query_levels/execution/vs-miptree.shader_test
        arb_texture_query_levels/execution/vs-nomips.shader_test
        glsl-1.30/execution/fs-textureSize-compare.shader_test

    v2: merge lower_tex_src_to_offset and calc_sampler_offsets together, update texture/sampler index and texture_array_size directly on lower_tex_src_to_offset (Jason)
    v3: clarify one comment (Jason)

    Reviewed-by: Jason Ekstrand <[email protected]>
* nir/linker: take into account hidden uniforms | Alejandro Piñeiro | 2018-08-13 | 2 | -1/+8
    So they are not exposed through the introspection API.

    It is worth noting that the number of hidden uniforms with GLSL linking vs SPIR-V linking will be somewhat different due to the different order of the nir lowerings/optimizations.

    For example: gl_FbWposYTransform. This is introduced as part of nir_lower_wpos_ytransform. On GLSL that is executed after the IR-based linking, so on GLSL the UniformStorage will not include this uniform. With the SPIR-V linking, that uniform is already present, but marked as hidden. So it will be included in the UniformStorage, but as hidden.

    One alternative would be to create a special how_declared for that case, but that seemed overkill. Using hidden should be OK as long as it is used properly.

    Reviewed-by: Timothy Arceri <[email protected]>
* nir: add how_declared to nir_variable.data | Alejandro Piñeiro | 2018-08-13 | 3 | -1/+26
    Equivalent to the already existing how_declared in GLSL IR. The only difference is that we are not adding all the declaration_type values available in GLSL IR, only the ones that we will use in the short term. We can add more modes if needed in the future.

    Reviewed-by: Timothy Arceri <[email protected]>
* spirv: Make VertexIndex and VertexId both non-zero-based | Neil Roberts | 2018-08-13 | 1 | -7/+7
    GLSL has gl_VertexID which is supposed to be non-zero-based. SPIR-V has both VertexIndex and VertexId builtins whose meanings are defined by the APIs. Vulkan defines VertexIndex as being non-zero-based. In Vulkan VertexId and InstanceId have no meaning and are pretty much just reserved for OpenGL at this point. GL_ARB_gl_spirv removes VertexIndex and defines VertexId to be the same as gl_VertexID (which is also non-zero-based).

    Previously Mesa was treating VertexIndex as non-zero-based and VertexId as zero-based, so it was breaking for GL. This behaviour was apparently based on Khronos bug 14255. However that bug doesn't seem to have made a final decision for VertexId. Assuming there really is no other definition for VertexId for Vulkan it seems better to just make them both have the same value.

    v2: update comment and commit descriptions, based on Jason Ekstrand's explanation of the meaning/rationale behind all those builtins (Jason)

    Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: fill info.gs.input_primitive too | Alejandro Piñeiro | 2018-08-13 | 1 | -0/+2
    info.gs.output_primitive was already being filled. Not sure why this is not needed on Vulkan, but we found it to be needed for ARB_gl_spirv.

    Specifically, this is needed to get the following test passing:
    KHR-GL45.gl_spirv.spirv_validation_builtin_variable_decorations_test

    Reviewed-by: Timothy Arceri <[email protected]>
* i965: enable EXT_render_snorm | Tapani Pälli | 2018-08-13 | 1 | -0/+1
    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Nanley Chery <[email protected]>
* mesa: enable EXT_render_snorm extension | Tapani Pälli | 2018-08-13 | 6 | -9/+61
    This patch makes additional formats renderable and enables the extension when OpenGL ES 3.1 is supported.

    v2: instead of dummy_true, have a separate toggle for extension (Eric Anholt)
    v3: add missing checks, simplify some existing checks and fix glCopyTexImage2D check (Nanley Chery); add SHORT and BYTE support in read_pixels_es3_error_check

    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Nanley Chery <[email protected]>
* blorp: Properly handle Z24X8 blits. | Kenneth Graunke | 2018-08-11 | 2 | -12/+11
    One of the reasons we didn't notice that R24_UNORM_X8_TYPELESS destinations were broken was that an earlier layer was swapping it out for B8G8R8A8_UNORM. That made Z24X8 -> Z24X8 blits work. However, R32_FLOAT -> R24_UNORM_X8_TYPELESS was still totally broken. The old code only considered one format at a time, without thinking that format conversion may need to occur.

    This patch moves the translation out to a place where it can consider both formats. If both are Z24X8, we continue using B8G8R8A8_UNORM to avoid having to do shader math workarounds. If we have a Z24X8 destination, but a non-matching source, we use our shader hacks to actually render to it properly.

    Fixes: 804856fa5735164cc0733ad0ea62adad39b00ae2 (intel/blorp: Handle more exotic destination formats)
    Reviewed-by: Jason Ekstrand <[email protected]>
* blorp: Don't try to use R32_UNORM for R24_UNORM_X8_TYPELESS rendering. | Kenneth Graunke | 2018-08-11 | 1 | -5/+5
    The hardware doesn't support rendering to R24_UNORM_X8_TYPELESS, so Jason decided to fake it with a bit of shader math and R32_UNORM RTs. The only problem is that R32_UNORM isn't renderable either, so we've just traded one bad format for another.

    This patch makes us use R32_UINT instead.

    Fixes: 804856fa5735164cc0733ad0ea62adad39b00ae2 (intel/blorp: Handle more exotic destination formats)
    Reviewed-by: Jason Ekstrand <[email protected]>
* intel: Switch the order of the 2x MSAA sample positions | Jason Ekstrand | 2018-08-11 | 4 | -14/+24
    The Vulkan 1.1.82 spec flipped the order to better match D3D.

    Cc: [email protected]
    Reviewed-by: Iago Toral Quiroga <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
* mesa/st/tests: Add array life range estimation and renumbering tests | Gert Wollny | 2018-08-11 | 1 | -0/+211
    Signed-off-by: Gert Wollny <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* mesa/st/tests: Add array life range tests infrastructure to common test class | Gert Wollny | 2018-08-11 | 2 | -27/+186
    Signed-off-by: Gert Wollny <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* mesa/st/glsl_to_tgsi: Expose array live range tracking and merging | Gert Wollny | 2018-08-11 | 5 | -17/+40
    This patch ties in the array split, merge, and interleave code.

    shader-db changes in the TGSI code are:

                    original code |  array-merge  |        change
                    mean     max  |  mean    max  |  best   mean %  worst
        ------------------------------------------------------------------
        arrays       0.05      2  |   0.00     0  |    -2  -100         0
        total temps  5.05     21  |   4.92    20  |   -15    -2.59      1
        instr       55.33    988  |  55.20   988  |   -15    -0.24      0

    Evaluation:

    Run shader-db in single thread mode (otherwise the output is not ordered and the best and worst column don't make sense) to get results pre-stats.txt and post-stats.txt.

    Then using python pandas:

        import pandas as pd
        old_stats = pd.read_csv('pre-stats.txt')
        new_stats = pd.read_csv('post-stats.txt')
        omean = old_stats.mean()
        omax = old_stats.max()
        nmean = new_stats.mean()
        nmax = new_stats.max()
        delta = new_stats - old_stats
        pd.concat([omean, omax, nmean, nmax, delta.min(),
                   delta.mean()/old_stats.mean()*100, delta.max()],
                  axis=1,
                  keys=['mean', 'max', 'mean', 'max', 'best', 'avg change %', 'worst'])

    v4: - Correct typo and add bugs that are fixed by this series.
        - Update stats and describe stats evaluation

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105371
              https://bugs.freedesktop.org/show_bug.cgi?id=100200
    Signed-off-by: Gert Wollny <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* mesa/st/glsl_to_tgsi: add array life range evaluation into tracking code | Gert Wollny | 2018-08-11 | 1 | -12/+50
    v4: Also track the register given in inst->resource. (thanks: Benedikt Schemmer for testing the patches on radeonsi, which revealed that I was missing tracking this)

    Signed-off-by: Gert Wollny <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* mesa/st/glsl_to_tgsi: add class for array access tracking | Gert Wollny | 2018-08-11 | 1 | -0/+102
    Because of the indirect access it is impossible to obtain an accurate per-component and per-array-element tracking. Therefore, the tracking is simplified to only track whether any element was accessed, and whether this happened conditionally in a loop. In addition, while tracking of temporaries requires a per-component tracking that is later fused, for arrays only the component access mask is needed.

    The resulting tracking code and evaluation of the array live range is sufficiently different from the evaluation of the live range of temporaries to justify implementing this in a different class instead of adding more complexity to the already existing code for temporary life range evaluation.

    v4: Update commit message to make it clearer why this class is separate from the tracking of temporaries.

    Signed-off-by: Gert Wollny <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* mesa/st/glsl_to_tgsi: move evaluation of read mask up in the call hierarchy | Gert Wollny | 2018-08-11 | 1 | -7/+8
    In preparation for the array live range tracking, the evaluation of the read mask is moved out of the register live range tracking to the enclosing call of the generalized read access tracking.

    Signed-off-by: Gert Wollny <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* mesa/st/glsl_to_tgsi: rename access_record to register_merge_record and some more renames | Gert Wollny | 2018-08-11 | 2 | -32/+33
    In preparation for adding the tracking of the live range, the classes that refer to temporary registers are renamed.

    Reviewed-by: Nicolai Hähnle <[email protected]>
    Signed-off-by: Gert Wollny <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* mesa/st/tests: Add tests for array merge helper classes. | Gert Wollny | 2018-08-11 | 4 | -7/+787
    v2: - Define tests also in the meson.build file.
    v4: - Check no-op mapping of all bits.
        - Convert tests to the new class layout used in the merge evaluation.
        - Remove dependency on llvm in meson build (thanks Dylan Baker for pointing out that this might not be needed)

    Signed-off-by: Gert Wollny <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* mesa/st/glsl_to_tgsi: Add array merge logic | Gert Wollny | 2018-08-11 | 2 | -2/+407
    v4: - Update the code to use the new merge logic.
        - Use a cleaner, class-based approach for the evaluation of merges.

    Signed-off-by: Gert Wollny <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* mesa/st/glsl_to_tgsi: Add helper classes to apply array merging and interleaving | Gert Wollny | 2018-08-11 | 2 | -1/+164
    v4: - Remove logic for evaluation of swizzles and merges since this was moved to array_live_range. This class now only handles the actual remapping.

    Signed-off-by: Gert Wollny <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* mesa/st/glsl_to_tgsi: Add helper class for array live range merging and interleaving | Gert Wollny | 2018-08-11 | 4 | -0/+319
    This class holds the array length, live range, and accessed components, and it implements the logic for evaluating how arrays are merged and interleaved.

    v4: - Add logic to evaluate merge and interleave of a pair of arrays to the class array_live_range.
        - document class
        - update commit message

    Thanks Nicolai Hähnle for the pointers given.

    Signed-off-by: Gert Wollny <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* mesa/st/glsl_to_tgsi: rename lifetime to register_live_range | Gert Wollny | 2018-08-11 | 6 | -83/+90
    On one hand "live range" is the term used in the literature, and on the other hand a distinction is needed from the array live ranges.

    v4: Fix indentation and whitespace

    Reviewed-by: Nicolai Hähnle <[email protected]> (v3)
    Signed-off-by: Gert Wollny <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* mesa/st/glsl_to_tgsi: Properly resolve life times simple if/else + use constructs | Gert Wollny | 2018-08-11 | 2 | -0/+39
    In constructs like the one below, the live range estimation currently extends the live range of t unnecessarily to the whole loop, because it is not detected that t is unconditionally written and later read only in the "if (a)" scope.

        while (foo) {
          ...
          if (a) {
            ...
            if (b)
              t = ...
            else
              t = ...
            x = t;
            ...
          }
          ...
        }

    This patch adds a unit test for this case and corrects the minimal live range estimation accordingly.

    v4: update comments

    Signed-off-by: Gert Wollny <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* mesa/st/glsl_to_tgsi: Split arrays whose elements are only accessed directly | Gert Wollny | 2018-08-11 | 1 | -1/+112
    Arrays whose elements are only accessed directly are replaced by the corresponding number of temporary registers. By doing so the otherwise reserved register range becomes subject to further optimizations like copy propagation and register merging.

    Thanks to the resulting reduced register pressure this patch makes the piglits

        spec/glsl-1.50/execution
          - variable-indexing/vs-output-array-vec3-index-wr-before-gs
          - geometry/max-input-components

    pass on r600 (barts) where they would fail before with a "GPR limit exceeded" error (even with the spilling that was recently added).

    v2: * rename method dissolve_arrays to split_arrays
        * unify the tracking and remapping methods for src and dst registers
        * also track access to arrays via reladdr*
    v3: * enable this optimization only if the driver requests register merge
    v4: * Correct comments
        * Also update inst->resource if it is an array element (thanks: Benedikt Schemmer for testing the patches on radeonsi, which revealed that I was missing tracking this)

    Signed-off-by: Gert Wollny <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* mesa/st/glsl_to_tgsi: Add method to collect some TGSI statistics | Gert Wollny | 2018-08-11 | 1 | -0/+68
    When Mesa is compiled in debug mode, this adds the possibility to print out some statistics about the translated and optimized TGSI shaders to a file.

    The functionality is enabled by setting the environment variable GLSL_TO_TGSI_PRINT_STATS to the name of the file where the statistics should be collected. The file is opened in append mode so that statistics from various runs will be accumulated.

    v4: Make access to log file thread safe (thanks for pointing this out Nicolai Hähnle)

    Signed-off-by: Gert Wollny <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* Gallium/tgsi: Correct signedness of return value of bit operations | Gert Wollny | 2018-08-11 | 1 | -3/+4
    The GLSL operations findLSB, findMSB, and countBits always return a signed integer type. Let TGSI reflect this.

    v2: Properly set values in infer_(src|dst)_type (thanks Roland Scheidegger for pointing out problems with my 1st approach)
    v2: Set values in the common infer_type code path, and only add the correct source type for UMSB (Roland Scheidegger)

    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* meson: Build with Python 3 | Mathieu Bridon | 2018-08-10 | 30 | -74/+74
    Now that all the build scripts are compatible with both Python 2 and 3, we can flip the switch and tell Meson to use the latter.

    Since Meson already depends on Python 3 anyway, this means we don't need two different Python stacks to build Mesa.

    Signed-off-by: Mathieu Bridon <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* python: Rework bytes/unicode string handling | Mathieu Bridon | 2018-08-10 | 1 | -10/+31
    In both Python 2 and 3, opening a file without specifying the mode will open it for reading in text mode ('r').

    On Python 2, the read() method of a file object opened in mode 'r' will return byte strings, while on Python 3 it will return unicode strings.

    Explicitly specifying the binary mode ('rb') and then decoding the byte string means we always handle unicode strings on both Python 2 and 3, which in turn means all re.match(line) will return unicode strings as well.

    If we also make expandCString return unicode strings, we don't need the call to the unicode() constructor any more.

    We were using the ugettext() method because it always returns unicode strings in Python 2, unlike gettext(), which returns byte strings. The ugettext() method doesn't exist on Python 3, so we must use the right method on each version of Python.

    The last hurdles are that Python 3 doesn't let us concatenate unicode and byte strings directly, and that Python 2's stdout wants encoded byte strings while Python 3's wants unicode strings.

    With these changes, the script gives the same output on both Python 2 and 3.

    Signed-off-by: Mathieu Bridon <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
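    The "open in binary mode, then decode" approach described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the Mesa script: the helper name read_unicode, the UTF-8 encoding and the temporary file contents are all assumptions made for the example.

```python
import os
import tempfile


def read_unicode(path, encoding="utf-8"):
    """Return the file contents as a unicode string on Python 2 and 3.

    Hypothetical helper; the encoding is an assumption for this sketch.
    """
    # 'rb' yields byte strings on both Python versions, so the decode
    # step is the single place where bytes become unicode.
    with open(path, "rb") as handle:
        return handle.read().decode(encoding)


# Write a small sample file so the sketch is self-contained.
fd, sample_path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as handle:
        handle.write(u"P\u00e4lli\n".encode("utf-8"))
    text = read_unicode(sample_path)
finally:
    os.remove(sample_path)
```

    Keeping the decode in one helper means the rest of a script only ever sees unicode strings, which is the property the commit message relies on for re.match().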
* python: Fix inequality comparisons | Mathieu Bridon | 2018-08-10 | 3 | -0/+18
    On Python 3, executing `foo != bar` will first try to call foo.__ne__(bar), and fall back on the opposite result of foo.__eq__(bar).

    Python 2 does not do that.

    As a result, those __eq__ methods were never called when we were testing for inequality.

    Explicitly adding the __ne__ methods fixes this issue, in a way that is compatible with both Python 2 and 3.

    However, this means the __eq__ methods are now called when testing for `foo != None`, so they need to be guarded correctly.

    Signed-off-by: Mathieu Bridon <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
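    A minimal sketch of the pattern this commit message describes: an explicit __ne__ that defers to __eq__, with __eq__ guarded against unrelated operands such as None. The Token class is hypothetical and is not one of the classes touched by the commit.

```python
class Token(object):
    """Hypothetical value class used only for this illustration."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Guard: `token != None` now reaches this method through __ne__,
        # so refuse to compare against unrelated types.
        if not isinstance(other, Token):
            return NotImplemented
        return self.name == other.name

    def __ne__(self, other):
        # Python 2 does not derive != from ==, so define it explicitly.
        result = self.__eq__(other)
        if result is NotImplemented:
            return result
        return not result


same = Token("a") == Token("a")
different = Token("a") != Token("b")
not_none = Token("a") != None  # noqa: E711 -- exercises the guard on purpose
```

    Returning NotImplemented instead of raising lets Python fall back to its default comparison, so comparing against None does not crash on an attribute lookup.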
* mesa/st: ETC2 now uses R8G8B8A8_SRGB as fallback | Gert Wollny | 2018-08-10 | 1 | -1/+1
    The check for ETC2 compatibility was not updated when the fallback format was changed.

    Fixes: 71867a0a61cea20bf3f6115692e70b0d60f0b70d st/mesa: Fall back to R8G8B8A8_SRGB for ETC2
    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* python: Simplify list sorting | Mathieu Bridon | 2018-08-09 | 1 | -4/+2
    Instead of copying the list, then sorting the copy in-place, we can just get a new sorted copy directly.

    Signed-off-by: Mathieu Bridon <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* python: Use key-functions when sorting containers | Mathieu Bridon | 2018-08-09 | 1 | -2/+3
    In Python 2, the traditional way to sort containers was to use a comparison function (which returned either -1, 0 or 1 when passed two objects) and pass that as the "cmp" argument to the container's sort() method.

    Python 2.4 introduced key-functions, which instead only operate on a given item, and return a sorting key for this item.

    In general, this runs faster, because the cmp-function has to get run multiple times for each item of the container.

    Python 3 removed the cmp-function, enforcing usage of key-functions instead.

    This change makes the script compatible with Python 2 and Python 3.

    Signed-off-by: Mathieu Bridon <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
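    The difference between the two sorting styles can be sketched like this. The data and function names are made up for the example and do not come from the Mesa script; functools.cmp_to_key is shown only as the standard-library bridge for an existing comparison function.

```python
from functools import cmp_to_key

# Hypothetical (name, bits) pairs, used only for illustration.
formats = [("RGBA8", 32), ("R8", 8), ("RG16", 32)]


def by_bits_then_name(item):
    """Key function: called once per item, returns the sort key."""
    name, bits = item
    return (bits, name)


def compare(a, b):
    """Old-style comparison: negative, zero or positive for a pair."""
    by_bits = (a[1] > b[1]) - (a[1] < b[1])
    return by_bits or (a[0] > b[0]) - (a[0] < b[0])


# Works unchanged on Python 2.4+ and Python 3.
ordered = sorted(formats, key=by_bits_then_name)

# A legacy comparison function can still be used by wrapping it.
legacy = sorted(formats, key=cmp_to_key(compare))
```

    Both calls produce the same ordering; the key-function version is the one that needs no wrapper on either Python version.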
* python: Better check for integer types | Mathieu Bridon | 2018-08-09 | 2 | -5/+16
    Python 3 lost the long type: now everything is an int, with the right size.

    This commit makes the script compatible with Python 2 (where we check for both int and long) and Python 3 (where we only check for int).

    Signed-off-by: Mathieu Bridon <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
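    One way to write such a version-dependent check is sketched below. The names integer_types and is_integer are assumptions for this example; the commit itself may structure the check differently.

```python
import sys

if sys.version_info[0] >= 3:
    # Python 3: int is the only integer type.
    integer_types = (int,)
else:
    # Python 2 only: this branch never runs on Python 3,
    # so the missing `long` name is not a problem there.
    integer_types = (int, long)  # noqa: F821


def is_integer(value):
    """Return True for any integer value on either Python version."""
    return isinstance(value, integer_types)


small = is_integer(5)
large = is_integer(2 ** 80)  # a long on Python 2, an int on Python 3
fractional = is_integer(1.5)
```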
* python: Do not mix bytes and unicode strings | Mathieu Bridon | 2018-08-09 | 1 | -1/+10
    Mixing the two is a long-standing recipe for errors in Python 2, so much so that Python 3 now completely separates them.

    This commit stops treating both as if they were the same, and in the process makes the script compatible with both Python 2 and 3.

    Signed-off-by: Mathieu Bridon <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* python: Explicitly use a list | Mathieu Bridon | 2018-08-09 | 1 | -2/+2
    On Python 2, the builtin function filter() returns a list.

    On Python 3, it returns an iterator.

    Since we want to use those objects in contexts where we need lists, we need to explicitly turn them into lists.

    This makes the code compatible with both Python 2 and Python 3.

    Signed-off-by: Mathieu Bridon <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
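    A short sketch of why the explicit list() matters; the values are made up for the example.

```python
values = [3, -1, 4, -1, 5]

# On Python 3, filter() alone returns a lazy iterator that supports
# neither len() nor indexing; wrapping it in list() restores both.
# On Python 2 the wrapper is a harmless copy.
positives = list(filter(lambda v: v > 0, values))

count = len(positives)
first = positives[0]
```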
* python: Use the right function for the job | Mathieu Bridon | 2018-08-09 | 1 | -1/+1
    The code was just reimplementing itertools.combinations_with_replacement in a less efficient way.

    This does change the order of the results slightly, but it should be ok.

    Signed-off-by: Mathieu Bridon <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
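    For reference, this is what itertools.combinations_with_replacement produces for a small input. The item names are placeholders, and the sketch does not reproduce the code that the commit replaced.

```python
from itertools import combinations_with_replacement

items = ["a", "b", "c"]  # placeholder values for illustration

# Unordered pairs where an item may be paired with itself;
# ("b", "a") is not emitted because ("a", "b") already covers it.
pairs = list(combinations_with_replacement(items, 2))
```

    For three items and pairs of two this yields six tuples, emitted in lexicographic order of the input positions.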
* egl: Fix leak of X11 pixmaps backing pbuffers in DRI3. | Eric Anholt | 2018-08-09 | 1 | -0/+5
    This is basically copied from the DRI2 destroy path. Without this, Raspberry Pi would quickly run out of CMA during the EGL tests in the CTS due to all the pixmaps lying around.

    Fixes: f35198badeb9 ("egl/x11: Implement dri3 support with loader's dri3 helper")
    Reviewed-by: Emil Velikov <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* intel: Fix SIMD16 unaligned payload GRF reads on Gen4-5. | Kenneth Graunke | 2018-08-09 | 1 | -0/+20
    When the SIMD16 Gen4-5 fragment shader payload contains source depth (g2-3), destination stencil (g4), and destination depth (g5-6), the single register of stencil makes the destination depth unaligned.

    We were generating this instruction in the RT write payload setup:

        mov(16) m14<1>F g5<8,8,1>F { align1 compr };

    which is illegal: instructions with a source region spanning more than one register need to be aligned to even registers. This is because the hardware implicitly does (nr | 1) instead of (nr + 1) when splitting the compressed instruction into two mov(8)'s.

    I believe this would cause the hardware to load g5 twice, replicating subspan 0-1's destination depth to subspan 2-3. This showed up as 2x2 artifact blocks in both TIS-100 and Reicast.

    Normally, we rely on the register allocator to even-align our virtual GRFs. But we don't control the payload, so we need to lower SIMD widths to make it work. To fix this, we teach lower_simd_width about the restriction, and then call it again after lower_load_payload (which is what generates the offending MOV).

    Fixes: 8aee87fe4cce0a883867df3546db0e0a36908086 (i965: Use SIMD16 instead of SIMD8 on Gen4 when possible.)
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107212
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=13728
    Reviewed-by: Jason Ekstrand <[email protected]>
    Tested-by: Diego Viola <[email protected]>
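    The (nr | 1) versus (nr + 1) arithmetic in this commit message can be checked with a few lines of Python. This only illustrates the register numbering, under the assumption that nr is the register number of the first half; the function names are made up and none of this is driver code.

```python
def hardware_second_half(nr):
    """Register read for the second mov(8), as the commit describes it."""
    return nr | 1


def intended_second_half(nr):
    """Register that should be read: the one following the first half."""
    return nr + 1


# Even-aligned source (g4): both expressions agree on g5.
even_ok = hardware_second_half(4) == intended_second_half(4)

# Odd source (g5): the hardware computes g5 again instead of g6,
# so the first register is loaded twice.
odd_hw = hardware_second_half(5)
odd_wanted = intended_second_half(5)
```

    The two expressions only agree when nr is even, which is why the source region has to start on an even register.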
* i965: Only enable depth IZ signals if there's an actual depthbuffer. | Kenneth Graunke | 2018-08-09 | 1 | -3/+8
    According to the G45 PRM Volume 2 Page 265 we're supposed to only set these signals when there is an actual depth buffer. Note that we already do this for the stencil buffer by virtue of brw->stencil_enabled invoking _mesa_is_stencil_enabled(ctx) which checks whether the current drawbuffer's visual has stencil bits (which is updated based on what buffers are bound). We just need to do it for depth as well.

    Not observed to fix anything.

    Reviewed-by: Jason Ekstrand <[email protected]>
* glx: GLX_MESA_multithread_makecurrent is direct-only | Adam Jackson | 2018-08-09 | 1 | -1/+1
    This extension is not defined for indirect contexts. Marking it as "client only", as the old code did here, would make the extension available in indirect contexts, even though the server would certainly not have it in its extension list.

    Cc: <[email protected]>
    Signed-off-by: Adam Jackson <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* anv: set error in all failure paths | Eric Engestrom | 2018-08-09 | 1 | -1/+3
    Cc: Jason Ekstrand <[email protected]>
    Fixes: 5b196f39bddc689742d3 "anv/pipeline: Compile to NIR in compile_graphics"
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/tools: add missing variable initialisation | Eric Engestrom | 2018-08-09 | 1 | -1/+1
    Fixes: 6a60beba4089315685b8 "intel/tools: Add an error state to aub translator"
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* drirc: Allow extension midshader for Metro Redux | vadym.shovkoplias | 2018-08-09 | 1 | -0/+4
    This fixes both Metro 2033 Redux and Metro Last Light Redux.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99730
    Signed-off-by: Eero Tamminen <[email protected]>
    Signed-off-by: Vadym Shovkoplias <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
* glsl: handle error case with ast_post_inc, ast_post_dec | Tapani Pälli | 2018-08-09 | 1 | -0/+5
    Return ir_rvalue::error_value with ast_post_inc, ast_post_dec if a parser error was emitted previously. This way process_array_size won't see bogus IR generated like with commit 9c676a64273.

    Signed-off-by: Tapani Pälli <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98699
    Reviewed-by: Iago Toral Quiroga <[email protected]>
* vc4: Implement texture_subdata() to directly upload tiled data. | Eric Anholt | 2018-08-08 | 1 | -1/+39
    This avoids a memcpy into a temporary in the upload path.

    Improves x11perf -putimage100 performance by 12.1586% +/- 1.38155% (n=145)
* vc4: Handle partial loads/stores of tiled textures. | Eric Anholt | 2018-08-08 | 3 | -60/+155
    Previously, we would load out the tile-aligned area, update the raster copy, and store it back. This was a huge cost for XPutImage calls to the screen under glamor.

    Instead, implement a general load/store path that walks over the source x/y writing into the corresponding pixel of the destination (using clever math from https://fgiesen.wordpress.com/2011/01/17/texture-tiling-and-swizzling/). If things are aligned, we go through the previous utile-at-a-time loop.

    Improves x11perf -putimage10 performance by 139.777% +/- 2.83464% (n=5)
    Improves x11perf -putimage100 performance by 383.908% +/- 22.6297% (n=11)
    Improves x11perf -getimage10 performance by 2.75731% +/- 0.585054% (n=145)
* vc4: Compile the LT image helper per cpp we might load/store. | Eric Anholt | 2018-08-08 | 1 | -2/+31
    For the partial load/store support I'm about to add, we want the memcpy to be compiled out to a single load/store. This should also eliminate the calls to vc4_utile_width/height().

    Improves x11perf -putimage100 performance by 3.76344% +/- 1.16978% (n=15)