Commit message | Author | Age | Files | Lines
* nir/algebraic: Simplify comparison with sequential integers starting with 0 | Ian Romanick | 2019-02-15 | 1 | -0/+5
  All of the affected shaders are Unreal4 demos.

  All Gen6+ platforms had similar results. (Skylake shown)
  total instructions in shared programs: 15437170 -> 15437001 (<.01%)
  instructions in affected programs: 21536 -> 21367 (-0.78%)
  helped: 43 HURT: 0
  helped stats (abs) min: 1 max: 4 x̄: 3.93 x̃: 4
  helped stats (rel) min: 0.68% max: 1.01% x̄: 0.80% x̃: 0.80%
  95% mean confidence interval for instructions value: -4.07 -3.79
  95% mean confidence interval for instructions %-change: -0.83% -0.77%
  Instructions are helped.

  total cycles in shared programs: 383007896 -> 383007378 (<.01%)
  cycles in affected programs: 158640 -> 158122 (-0.33%)
  helped: 38 HURT: 4
  helped stats (abs) min: 1 max: 48 x̄: 13.89 x̃: 6
  helped stats (rel) min: 0.03% max: 1.01% x̄: 0.33% x̃: 0.19%
  HURT stats (abs) min: 2 max: 3 x̄: 2.50 x̃: 2
  HURT stats (rel) min: 0.06% max: 0.09% x̄: 0.08% x̃: 0.08%
  95% mean confidence interval for cycles value: -16.90 -7.77
  95% mean confidence interval for cycles %-change: -0.39% -0.19%
  Cycles are helped.

  Iron Lake and GM45 had similar results. (Iron Lake shown)
  total instructions in shared programs: 8213746 -> 8213745 (<.01%)
  instructions in affected programs: 127 -> 126 (-0.79%)
  helped: 1 HURT: 0

  total cycles in shared programs: 187734146 -> 187734144 (<.01%)
  cycles in affected programs: 2132 -> 2130 (-0.09%)
  helped: 1 HURT: 0

  Reviewed-by: Jason Ekstrand <[email protected]>
* nir/algebraic: Convert some f2u to f2i | Ian Romanick | 2019-02-15 | 1 | -0/+13
  Section 5.4.1 (Conversion and Scalar Constructors) of the GLSL 4.60 spec says:

      It is undefined to convert a negative floating-point value to an uint.

  Assuming that (uint)some_float behaves like (uint)(int)some_float allows some optimizations in the i965 backend to proceed. This basically undoes the small amount of damage done by "intel/compiler: Avoid propagating inequality cmods if types are different".

  v2: Replicate part of the commit message as a comment in the code. Suggested by Jason.

  shader-db results comparing *before* "intel/compiler: Avoid propagating inequality cmods if types are different" and after this commit:

  Skylake
  total cycles in shared programs: 383007996 -> 383007896 (<.01%)
  cycles in affected programs: 85208 -> 85108 (-0.12%)
  helped: 13 HURT: 8
  helped stats (abs) min: 2 max: 26 x̄: 10.77 x̃: 6
  helped stats (rel) min: 0.09% max: 0.65% x̄: 0.28% x̃: 0.14%
  HURT stats (abs) min: 2 max: 12 x̄: 5.00 x̃: 3
  HURT stats (rel) min: 0.04% max: 0.32% x̄: 0.12% x̃: 0.07%
  95% mean confidence interval for cycles value: -9.31 -0.21
  95% mean confidence interval for cycles %-change: -0.24% <.01%
  Cycles are helped.

  Broadwell
  total cycles in shared programs: 415251194 -> 415251370 (<.01%)
  cycles in affected programs: 83750 -> 83926 (0.21%)
  helped: 7 HURT: 13
  helped stats (abs) min: 10 max: 12 x̄: 11.43 x̃: 12
  helped stats (rel) min: 0.30% max: 0.30% x̄: 0.30% x̃: 0.30%
  HURT stats (abs) min: 2 max: 36 x̄: 19.69 x̃: 22
  HURT stats (rel) min: 0.05% max: 0.89% x̄: 0.44% x̃: 0.47%
  95% mean confidence interval for cycles value: 0.76 16.84
  95% mean confidence interval for cycles %-change: <.01% 0.37%
  Inconclusive result (%-change mean confidence interval includes 0).

  Haswell
  total instructions in shared programs: 13823885 -> 13823886 (<.01%)
  instructions in affected programs: 2249 -> 2250 (0.04%)
  helped: 0 HURT: 1

  total cycles in shared programs: 390094243 -> 390094001 (<.01%)
  cycles in affected programs: 85640 -> 85398 (-0.28%)
  helped: 15 HURT: 6
  helped stats (abs) min: 4 max: 26 x̄: 18.53 x̃: 18
  helped stats (rel) min: 0.09% max: 0.66% x̄: 0.47% x̃: 0.42%
  HURT stats (abs) min: 2 max: 14 x̄: 6.00 x̃: 2
  HURT stats (rel) min: 0.04% max: 0.37% x̄: 0.15% x̃: 0.04%
  95% mean confidence interval for cycles value: -17.36 -5.69
  95% mean confidence interval for cycles %-change: -0.44% -0.14%
  Cycles are helped.

  Ivy Bridge
  total cycles in shared programs: 180986448 -> 180986552 (<.01%)
  cycles in affected programs: 34835 -> 34939 (0.30%)
  helped: 0 HURT: 10
  HURT stats (abs) min: 2 max: 18 x̄: 10.40 x̃: 10
  HURT stats (rel) min: 0.06% max: 0.36% x̄: 0.28% x̃: 0.30%
  95% mean confidence interval for cycles value: 4.67 16.13
  95% mean confidence interval for cycles %-change: 0.20% 0.35%
  Cycles are HURT.

  Sandy Bridge
  total cycles in shared programs: 154603969 -> 154603970 (<.01%)
  cycles in affected programs: 171514 -> 171515 (<.01%)
  helped: 25 HURT: 14
  helped stats (abs) min: 1 max: 4 x̄: 1.80 x̃: 1
  helped stats (rel) min: 0.02% max: 0.10% x̄: 0.04% x̃: 0.04%
  HURT stats (abs) min: 1 max: 8 x̄: 3.29 x̃: 3
  HURT stats (rel) min: 0.03% max: 0.28% x̄: 0.10% x̃: 0.11%
  95% mean confidence interval for cycles value: -0.91 0.96
  95% mean confidence interval for cycles %-change: -0.02% 0.04%
  Inconclusive result (value mean confidence interval includes 0).

  No changes on Iron Lake or GM45.

  Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler/test: Add unit test for mismatched signedness comparison | Matt Turner | 2019-02-15 | 1 | -0/+32
  v2 (idr): Move adding the test to after adding the fix. Reordering the two commits prevents possible headaches for git-bisect with scripts that always do 'ninja check'.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109404
  Reviewed-by: Ian Romanick <[email protected]>
* intel/compiler: Avoid propagating inequality cmods if types are different | Matt Turner | 2019-02-15 | 1 | -0/+7
  v2: Fix silly bug in logic. s/||/&&/

  All but one of the affected shaders is in an Unreal4 demo. The other is in Tomb Raider. All of the cases that Ian investigated appear to be sequences like the following:

      if (int(uint(some_float)) < 0) /* other relations too */
          ...

  At least in Tomb Raider, it's not obvious that this sequence came from the original shader. In some of the Unreal demos, the shader contains code like

      if (int(uint(textureLod(...))) > 0)
          ...

  which explicitly generates the offending sequence.

  All Gen6+ platforms had similar results (Skylake shown):
  total instructions in shared programs: 15437170 -> 15437187 (<.01%)
  instructions in affected programs: 4492 -> 4509 (0.38%)
  helped: 0 HURT: 17
  HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
  HURT stats (rel) min: 0.05% max: 0.73% x̄: 0.66% x̃: 0.73%
  95% mean confidence interval for instructions value: 1.00 1.00
  95% mean confidence interval for instructions %-change: 0.57% 0.75%
  Instructions are HURT.

  total cycles in shared programs: 383007996 -> 383007992 (<.01%)
  cycles in affected programs: 20542 -> 20538 (-0.02%)
  helped: 6 HURT: 7
  helped stats (abs) min: 2 max: 6 x̄: 5.33 x̃: 6
  helped stats (rel) min: 0.11% max: 0.36% x̄: 0.32% x̃: 0.36%
  HURT stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
  HURT stats (rel) min: 0.27% max: 0.27% x̄: 0.27% x̃: 0.27%
  95% mean confidence interval for cycles value: -3.30 2.69
  95% mean confidence interval for cycles %-change: -0.19% 0.19%
  Inconclusive result (value mean confidence interval includes 0).

  No changes on Iron Lake or GM45.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109404
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Tested-by: [email protected]
  Tested-by: Danylo Piliaiev <[email protected]>
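Illustrative note (not part of the commit): the sketch below shows why an inequality evaluated on one signedness cannot simply be reused for the other. It only models 32-bit reinterpretation in Python; the helper names are hypothetical and nothing here is the i965 backend's actual code.

```python
import struct

def as_signed32(bits):
    """Reinterpret a 32-bit pattern as a two's complement signed integer."""
    return struct.unpack("<i", struct.pack("<I", bits & 0xFFFFFFFF))[0]

def as_unsigned32(bits):
    """Reinterpret a 32-bit pattern as an unsigned integer."""
    return bits & 0xFFFFFFFF

# The same register contents, viewed through two different types.
bits = 0x80000000

# "x < 0" depends on the type: true for the signed view,
# never true for the unsigned view.
signed_lt_zero = as_signed32(bits) < 0
unsigned_lt_zero = as_unsigned32(bits) < 0
```

Because the two views disagree for any pattern with the top bit set, a conditional modifier computed for the unsigned instruction is not a valid stand-in for the signed comparison.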
* intel/compiler/test: Set devinfo->gen = 7 | Matt Turner | 2019-02-15 | 1 | -1/+1
  We emit an FBL instruction, which has only existed since Gen7. This prevents the test from segfaulting when run with TEST_DEBUG=1.

  Reviewed-by: Ian Romanick <[email protected]>
* gallium/auxiliary/vl: Add video compositor compute shader render | James Zhu | 2019-02-15 | 2 | -28/+83
  Add compute shader initialization, assignment and cleanup to the vl_compositor API. Set the video compositor compute shader render as the default when the pipe supports it.

  Signed-off-by: James Zhu <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* gallium/auxiliary/vl: Add compute shader to support video compositor render | James Zhu | 2019-02-15 | 5 | -0/+469
  Add compute shader to support video compositor render.

  Signed-off-by: James Zhu <[email protected]>
  Acked-by: Christian König <[email protected]>
* gallium/auxiliary/vl: Rename csc_matrix and increase its size. | James Zhu | 2019-02-15 | 3 | -7/+7
  Rename csc_matrix to shader_params, and increase shader_params size to store more constants for the compute shader.

  Signed-off-by: James Zhu <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* gallium/auxiliary/vl: Split vl_compositor graphic shaders from vl_compositor API | James Zhu | 2019-02-15 | 5 | -688/+821
  Split vl_compositor graphic shaders from vl_compositor API in order to share vl_compositor API with vl_compositor compute shader later.

  Signed-off-by: James Zhu <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* gallium/auxiliary/vl: Move dirty define to header file | James Zhu | 2019-02-15 | 2 | -9/+8
  Move dirty define to header file to share with compute shader.

  Signed-off-by: James Zhu <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* nir: remove jump from two merging jump-ending blocks | Juan A. Suarez Romero | 2019-02-15 | 1 | -2/+19
  In the opt_peel_initial_if optimization, when moving the continue list to the end of the continue block, before the jump, it could happen that the continue list itself also ends with a jump.

  This would mean that we would have two jump instructions in a row: the first one from the continue list and the second one from the continue block.

  As inserting an instruction after a jump is not allowed (and it does not make sense, as it will not be executed), remove the jump from the continue block and keep the one from the continue list, as it will be executed first.

  CC: Jason Ekstrand <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
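Illustrative note (not part of the commit): the rule described above can be sketched on plain Python lists. The tuple-based instruction model and the function name are assumptions made for this sketch only; NIR's real data structures and the opt_peel_initial_if code look different.

```python
def is_jump(instr):
    """Instructions are modeled as (kind, name) tuples in this sketch."""
    return instr[0] == "jump"

def move_continue_list_into_block(block, continue_list):
    """Place continue_list at the end of block, before block's trailing jump.

    If continue_list itself ends with a jump, the block's own jump is
    dropped so that no instruction ever follows a jump.
    """
    block_jump = block[-1] if block and is_jump(block[-1]) else None
    body = block[:-1] if block_jump else list(block)

    merged = body + list(continue_list)
    list_ends_with_jump = bool(continue_list) and is_jump(continue_list[-1])
    if block_jump and not list_ends_with_jump:
        merged.append(block_jump)
    return merged
```

With `block = [("alu", "a"), ("jump", "continue")]` and a continue list ending in `("jump", "break")`, the merged block keeps only the `break`, which is the jump that would execute first.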
* nir: move ALU instruction before the jump instruction | Juan A. Suarez Romero | 2019-02-15 | 1 | -1/+1
  opt_split_alu_of_phi moves the ALU instruction to the end of the continue block. But if the continue block ends with a jump instruction (an explicit "continue" instruction), then the ALU must be inserted before the jump, as it is illegal to add instructions after the jump.

  CC: Ian Romanick <[email protected]>
  Fixes: 0881e90c099 ("nir: Split ALU instructions in loops that read phis")
  Reviewed-by: Ian Romanick <[email protected]>
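Illustrative note (not part of the commit): a minimal sketch of "insert at the end of the block, but before a trailing jump". Instructions are modeled as hypothetical (kind, name) tuples; this is not NIR's cursor API.

```python
def insert_at_block_end(block, instr):
    """Insert instr as the last non-jump instruction of block (in place)."""
    if block and block[-1][0] == "jump":
        # Nothing may follow a jump, so slot the instruction in before it.
        block.insert(len(block) - 1, instr)
    else:
        block.append(instr)
    return block
```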
* mesa: INVALID_VALUE for wrong type or format in Clear*Buffer*Data | Andres Gomez | 2019-02-15 | 1 | -4/+6
  Instead of generating a GL_INVALID_ENUM error when the type or format is incorrect while using glClear{Named}Buffer{Sub}Data, generate GL_INVALID_VALUE.

  From page 72 (page 94 of the PDF) of the OpenGL 4.6 spec:

      "An INVALID_VALUE error is generated if type is not one of the types in table 8.2.

       An INVALID_VALUE error is generated if format is not one of the formats in table 8.3."

  Fixes the following test:
  KHR-GL45.direct_state_access.buffers_errors

  v2: correct the doxygen documentation.

  Cc: Pi Tabred <[email protected]>
  Cc: Brian Paul <[email protected]>
  Signed-off-by: Andres Gomez <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
* virgl: use virgl_transfer_inline_write even less | Gurchetan Singh | 2019-02-15 | 1 | -1/+1
  We've noticed the Team Fortress 2 engine seems to make many small calls to glBufferSubData(..). Let's pick our heuristic based on the resource base width, not the size of a particular upload. This will cause transfers to be batched together in the transfer queue.

  Relevant glbench microbenchmark --

  Before: buffer_upload_dynamic_element_array_131072 = 131.17 mbytes_sec
  After: buffer_upload_dynamic_element_array_131072 = 6828.24 mbytes_sec

  Reviewed-by: Gert Wollny <[email protected]>
* virgl: use transfer queue | Gurchetan Singh | 2019-02-15 | 5 | -18/+36
  This improves the Unigine Valley benchmark by 3 to 10 fps (depending on the scene). It also improves the Team Fortress 2 benchmark from 6 fps to 13 fps (host: 20 fps).

  Reviewed-by: Gert Wollny <[email protected]>
* virgl: introduce transfer queue | Gurchetan Singh | 2019-02-15 | 5 | -0/+390
  Transfers will be placed here at unmap time instead of incurring a VM exit. There's an attempt to deduplicate intersecting 1D transfers, which are surprisingly common.

  This can also help with mipmapped texture upload and smaller textures, where the majority of the time is spent in the guest kernel / QEMU -- not virglrenderer. This is shown by the GLbench texture upload benchmark:

  Before: texture_upload_rgba_teximage2d_32 = 64.23 mtexel_sec
  After: texture_upload_rgba_teximage2d_32 = 367.44 mtexel_sec

  v2: Split up list iteration functions (@gerddie)
  v3: Support for optimizing glBufferSubData

  Reviewed-by: Gert Wollny <[email protected]>
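Illustrative note (not part of the commit): one way to picture deduplicating intersecting 1D transfers is to merge overlapping byte ranges as they are queued. The half-open (start, end) representation and the function name are assumptions for this sketch; the commit message does not describe how virgl's queue actually stores or merges transfers.

```python
def queue_transfer(queue, start, end):
    """Queue the half-open byte range [start, end) for one buffer.

    Any queued range that intersects the new one is folded into it, so the
    queue stays a list of pairwise-disjoint ranges.
    """
    merged_start, merged_end = start, end
    remaining = []
    for s, e in queue:
        if s < merged_end and merged_start < e:   # ranges intersect
            merged_start = min(merged_start, s)
            merged_end = max(merged_end, e)
        else:
            remaining.append((s, e))
    remaining.append((merged_start, merged_end))
    return remaining
```

Queuing (0, 4), then (10, 14), then (3, 11) leaves a single range (0, 14), so one transfer is submitted instead of three.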
* virgl: add encoder functions for new protocol | Gurchetan Singh | 2019-02-15 | 2 | -0/+28
  Let's encode the new protocol with new helper functions.

  Reviewed-by: Gert Wollny <[email protected]>
* virgl: make winsys modifications for encoded transfers | Gurchetan Singh | 2019-02-15 | 5 | -6/+21
  The idea is to have two command buffers:

  1) One for transfers
  2) One for commands, which can include transfers

  At flush time, (2) will be filled. Otherwise, (1) will be used to submit transfers if there are enough of them.

  v2: Pass size directly to cmd_buf_create (@gerddie)

  Reviewed-by: Gert Wollny <[email protected]>
* virgl: add extra checks in virgl_res_needs_flush_wait | Gurchetan Singh | 2019-02-15 | 1 | -4/+9
  This is motivated by the following scenario:

      glBufferSubData(GL_ARRAY_BUFFER, ...)
      glFlush(..)
      glBufferSubData(GL_ARRAY_BUFFER, ...)
      glBufferSubData(GL_ARRAY_BUFFER, ...)
      glBufferSubData(GL_ARRAY_BUFFER, ...)

  This increases @davidriley's Team Fortress 2 apitrace from 1 fps to 6 fps and helps with the Chromium glbench microbenchmarks:

  Before:
  texture_update_rgba_texsubimage2d_2048 = 554.96 mtexel_sec
  buffer_upload_dynamic_array_12 = 0.02 mbytes_sec
  buffer_upload_dynamic_array_576 = 1.07 mbytes_sec

  After:
  texture_update_rgba_texsubimage2d_2048 = 612.29 mtexel_sec
  buffer_upload_dynamic_array_12 = 2.22 mbytes_sec
  buffer_upload_dynamic_array_576 = 164.89 mbytes_sec

  Reviewed-by: Gert Wollny <[email protected]>
* virgl: pass virgl transfer to virgl_res_needs_flush_wait | Gurchetan Singh | 2019-02-15 | 5 | -14/+22
  Reviewed-by: Gert Wollny <[email protected]>
* virgl: keep track of number of computations | Gurchetan Singh | 2019-02-15 | 2 | -3/+3
  It's good to keep track of these things.

  Reviewed-by: Gert Wollny <[email protected]>
* virgl: limit command length to 16 bits | Gurchetan Singh | 2019-02-15 | 2 | -5/+8
  Much of our logic is based around the idea that the upper 16 bits of a command dword can encode the length of the command. Now that the command buffer is >= 2^16 - 1, we should check for this.

  v2: alignment, and only check VIRGL_ENCODE_MAX_DWORDS

  Reviewed-by: Gert Wollny <[email protected]>
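Illustrative note (not part of the commit): the constraint above can be sketched as a header dword whose upper 16 bits hold the length. The commit only states that the upper 16 bits encode the length; treating the lower 16 bits as a single opcode field, and the names below, are assumptions of this sketch rather than the virgl protocol definition.

```python
MAX_LENGTH_DWORDS = 0xFFFF  # largest value that fits in 16 bits

def encode_header(opcode, length_dwords):
    """Pack a command header: length in bits 31..16, opcode in bits 15..0."""
    if not 0 <= length_dwords <= MAX_LENGTH_DWORDS:
        raise ValueError("command length does not fit in 16 bits")
    if not 0 <= opcode <= 0xFFFF:
        raise ValueError("opcode does not fit in the lower 16 bits")
    return (length_dwords << 16) | opcode

def decode_length(header):
    """Recover the command length from a header dword."""
    return (header >> 16) & 0xFFFF
```

Without the range check, a length of 0x10000 would spill past bit 31 and be lost by any consumer that reads a 32-bit header.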
* virgl: use virgl_transfer in inline write | Gurchetan Singh | 2019-02-15 | 1 | -26/+40
  Let's define a helper function and use it. This commit also allows resources to be emitted into different command buffers.

  Like the ioctls, send 0 for layer_stride and stride. If we actually send the real values, there are various assumptions in virglrenderer for non-1D buffers that may need to be modified.

  Reviewed-by: Gert Wollny <[email protected]>
* virgl: add protocol for resource transfers | Gurchetan Singh | 2019-02-15 | 2 | -0/+12
  Mostly similar to VIRGL_CCMD_RESOURCE_INLINE_WRITE. However, this uses the resource's already attached iovecs rather than the command buffer to transfer the data.

  v2: Used (1 << 16) not (1 << 15) [@gerddie]

  Reviewed-by: Gert Wollny <[email protected]>
* virgl: when creating / freeing transfers, pass slab pool directly | Gurchetan Singh | 2019-02-15 | 4 | -14/+14
  This will allow us to destroy transfers w/o having a pointer to the context.

  Reviewed-by: Gert Wollny <[email protected]>
* virgl: unmap uploader at flush time | Gurchetan Singh | 2019-02-15 | 1 | -2/+3
  This should save some memory when allocating and freeing transfers.

  Reviewed-by: Gert Wollny <[email protected]>
* virgl: make alignment smaller when uploading index user buffers | Gurchetan Singh | 2019-02-15 | 1 | -1/+1
  Since we're just uploading to guest memory, let's just align to dword size.

  Fixes: e0f932 ("u_upload_mgr: pass alignment to u_upload_data manually")
  Reviewed-by: Gert Wollny <[email protected]>
* virgl: track level cleanliness rather than resource cleanliness | Gurchetan Singh | 2019-02-15 | 6 | -14/+20
  This allows a minor optimization for texture upload.

  Reviewed-by: Gert Wollny <[email protected]>
* virgl: don't mark unclean after a flush | Gurchetan Singh | 2019-02-15 | 1 | -1/+0
  The guest memory is still clean until host GL touches it, which we should track elsewhere.

  Reviewed-by: Gert Wollny <[email protected]>
* virgl: use virgl_resource_dirty helper | Gurchetan Singh | 2019-02-15 | 6 | -16/+19
  Reviewed-by: Gert Wollny <[email protected]>
* virgl: add ability to do finer grain dirty tracking | Gurchetan Singh | 2019-02-15 | 8 | -13/+15
  There are levels to cleanliness.

  Reviewed-by: Gert Wollny <[email protected]>
* panfrost: Improve logging and patch memory leaks | Alyssa Rosenzweig | 2019-02-15 | 2 | -49/+48
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Don't align framebuffer dims | Alyssa Rosenzweig | 2019-02-15 | 1 | -2/+2
  Fixes regressions with EGL clients

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Implement PIPE_QUERY_OCCLUSION_COUNTER | Alyssa Rosenzweig | 2019-02-15 | 1 | -1/+8
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Identify MALI_OCCLUSION_PRECISE bit | Alyssa Rosenzweig | 2019-02-15 | 2 | -5/+7
  Setting this is required for desktop-style occlusion queries.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* drirc/i965: add option to disable 565 configs and visuals | Tapani Pälli | 2019-02-15 | 2 | -0/+18
  We have cases where we would not like to expose these.

  v2: call the option allow_rgb565_configs for consistency with existing allow_rgb10_configs (Eric, Jason)

  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* panfrost: Backport driver to Mali T600/T700 | Alyssa Rosenzweig | 2019-02-15 | 6 | -262/+340
  There are a few differences between Mali T860 (Panfrost's primary reference target) and the older Midgard generations (T600/T700):

  - Miscellaneous different magic numbers. It's not clear what these numbers mean on either the old or new configurations yet.

  - Errata fixes. T800 is the final Midgard generation and presumably the least buggy. Older Midgard has some extra hardware errata we have to work around.

  - SFBD vs MFBD split. Essentially, older Midgard use a Single FrameBuffer Descriptor (SFBD), which corresponds to single render-target rendering. Newer Midgard (T760+) use a Multiple FrameBuffer Descriptor (MFBD), allowing multiple RTs. On ES 2.0, these descriptors serve the same function, but we implement both, depending on the version of the hardware.

  - CPU bitness. 32-bit systems generally use 32-bit GPU descriptors, and vice versa for 64-bit. Our target T760 systems are 32-bit whereas our target T860 systems are 64-bit. More work is needed in this area.

  This patch fixes support in these areas for supporting older Midgard hardware. It is tested on Mali T760 and Mali T860.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Fix build; depend on libdrm | Alyssa Rosenzweig | 2019-02-15 | 1 | -0/+1
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* nir/dead_cf: Stop relying on liveness analysis | Jason Ekstrand | 2019-02-14 | 1 | -21/+39
  The liveness analysis pass is fairly expensive because it has to build large bit-sets and run a fix-point algorithm on them. Instead of requiring liveness for detecting if values escape a CF node, just take advantage of the structured nature of NIR and use block indices instead. This only requires the block index metadata, which is the fastest metadata we have to generate.

  No shader-db changes on Kaby Lake

  Reviewed-by: Timothy Arceri <[email protected]>
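Illustrative note (not part of the commit): a simplified picture of an escape check based on block indices. It assumes blocks are numbered in program order and that a control-flow node covers a contiguous index range; the function name and data layout are hypothetical, and the real pass has more cases to consider than this sketch does.

```python
def value_escapes_node(use_block_indices, node_first_index, node_last_index):
    """Return True if any use of a value lies outside the node's block range.

    use_block_indices: indices of the blocks containing uses of a value
    that is defined inside the node.
    """
    return any(
        index < node_first_index or index > node_last_index
        for index in use_block_indices
    )
```

The check is a handful of integer comparisons per use, which is why it needs only block indices rather than a full liveness computation.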
* nir/dead_cf: Inline cf_node_has_side_effects | Jason Ekstrand | 2019-02-14 | 1 | -41/+32
  We want to handle live SSA values differently and it's going to involve walking the instructions. We can make it a single instruction walk if we combine it with cf_node_has_side_effects.

  Reviewed-by: Timothy Arceri <[email protected]>
* intel/fs: Bail in optimize_extract_to_float if we have modifiers | Jason Ekstrand | 2019-02-14 | 1 | -0/+9
  This fixes a bug in RuneScape where we were optimizing x >> 16 to an extract and then negating and converting to float. The NIR to fs pass was dropping the negate on the floor, breaking a geometry shader and causing it to render nothing.

  Fixes: 1f862e923cb "i965/fs: Optimize float conversions of byte/word..."
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109601
  Tested-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* swr: set PIPE_CAP_MAX_VARYINGS correctly | Ilia Mirkin | 2019-02-14 | 1 | -0/+2
  Unfortunately swr was missed in the original commit. The number of varyings should generally match up to what's reported as the shader caps for fragment inputs.

  Fixes: 6010d7b8e8be (gallium: add PIPE_CAP_MAX_VARYINGS)
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Alok Hota <[email protected]>
  Cc: 19.0 <[email protected]>
* intel/fs: Silence a compiler warning | Jason Ekstrand | 2019-02-14 | 1 | -2/+1
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Silence some compiler warnings in release builds | Jason Ekstrand | 2019-02-14 | 2 | -4/+4
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv/blorp: Delete a pointless assert | Jason Ekstrand | 2019-02-14 | 1 | -5/+0
  Just a little higher up in the function we assert that the aspect masks are actually equal so there's no reason for the weaker check. Also, the temporary variables were causing compiler warnings in release builds.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Silence a couple of warnings in release builds | Jason Ekstrand | 2019-02-14 | 2 | -1/+3
  [28/716] Compiling C object 'src/compiler/nir/068b2c8@@nir@sta/nir_gather_xfb_info.c.o'.
  ../src/compiler/nir/nir_gather_xfb_info.c: In function ‘nir_gather_xfb_info’:
  ../src/compiler/nir/nir_gather_xfb_info.c:171:13: warning: variable ‘max_offset’ set but not used [-Wunused-but-set-variable]
      unsigned max_offset[NIR_MAX_XFB_BUFFERS] = {0};
               ^~~~~~~~~~
  [36/716] Compiling C object 'src/compiler/nir/068b2c8@@nir@sta/nir_instr_set.c.o'.
  ../src/compiler/nir/nir_instr_set.c:502:1: warning: ‘instr_each_src_and_dest_is_ssa’ defined but not used [-Wunused-function]
   instr_each_src_and_dest_is_ssa(nir_instr *instr)
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* spirv: Eliminate dead input/output variables after translation. | Kenneth Graunke | 2019-02-14 | 1 | -5/+20
  spirv_to_nir can generate input/output variables which are illegal for the current shader stage, which would cause nir_validate_shader to balk.

  After my recent commit to start decorating arrays as compact, dEQP-VK.spirv_assembly.instruction.graphics.module.same_module started hitting validation errors due to outputs in a TCS (not intended for the TCS at all) not being per-vertex arrays.

  Thanks to Jason Ekstrand for suggesting this approach.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109573
  Fixes: ef99f4c8d17 compiler: Mark clip/cull distance arrays as compact before lowering.
  Reviewed-by: Juan A. Suarez <[email protected]>
* anv: Put MOCS in the correct location | Kenneth Graunke | 2019-02-14 | 1 | -2/+2
  My patch to switch from struct-based MOCS to numeric MOCS accidentally divided all MOCS entries by 2 in the Vulkan driver.

  MOCS on Gen9+ is just an array index into a table. But in the hardware packets, the index starts at bit 1. So we need to shift it.

  Fixes: 0b44644ca68 (genxml: Consistently use a numeric "MOCS" field)
  Reviewed-by: Jason Ekstrand <[email protected]>
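Illustrative note (not part of the commit): the arithmetic behind "divided all MOCS entries by 2". The two helper names are hypothetical; the only fact taken from the message is that the table index starts at bit 1 of the packet field.

```python
def pack_mocs_field(table_index):
    """Place the table index at bit 1 of the field, leaving bit 0 clear."""
    return table_index << 1

def index_seen_by_hardware(field_value):
    """The hardware reads the index starting at bit 1 of the field."""
    return field_value >> 1

# Writing the raw index without the shift halves what the hardware sees.
unshifted = index_seen_by_hardware(2)
shifted = index_seen_by_hardware(pack_mocs_field(2))
```

Here `unshifted` selects table entry 1 while `shifted` selects the intended entry 2.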
* spirv: Add missing break | Ian Romanick | 2019-02-14 | 1 | -0/+1
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Fixes: c6465fec0c5 ("spirv: add SpvCapabilityInt64Atomics")
  CID: 1442555
* util/tests: compile to something sensible in release builds | Eric Engestrom | 2019-02-14 | 12 | -0/+24
  assert()-based tests make no sense without asserts, so make sure asserts are compiled in, even if the rest of the code has asserts turned off.

  Signed-off-by: Eric Engestrom <[email protected]>
  Acked-by: Lionel Landwerlin <[email protected]>