Commit message | Author | Age | Files | Lines
* intel/vec4: Try to emit immediate sources for MOV | Ian Romanick | 2019-07-11 | 1 | -4/+14
    Per the comment in vec4_visitor::nir_emit_load_const, further
    improvement is possible in this area. That case would be more
    complicated as I think we'd want to check that all users of the
    nir_load_const_instr result intended to use the value as float.

    No shader-db changes on any Gen8+ platform as these platforms do not
    use the vec4 backend.

    v2: Massive rebase on eeebeb211f1 ("intel/vec4: Try emitting
    non-scalar immediates"). This commit is about twice as helpful since
    b04beaf41d2 ("intel/vec4: Try both sources as candidates for being
    immediates").

    Haswell and Ivy Bridge had similar results. (Haswell shown)
    total instructions in shared programs: 13478598 -> 13474068 (-0.03%)
    instructions in affected programs: 589452 -> 584922 (-0.77%)
    helped: 2773
    HURT: 0
    helped stats (abs) min: 1 max: 7 x̄: 1.63 x̃: 1
    helped stats (rel) min: 0.16% max: 5.66% x̄: 0.96% x̃: 0.83%
    95% mean confidence interval for instructions value: -1.67 -1.60
    95% mean confidence interval for instructions %-change: -0.98% -0.94%
    Instructions are helped.

    total cycles in shared programs: 376386916 -> 376369392 (<.01%)
    cycles in affected programs: 16871628 -> 16854104 (-0.10%)
    helped: 2293
    HURT: 523
    helped stats (abs) min: 2 max: 812 x̄: 13.80 x̃: 2
    helped stats (rel) min: <.01% max: 10.18% x̄: 1.02% x̃: 0.36%
    HURT stats (abs) min: 2 max: 316 x̄: 26.99 x̃: 14
    HURT stats (rel) min: <.01% max: 19.34% x̄: 2.15% x̃: 1.43%
    95% mean confidence interval for cycles value: -7.87 -4.58
    95% mean confidence interval for cycles %-change: -0.52% -0.34%
    Cycles are helped.

    Sandy Bridge
    total instructions in shared programs: 10860328 -> 10857675 (-0.02%)
    instructions in affected programs: 335907 -> 333254 (-0.79%)
    helped: 1639
    HURT: 0
    helped stats (abs) min: 1 max: 5 x̄: 1.62 x̃: 1
    helped stats (rel) min: 0.10% max: 5.26% x̄: 0.86% x̃: 0.70%
    95% mean confidence interval for instructions value: -1.67 -1.57
    95% mean confidence interval for instructions %-change: -0.89% -0.84%
    Instructions are helped.

    total cycles in shared programs: 153942720 -> 153934120 (<.01%)
    cycles in affected programs: 5604818 -> 5596218 (-0.15%)
    helped: 1494
    HURT: 97
    helped stats (abs) min: 2 max: 256 x̄: 7.84 x̃: 2
    helped stats (rel) min: 0.01% max: 6.62% x̄: 0.35% x̃: 0.18%
    HURT stats (abs) min: 2 max: 160 x̄: 32.02 x̃: 20
    HURT stats (rel) min: 0.02% max: 3.37% x̄: 0.88% x̃: 0.56%
    95% mean confidence interval for cycles value: -6.45 -4.36
    95% mean confidence interval for cycles %-change: -0.32% -0.23%
    Cycles are helped.

    Iron Lake and GM45 had similar results. (Iron Lake shown)
    total instructions in shared programs: 8139378 -> 8137267 (-0.03%)
    instructions in affected programs: 265616 -> 263505 (-0.79%)
    helped: 1148
    HURT: 0
    helped stats (abs) min: 1 max: 5 x̄: 1.84 x̃: 1
    helped stats (rel) min: 0.22% max: 4.76% x̄: 0.87% x̃: 0.62%
    95% mean confidence interval for instructions value: -1.90 -1.78
    95% mean confidence interval for instructions %-change: -0.90% -0.83%
    Instructions are helped.

    total cycles in shared programs: 188541756 -> 188537540 (<.01%)
    cycles in affected programs: 9807004 -> 9802788 (-0.04%)
    helped: 1143
    HURT: 4
    helped stats (abs) min: 2 max: 10 x̄: 3.70 x̃: 2
    helped stats (rel) min: <.01% max: 3.01% x̄: 0.13% x̃: 0.06%
    HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
    HURT stats (rel) min: 0.18% max: 0.18% x̄: 0.18% x̃: 0.18%
    95% mean confidence interval for cycles value: -3.80 -3.55
    95% mean confidence interval for cycles %-change: -0.14% -0.12%
    Cycles are helped.

    Reviewed-by: Matt Turner <[email protected]>
* intel/vec4: Try to emit a VF source in try_immediate_source | Ian Romanick | 2019-07-11 | 1 | -12/+33
    This commit is also a pre-requisite for the next commit.

    No shader-db changes on any Gen8+ platform as these platforms do not
    use the vec4 backend.

    v2: Massive rebase on eeebeb211f1 ("intel/vec4: Try emitting
    non-scalar immediates"). This change is a lot less helpful since that
    commit landed (previously helped 1934 shaders on HSW) because,
    apparently, a lot of the cases helped by that commit were things like
    vector loads of { 1.0, 1.0, 1.0 } that were also helped by this
    commit.

    Haswell
    total instructions in shared programs: 13480095 -> 13478598 (-0.01%)
    instructions in affected programs: 229534 -> 228037 (-0.65%)
    helped: 1006
    HURT: 0
    helped stats (abs) min: 1 max: 7 x̄: 1.49 x̃: 1
    helped stats (rel) min: 0.04% max: 3.45% x̄: 1.11% x̃: 1.09%
    95% mean confidence interval for instructions value: -1.54 -1.43
    95% mean confidence interval for instructions %-change: -1.15% -1.07%
    Instructions are helped.

    total cycles in shared programs: 376385734 -> 376386916 (<.01%)
    cycles in affected programs: 14101380 -> 14102562 (<.01%)
    helped: 941
    HURT: 56
    helped stats (abs) min: 2 max: 322 x̄: 5.62 x̃: 2
    helped stats (rel) min: <.01% max: 7.74% x̄: 0.51% x̃: 0.42%
    HURT stats (abs) min: 2 max: 618 x̄: 115.50 x̃: 32
    HURT stats (rel) min: 0.03% max: 4.62% x̄: 0.83% x̃: 0.44%
    95% mean confidence interval for cycles value: -2.06 4.43
    95% mean confidence interval for cycles %-change: -0.47% -0.39%
    Inconclusive result (value mean confidence interval includes 0).

    Ivy Bridge
    total instructions in shared programs: 12048004 -> 12046589 (-0.01%)
    instructions in affected programs: 217072 -> 215657 (-0.65%)
    helped: 934
    HURT: 0
    helped stats (abs) min: 1 max: 7 x̄: 1.51 x̃: 1
    helped stats (rel) min: 0.04% max: 3.45% x̄: 1.14% x̃: 1.11%
    95% mean confidence interval for instructions value: -1.57 -1.46
    95% mean confidence interval for instructions %-change: -1.18% -1.10%
    Instructions are helped.

    total cycles in shared programs: 180285854 -> 180287608 (<.01%)
    cycles in affected programs: 14103824 -> 14105578 (0.01%)
    helped: 871
    HURT: 53
    helped stats (abs) min: 2 max: 322 x̄: 5.51 x̃: 2
    helped stats (rel) min: <.01% max: 7.67% x̄: 0.50% x̃: 0.42%
    HURT stats (abs) min: 2 max: 618 x̄: 123.66 x̃: 32
    HURT stats (rel) min: 0.03% max: 4.47% x̄: 0.92% x̃: 0.46%
    95% mean confidence interval for cycles value: -1.60 5.39
    95% mean confidence interval for cycles %-change: -0.46% -0.37%
    Inconclusive result (value mean confidence interval includes 0).

    Sandy Bridge
    total instructions in shared programs: 10861227 -> 10860328 (<.01%)
    instructions in affected programs: 92969 -> 92070 (-0.97%)
    helped: 624
    HURT: 0
    helped stats (abs) min: 1 max: 7 x̄: 1.44 x̃: 1
    helped stats (rel) min: 0.11% max: 3.45% x̄: 1.05% x̃: 0.95%
    95% mean confidence interval for instructions value: -1.52 -1.36
    95% mean confidence interval for instructions %-change: -1.09% -1.01%
    Instructions are helped.

    total cycles in shared programs: 153944316 -> 153942720 (<.01%)
    cycles in affected programs: 1640956 -> 1639360 (-0.10%)
    helped: 601
    HURT: 15
    helped stats (abs) min: 2 max: 120 x̄: 3.56 x̃: 2
    helped stats (rel) min: 0.02% max: 6.33% x̄: 0.18% x̃: 0.08%
    HURT stats (abs) min: 2 max: 72 x̄: 36.13 x̃: 36
    HURT stats (rel) min: 0.05% max: 3.84% x̄: 1.95% x̃: 2.00%
    95% mean confidence interval for cycles value: -3.44 -1.74
    95% mean confidence interval for cycles %-change: -0.18% -0.09%
    Cycles are helped.

    Iron Lake and GM45 had similar results. (Iron Lake shown)
    total instructions in shared programs: 8139924 -> 8139378 (<.01%)
    instructions in affected programs: 69776 -> 69230 (-0.78%)
    helped: 322
    HURT: 0
    helped stats (abs) min: 1 max: 8 x̄: 1.70 x̃: 1
    helped stats (rel) min: 0.27% max: 3.23% x̄: 0.79% x̃: 0.54%
    95% mean confidence interval for instructions value: -1.88 -1.51
    95% mean confidence interval for instructions %-change: -0.85% -0.72%
    Instructions are helped.

    total cycles in shared programs: 188542864 -> 188541756 (<.01%)
    cycles in affected programs: 3031532 -> 3030424 (-0.04%)
    helped: 320
    HURT: 0
    helped stats (abs) min: 2 max: 20 x̄: 3.46 x̃: 2
    helped stats (rel) min: <.01% max: 0.69% x̄: 0.06% x̃: 0.06%
    95% mean confidence interval for cycles value: -3.85 -3.07
    95% mean confidence interval for cycles %-change: -0.06% -0.05%
    Cycles are helped.

    Reviewed-by: Matt Turner <[email protected]>
* intel/vec4: Try to emit a single load for multiple 3-src instruction operands | Ian Romanick | 2019-07-11 | 2 | -4/+36
    If a 3-source instruction uses immediate values 1.0 and -1.0, just
    load 1.0 into a register. Use the negation source modifier to get
    -1.0.

    This has trivial impact now, but it prevents a few thousand
    regressions on vec4 platforms with "nir/algebraic: Recognize
    open-coded flrp(-1, 1, a) and flrp(1, -1, a)"

    All Gen6 and Gen7 platforms had similar results. (Haswell shown)
    total instructions in shared programs: 13487412 -> 13487406 (<.01%)
    instructions in affected programs: 541 -> 535 (-1.11%)
    helped: 6
    HURT: 0
    helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
    helped stats (rel) min: 0.36% max: 2.08% x̄: 1.65% x̃: 1.80%
    95% mean confidence interval for instructions value: -1.00 -1.00
    95% mean confidence interval for instructions %-change: -2.33% -0.97%
    Instructions are helped.

    total cycles in shared programs: 376402564 -> 376402500 (<.01%)
    cycles in affected programs: 10348 -> 10284 (-0.62%)
    helped: 10
    HURT: 1
    helped stats (abs) min: 2 max: 26 x̄: 7.00 x̃: 2
    helped stats (rel) min: 0.13% max: 2.05% x̄: 0.89% x̃: 0.79%
    HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6
    HURT stats (rel) min: 0.29% max: 0.29% x̄: 0.29% x̃: 0.29%
    95% mean confidence interval for cycles value: -11.72 0.08
    95% mean confidence interval for cycles %-change: -1.20% -0.36%
    Inconclusive result (value mean confidence interval includes 0).

    No shader-db changes on any other Intel platform.

    Reviewed-by: Matt Turner <[email protected]>
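The load-sharing idea in the commit above (load 1.0 once, get -1.0 through the negation source modifier) can be sketched roughly as follows. This is not the vec4 backend code; `struct operand`, `struct load_tracker` and `get_imm_operand` are hypothetical names made up for the illustration, and registers are reduced to plain indices.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand: index of a temporary register plus a negate modifier. */
struct operand {
   int reg;
   bool negate;
};

/* A 3-source instruction has at most three operands to load. */
#define MAX_LOADS 3

struct load_tracker {
   float value[MAX_LOADS]; /* value loaded into each temporary so far */
   int count;              /* number of loads emitted */
};

/* Return an operand for imm, reusing an earlier load of imm or of -imm. */
static struct operand
get_imm_operand(struct load_tracker *t, float imm)
{
   for (int i = 0; i < t->count; i++) {
      if (imm == t->value[i])
         return (struct operand){ .reg = i, .negate = false };
      if (imm == -t->value[i])
         return (struct operand){ .reg = i, .negate = true };
   }

   /* No reusable load: emit a new one. */
   assert(t->count < MAX_LOADS);
   t->value[t->count] = imm;
   return (struct operand){ .reg = t->count++, .negate = false };
}
```

Requesting 1.0 and then -1.0 from the same tracker yields one load and two operands that differ only in the negate flag.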
* intel/vec4: Refactor operand fixing for ffma and flrp | Ian Romanick | 2019-07-11 | 2 | -8/+16
    Reviewed-by: Matt Turner <[email protected]>
* panfrost: Wire up GLES2-class polygon offset | Alyssa Rosenzweig | 2019-07-11 | 1 | -0/+11
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Depth units/factor are identical to GL | Alyssa Rosenzweig | 2019-07-11 | 2 | -10/+2
    I'm not sure why I thought there was an off-by-one, other than maybe
    bad data collection.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* etnaviv: remove dead translate_ts_sampler_format(..) declaration | Christian Gmeiner | 2019-07-11 | 1 | -3/+0
    Fixes: 66411521ea9 ("etnaviv: combine translate_ts_sampler_format/translate_msaa_format")
    Signed-off-by: Christian Gmeiner <[email protected]>
    Reviewed-by: Jonathan Marek <[email protected]>
* intel/fs: Add support for SLM fence in Gen11 | Caio Marcelo de Oliveira Filho | 2019-07-11 | 6 | -12/+66
    Gen11 SLM is not on L3 anymore, so now the hardware has two separate
    fences. Add a way to control which fence types to use.

    At this time, we don't have enough information in NIR to control the
    visibility of the memory being fenced, so for now be conservative and
    assume that fences will need a stall. With more information later
    we'll be able to reduce those.

    Fixes Vulkan CTS tests in ICL:

    dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_nonlocal.workgroup.guard_local.buffer.comp
    dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_local.buffer.guard_nonlocal.workgroup.comp
    dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_local.image.guard_nonlocal.workgroup.comp
    dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_local.buffer.guard_nonlocal.workgroup.comp
    dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_local.image.guard_nonlocal.workgroup.comp

    The whole set of supported tests in dEQP-VK.memory_model.* group
    should be passing in ICL now.

    v2: Pass BTI around instead of having an enum. (Jason)
        Emit two SHADER_OPCODE_MEMORY_FENCE instead of one that gets
        transformed into two. (Jason)
        List tests fixed. (Lionel)

    v3: For clarity, split the decision of which fences to emit from the
        emission code. (Jason)

    Reviewed-by: Jason Ekstrand <[email protected]>
    Acked-by: Lionel Landwerlin <[email protected]>
* Revert "panfrost/midgard: Use _safe iterator" | Tomeu Vizoso | 2019-07-11 | 1 | -1/+1
    This reverts commit 812ce2ce9e5655613eae740926176509897122fa.

    We massively regress with the reverted patch. So in the meantime,
    take it out.

    Signed-off-by: Tomeu Vizoso <[email protected]>
* panfrost: Don't lie about Z/S formats | Alyssa Rosenzweig | 2019-07-11 | 4 | -6/+34
    Only Z24S8 is properly supported right now, so let's be careful.

    Fixes a number of issues relating to improper Z/S handling. The most
    obvious is depth buffers with incorrect strides, which manifests in
    truly bizarre ways and can happen commonly with FBOs. Fixes WebGL
    (Aquarium runs, etc).

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* radv/gfx10: enable geometry shaders | Samuel Pitoiset | 2019-07-11 | 1 | -1/+1
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: Fix NGG GS output mask handlings for LDS indexing. | Bas Nieuwenhuizen | 2019-07-11 | 1 | -1/+5
    In emit_vertex we optimize storage if the output mask does not have
    all bits set. Do the same in the epilogue so the indices actually
    match up.

    Fixes dEQP-VK.geometry.input.basic_primitive.points because it
    outputs PSIZE with an output mask of 1, which causes the generic
    attribute for the color to be loaded from the wrong indices.

    Reviewed-by: Samuel Pitoiset <[email protected]>
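One way to picture why the writer and the reader have to agree on the packing: if storage only holds the components whose mask bit is set, the slot of a component depends on how many enabled components precede it. The sketch below assumes that packing rule, which the commit message does not spell out; `packed_slot` and `count_bits` are hypothetical helpers, not radv functions.

```c
#include <stdint.h>

/* Count the set bits in mask (portable loop, no compiler builtins). */
static unsigned
count_bits(uint32_t mask)
{
   unsigned n = 0;
   for (; mask; mask &= mask - 1)
      n++;
   return n;
}

/* Slot of component chan for an output starting at base whose written
 * components are given by mask. Returns -1 if chan is not written.
 * Both the store side and the load side must use the same rule,
 * otherwise later outputs are read from the wrong slots. */
static int
packed_slot(unsigned base, uint32_t mask, unsigned chan)
{
   if (!(mask & (1u << chan)))
      return -1;
   return (int)(base + count_bits(mask & ((1u << chan) - 1)));
}
```

With an output mask of 1, as in the PSIZE case mentioned above, the output takes one slot under this rule, so an output that follows it starts at `base + 1` rather than `base + 4`.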
* radv/gfx10: Simplify output mask handling for NGG GS. | Bas Nieuwenhuizen | 2019-07-11 | 1 | -12/+1
    We only ever get in this function for a NGG GS proper.

    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/gfx10: Do GS prologue outside of gs_threads if. | Bas Nieuwenhuizen | 2019-07-11 | 1 | -5/+6
    Mirror radeonsi.

    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/gfx10: implement support for GS as NGG | Samuel Pitoiset | 2019-07-11 | 4 | -6/+568
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: Use correct ES shader for es_vgpr_comp_cnt for GS. | Bas Nieuwenhuizen | 2019-07-11 | 1 | -2/+5
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/gfx10: Do not allocate a gs_copy_shader on gfx10. | Bas Nieuwenhuizen | 2019-07-11 | 2 | -4/+6
    Will use ngg for any gs anyway.

    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/gfx10: fix VGT_SHADER_STAGES_EN for GS as NGG | Samuel Pitoiset | 2019-07-11 | 1 | -10/+11
    The driver shouldn't set the copy shader bit.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: fix number of GS invocations for NGG | Samuel Pitoiset | 2019-07-11 | 1 | -1/+1
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* panfrost/midgard: Use _safe iterator | Tomeu Vizoso | 2019-07-11 | 1 | -1/+1
    Fixes this assertion:

    ../mesa/src/panfrost/midgard/midgard_schedule.c:507:schedule_block: Assertion `ins == __next && "use _safe iterator"' failed.
    Trace/breakpoint trap

    Signed-off-by: Tomeu Vizoso <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
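For readers unfamiliar with the "_safe iterator" wording in the assertion: a safe list iterator saves the next element before the loop body runs, so the body may unlink or free the current element. The following is a generic sketch of that pattern on a singly linked list; `foreach_safe`, `struct node`, `push` and `remove_value` are hypothetical names for the illustration and are not Mesa's list macros.

```c
#include <stddef.h>
#include <stdlib.h>

struct node {
   int value;
   struct node *next;
};

/* Save the successor in tmp before the body runs, so the body may free pos. */
#define foreach_safe(pos, tmp, head)                            \
   for ((pos) = (head), (tmp) = (pos) ? (pos)->next : NULL;     \
        (pos) != NULL;                                          \
        (pos) = (tmp), (tmp) = (pos) ? (pos)->next : NULL)

/* Prepend a node holding value; returns the new head (or the old head
 * unchanged if allocation fails). */
static struct node *
push(struct node *head, int value)
{
   struct node *n = malloc(sizeof(*n));
   if (!n)
      return head;
   n->value = value;
   n->next = head;
   return n;
}

/* Remove every node holding value while iterating; returns the new head. */
static struct node *
remove_value(struct node *head, int value)
{
   struct node *pos, *tmp, *prev = NULL;

   foreach_safe(pos, tmp, head) {
      if (pos->value == value) {
         if (prev)
            prev->next = tmp;
         else
            head = tmp;
         free(pos);   /* fine: the iterator already holds tmp */
      } else {
         prev = pos;
      }
   }
   return head;
}
```

A plain iterator that reads `pos->next` after the body would touch freed memory in the removal case.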
* panfrost: Place the height value in the height field | Tomeu Vizoso | 2019-07-11 | 1 | -1/+1
    In the mali_single_framebuffer descriptor.

    Signed-off-by: Tomeu Vizoso <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>

    v2: Remove unwanted chunks
* radv/gfx10: fix maximum number of mip levels for 3D images | Samuel Pitoiset | 2019-07-11 | 1 | -4/+10
    The dimensions also have to be adjusted if the number of supported
    mip levels is changed.

    This fixes dEQP-VK.api.info.image_format_properties.3d.*.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
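The link between mip level count and maximum extent follows from the usual full mip chain, where each level halves the extent down to 1. A small sketch of that arithmetic, with hypothetical helper names and no claim about the actual limits radv reports:

```c
#include <stdint.h>

/* Number of levels in a full mip chain for the given extent:
 * floor(log2(extent)) + 1, or 0 for an extent of 0. */
static uint32_t
levels_for_extent(uint32_t extent)
{
   uint32_t levels = 0;
   while (extent) {
      levels++;
      extent >>= 1;
   }
   return levels;
}

/* Largest extent whose full mip chain fits in the given number of levels.
 * Assumes 1 <= levels <= 32. */
static uint32_t
max_extent_for_levels(uint32_t levels)
{
   return 1u << (levels - 1);
}
```

Lowering the supported level count by one therefore halves the largest extent that can be advertised consistently, which is why the two values have to change together.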
* radv/gfx10: disable TC-compat HTILE for multisampled D32_SFLOAT format | Samuel Pitoiset | 2019-07-11 | 1 | -2/+5
    For some reason D32_SFLOAT is also affected on GFX10; it works fine
    with previous generations.

    This fixes some dEQP-VK.renderpass2.depth_stencil_resolve.*.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* iris: Fix key->input_vertices for 8_PATCH TCS mode. | Kenneth Graunke | 2019-07-11 | 2 | -1/+10
    We were failing to flag the program dirty when it changed.

    Also, we were unnecessarily setting key->input_vertices for
    SINGLE_PATCH mode, which would reduce program cache hits. Only set it
    if needed.
* iris: Only set key->flat_shade if COL0/COL1 are written. | Kenneth Graunke | 2019-07-11 | 3 | -3/+5
    This was just laziness on my part; we already added similar checks in
    the VS key handling. Just need to do it here too.

    Should improve cache hits.
* iris: Drop comment about var->data.binding not being set. | Kenneth Graunke | 2019-07-11 | 1 | -4/+0
    I refactored the sampler lowering passes a long time ago to ensure
    that gl_nir_lower_samplers_as_deref is run and var->data.binding is
    set.
* iris: Drop comments about missing NOS | Kenneth Graunke | 2019-07-11 | 1 | -6/+0
    These stages don't need NOS. If they do, we can add it - the
    infrastructure is there if we need it someday.
* iris: Drop a TODO comment | Kenneth Graunke | 2019-07-11 | 1 | -1/+0
    This is literally implemented two lines above.
* glsl/builtin types: Set the precision on the depth range params | Neil Roberts | 2019-07-11 | 1 | -3/+3
    The members of gl_DepthRangeParameters are declared to be highp in
    GLSL ES specs.

    Reviewed-by: Eric Anholt <[email protected]>
* glsl: Add a constructor for glsl_struct_field to specify the precision | Neil Roberts | 2019-07-11 | 1 | -4/+12
    Adds a third constructor to glsl_struct_field which has an extra
    parameter to specify the precision.

    Reviewed-by: Eric Anholt <[email protected]>
* glsl: Add a macro for the default values for glsl_struct_field | Neil Roberts | 2019-07-11 | 1 | -14/+12
    There are two constructors for glsl_struct_field with different
    parameters. Instead of repeating them for both constructors, this
    patch adds a convenience macro. This will make it easier to add a
    third constructor in a later patch.

    Reviewed-by: Eric Anholt <[email protected]>
* glsl/builtin_variables: Add a precision to the builtins | Neil Roberts | 2019-07-11 | 1 | -80/+170
    All of the builtin variables mentioned in the GLSL ES spec and the
    extensions include a precision declaration which is different
    depending on what the variable is used for. This patch makes it set
    the corresponding precision when creating the variable. This will
    make a difference once we start using the precision information for
    optimisation. Previously all of the builtin variables ended up with a
    precision of NONE.

    v2: Made gl_PointSize and gl_FragCoord highp since GLSL ES 3.00.
        Fixed gl_MaxViewPorts to always be highp. (Eric Anholt)

    Reviewed-by: Eric Anholt <[email protected]>
* compiler: Save a single copy of the softfp64 shader in the context. | Kenneth Graunke | 2019-07-10 | 4 | -11/+14
    We were recompiling the softfp64 library of functions from GLSL to
    NIR every time we compiled a shader that used fp64. Worse, we were
    ralloc stealing it to the GL context. This meant that we'd accumulate
    lots of copies for the lifetime of the context, which was a big space
    leak.

    Instead, we can simply stash a single copy in the GL context, and use
    it for subsequent compiles. Having a single copy should be fine from
    a memory context point of view: nir_inline_function_impl already
    clones the necessary nir_function_impl's as it inlines.

    KHR-GL45.enhanced_layouts.ssb_member_align_non_power_of_2 was
    previously OOM'ing a system with 16GB of RAM when using softfp64. Now
    it finishes much more quickly and uses only ~200MB of RAM.

    Reviewed-by: Jordan Justen <[email protected]>
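The "stash a single copy in the context" approach is the familiar compile-once, reuse-afterwards pattern. A minimal sketch, assuming a context struct with one cached pointer; `struct context`, `struct shader_lib`, `compile_lib` and `get_softfp64` are hypothetical stand-ins, not the Mesa types or functions:

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the compiled library of fp64 helper functions. */
struct shader_lib {
   int num_functions;
};

/* Hypothetical context that owns at most one copy of the library. */
struct context {
   struct shader_lib *softfp64;
};

/* Counts how often the expensive compile step runs (for illustration). */
static int compile_count;

/* Stand-in for the expensive GLSL-to-NIR compile. */
static struct shader_lib *
compile_lib(void)
{
   compile_count++;
   return calloc(1, sizeof(struct shader_lib));
}

/* Compile on first use, then hand out the cached copy. */
static struct shader_lib *
get_softfp64(struct context *ctx)
{
   if (!ctx->softfp64)
      ctx->softfp64 = compile_lib();
   return ctx->softfp64;
}
```

Every shader compile after the first gets the same pointer back, so the context holds one library instead of one per compiled shader.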
* radv: fix memory leak when restoring from cache | Timothy Arceri | 2019-07-11 | 1 | -0/+1
    Fixes: 726a31df705b ("radv: Add the concept of radv shader binaries.")
    Reviewed-by: Samuel Pitoiset <[email protected]>
* freedreno: Generate headers from xml files | Kristian H. Kristensen | 2019-07-10 | 29 | -23904/+14118
    Reviewed-by: Eric Engestrom <[email protected]>
    Acked-by: Rob Clark <[email protected]>
* radv: switch to the new VS exports path | Samuel Pitoiset | 2019-07-10 | 1 | -116/+2
    It will help for GS as NGG on GFX10.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set the slot_index correctly for VARYING_SLOT_CLIP_DIST1 | Samuel Pitoiset | 2019-07-10 | 1 | -1/+1
    For selecting a different SQ_EXP_POS target.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add a new function for exporting VS outputs | Samuel Pitoiset | 2019-07-10 | 1 | -0/+128
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: implement new path for exporting generic varyings | Samuel Pitoiset | 2019-07-10 | 1 | -32/+70
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use the generic export path for clip/cull distances | Samuel Pitoiset | 2019-07-10 | 1 | -6/+6
    When they are exported to the next stage.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove an extra memcpy when exporting clip/cull distances | Samuel Pitoiset | 2019-07-10 | 1 | -6/+5
    Cleanup.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* intel/compiler: Add a "base class" for program keys | Jason Ekstrand | 2019-07-10 | 30 | -240/+183
    Right now, all keys have two things in common: a program string ID
    and a sampler_prog_key_data. I'd like to add another thing or two and
    need a place to put it. This commit adds a new brw_base_prog_key
    struct which contains those two common bits.

    Reviewed-by: Kenneth Graunke <[email protected]>
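In C, a "base class" of this kind is usually a struct embedded as the first member of each derived struct. The sketch below shows that layout under assumed field names; only the two shared pieces (a program string ID and the sampler key data) come from the commit message, and every identifier here is a hypothetical stand-in rather than the real brw definitions.

```c
#include <stdbool.h>

/* Stand-in for the sampler key data shared by all stages. */
struct sampler_key_data {
   unsigned swizzles[4];
};

/* The common part that every per-stage key starts with. */
struct base_prog_key {
   unsigned program_string_id;
   struct sampler_key_data tex;
};

/* A per-stage key embeds the base as its first member... */
struct vs_prog_key {
   struct base_prog_key base;
   bool copy_edgeflag;   /* hypothetical stage-specific field */
};

/* ...so code that only needs the common bits can take the base type. */
static unsigned
key_program_id(const struct base_prog_key *key)
{
   return key->program_string_id;
}
```

Because `base` is the first member, a pointer to the stage key and a pointer to its base refer to the same address, and new shared fields only have to be added in one place.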
* i965/program_cache: Cast the key to char * before adding key_size | Jason Ekstrand | 2019-07-10 | 1 | -1/+1
    We're about to change the type of key to be brw_base_prog_key and
    that will mean blindly adding the key size without a cast will lead
    to the wrong calculation. It's safer to cast to char * first anyway.

    Reviewed-by: Kenneth Graunke <[email protected]>
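The reason for the cast is that C pointer arithmetic is scaled by the size of the pointed-to type, so adding a byte count to a typed struct pointer overshoots. A short sketch with a hypothetical 16-byte key header (`struct base_key` and `data_after_key` are illustrative names only):

```c
#include <stddef.h>

/* Hypothetical stand-in for a typed key header (16 bytes here). */
struct base_key {
   unsigned program_string_id;
   unsigned pad[3];
};

/* Returns the first byte after a key that occupies key_size bytes. */
static const void *
data_after_key(const struct base_key *key, size_t key_size)
{
   /* Writing `key + key_size` would advance by
    * key_size * sizeof(struct base_key) bytes, not key_size bytes.
    * Casting to char * makes the addition count in bytes. */
   return (const char *)key + key_size;
}
```

With a `char *` cast the result is the same regardless of how the declared type of `key` changes later.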
* anv: Make the workaround BO a whole page | Jason Ekstrand | 2019-07-10 | 1 | -1/+1
    I'm not 100% sure how this ever worked because gem_create usually
    shoots you if the BO size isn't page-aligned.

    Reviewed-by: Chad Versace <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
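Page alignment of a buffer size is typically a round-up to the next multiple of the page size. A minimal sketch, assuming a 4096-byte page; `BO_PAGE_SIZE` and `align_to_page` are hypothetical names, not anv's:

```c
#include <stdint.h>

/* Assumed page size for the illustration; must be a power of two. */
#define BO_PAGE_SIZE 4096u

/* Round size up to the next multiple of BO_PAGE_SIZE. */
static uint64_t
align_to_page(uint64_t size)
{
   return (size + BO_PAGE_SIZE - 1) & ~(uint64_t)(BO_PAGE_SIZE - 1);
}
```

Any size between 1 and 4096 bytes rounds to one full page under this rule, and a size that is already page-aligned is returned unchanged.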
* anv: Set Stateless Data Port Access MOCS | Jason Ekstrand | 2019-07-10 | 1 | -0/+2
    This is the MOCS setting used for the A64 stateless messages which we
    sometimes use for SSBO operations.

    Fixes: 48ed2a7bb009 "anv: Implement VK_EXT_buffer_device_address"
    Fixes: 79fb0d27f3ab "anv: Implement SSBOs bindings with GPU addr..."
    Reviewed-by: Chad Versace <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* panfrost: Clamp point size | Alyssa Rosenzweig | 2019-07-10 | 6 | -4/+83
    It's not clear the hardware really has a maximum, which confuses
    dEQP; clamp to whatever we report as our maximum.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
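Clamping to the advertised limits amounts to a plain min/max on the requested size. A sketch in C for illustration only; where and how the driver actually applies the clamp is not stated in the commit message, and `clamp_point_size` is a hypothetical helper:

```c
/* Clamp a requested point size to the [min_size, max_size] range that the
 * driver reports to the application. Assumes min_size <= max_size. */
static float
clamp_point_size(float size, float min_size, float max_size)
{
   if (size < min_size)
      return min_size;
   if (size > max_size)
      return max_size;
   return size;
}
```

With a reported range of 1.0 to 1024.0, a request for 2048.0 is drawn at 1024.0, which matches what a conformance test would expect from the advertised maximum.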
* pan/decode: Auto style | Alyssa Rosenzweig | 2019-07-10 | 4 | -122/+123
    $ astyle *.c *.h --style=linux -s8

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Move non-Gallium files outside of Gallium | Alyssa Rosenzweig | 2019-07-10 | 38 | -92/+196
    In preparation for a Panfrost-based non-Gallium driver (maybe
    Vulkan...?), hoist everything except for the Gallium driver into a
    shared src/panfrost. Practically, that means the compilers, the
    headers, and pandecode.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Style main Gallium driver | Alyssa Rosenzweig | 2019-07-10 | 27 | -819/+819
    $ astyle *.c *.h --style=linux -s8

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Apply code styling | Alyssa Rosenzweig | 2019-07-10 | 11 | -212/+213
    $ astyle *.c *.h --style=linux -s8

    Signed-off-by: Alyssa Rosenzweig <[email protected]>