llvm.org GIT mirror llvm / f7b1d9f
[X86] Enable -mprefer-vector-width=256 by default for Skylake-avx512 and later Intel CPUs.

AVX512 instructions can cause a frequency drop on these CPUs. This can negate the performance gains from using wider vectors. Enabling prefer-vector-width=256 will prevent generation of zmm registers unless explicit 512 bit operations are used in the original source code.

I believe gcc and icc both do something similar to this by default.

Differential Revision: https://reviews.llvm.org/D67259

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371694 91177308-0d34-0410-b5e6-96231b3b80d8

Craig Topper a month ago
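To make the effect described in the commit message concrete, here is a minimal sketch of a loop that leaves the vector width entirely to the compiler. The function name `saxpy` and the file name `saxpy.c` are made up for illustration and do not come from the commit. The two clang invocations in the comment use only flags named in this change; which registers actually appear in the assembly depends on the compiler version and its cost model, so inspect the output rather than relying on it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative only (not part of the commit). Possible ways to inspect it:
 *
 *   clang -O2 -march=skylake-avx512 -S saxpy.c
 *       With this change, the vectorizer is expected to stay at 256 bits
 *       (ymm registers) for a loop like this one.
 *
 *   clang -O2 -march=skylake-avx512 -mprefer-vector-width=512 -S saxpy.c
 *       Opts back in to 512-bit vectors (zmm registers).
 */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(x != NULL && y != NULL);

    /* y[i] = a * x[i] + y[i]; no intrinsics, so nothing in the source
       explicitly requests 512-bit operations. */
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Because the loop contains no explicit 512-bit operations, it is the kind of code whose generated vector width this default is meant to change.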
3 changed file(s) with 14 addition(s) and 0 deletion(s).
   be passed in ZMM registers for calls and returns. Previously they were passed
   in two YMM registers. Old behavior can be enabled by passing
   -x86-enable-old-knl-abi
+* -mprefer-vector-width=256 is now the default behavior skylake-avx512 and later
+  Intel CPUs. This tries to limit the use of 512-bit registers which can cause a
+  decrease in CPU frequency on these CPUs. This can be re-enabled by passing
+  -mprefer-vector-width=512 to clang or passing -mattr=-prefer-256-bit to llc.

 Changes to the AMDGPU Target
 -----------------------------

 // Skylake-AVX512
 list<SubtargetFeature> SKXAdditionalFeatures = [FeatureAVX512,
+                                                FeaturePrefer256Bit,
                                                 FeatureCDI,
                                                 FeatureDQI,
                                                 FeatureBWI,

 // Cannonlake
 list<SubtargetFeature> CNLAdditionalFeatures = [FeatureAVX512,
+                                                FeaturePrefer256Bit,
                                                 FeatureCDI,
                                                 FeatureDQI,
                                                 FeatureBWI,
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=fast-variable-shuffle,avx512vl,avx512bw,avx512dq,prefer-256-bit | FileCheck %s --check-prefixes=CHECK,CHECK-AVX512
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=fast-variable-shuffle,avx512vl,avx512bw,avx512dq,prefer-256-bit,avx512vbmi | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI
+; Make sure CPUs default to prefer-256-bit. avx512vnni isn't interesting as it just adds an isel peephole for vpmaddwd+vpaddd
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=skylake-avx512 | FileCheck %s --check-prefixes=CHECK,CHECK-AVX512
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=cascadelake | FileCheck %s --check-prefixes=CHECK,CHECK-AVX512
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=cooperlake | FileCheck %s --check-prefixes=CHECK,CHECK-AVX512
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=cannonlake | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=icelake-client | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=icelake-server | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=tigerlake | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI

 ; This file primarily contains tests for specific places in X86ISelLowering.cpp that needed be made aware of the legalizer not allowing 512-bit vectors due to prefer-256-bit even though AVX512 is enabled.

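The test comment above describes a legalizer that refuses 512-bit vectors under prefer-256-bit even though AVX512 is enabled, and the commit message adds that explicit 512-bit operations in the source still get zmm registers. The sketch below is a deliberately simplified model of that decision. Every name in it (`subtarget_model`, `allows_512_bit_vectors`, `required_width_bits`) is hypothetical; it is not LLVM's actual API or code, and the real logic may take more inputs into account.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the relevant subtarget state; not an LLVM type. */
struct subtarget_model {
    bool has_avx512;      /* AVX512 is enabled, e.g. through -mcpu or -mattr */
    bool prefer_256_bit;  /* the prefer-256-bit feature is set */
};

/*
 * required_width_bits models the widest vector the source explicitly asks
 * for; 0 means the code only relies on auto-vectorization.
 *
 * Assumed rule for this sketch: 512-bit vectors are allowed when AVX512 is
 * available and either prefer-256-bit is off or the source explicitly
 * requires more than 256 bits.
 */
bool allows_512_bit_vectors(const struct subtarget_model *st,
                            unsigned required_width_bits)
{
    assert(st != NULL);

    if (!st->has_avx512)
        return false;
    if (!st->prefer_256_bit)
        return true;
    return required_width_bits > 256;
}
```

Under this model, a CPU definition that now carries the prefer-256-bit feature rejects 512-bit vectors for ordinary code but still accepts them when the source explicitly requires them, which matches the behavior the commit message describes.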