Merging r250085:

r250085 | Andrea_DiBiagio | 2015-10-12 15:22:30 -0400 (Mon, 12 Oct 2015) | 60 lines
[x86] Fix wrong lowering of vsetcc nodes (PR25080).
Function LowerVSETCC (in X86ISelLowering.cpp) worked under the wrong
assumption that, for non-AVX512 targets, the source type and destination type
of a type-legalized setcc node were always the same type.
This assumption was unfortunately incorrect; the type legalizer is not always
able to promote the return type of a setcc to the same type as the first
operand of a setcc.
In the case of a vsetcc node, the legalizer first checks whether the first input
operand has a legal type. If so, it promotes the return type of the vsetcc
to that same type. Otherwise, the return type is promoted to the 'next legal
type', which, for vectors of MVT::i1, is always a 128-bit integer vector type.
Example (-mattr=+avx):
%0 = trunc <8 x i32> %a to <8 x i23>
%1 = icmp eq <8 x i23> %0, zeroinitializer
The initial selection dag for the code above is:
v8i1 = setcc t5, t7, seteq:ch
t5: v8i23 = truncate t2
t2: v8i32,ch = CopyFromReg t0, Register:v8i32 %vreg1
t7: v8i32 = build_vector of all zeroes.
The type legalizer would first check whether 't5' has a legal type. If so, it
would reuse that same type to promote the return type of the setcc node.
Unfortunately, 't5' is of illegal type v8i23, and therefore it cannot be used to
promote the return type of the setcc node. Consequently, the setcc return type
is promoted to v8i16. Later on, 't5' is promoted to v8i32, leading to the
following dag node:
v8i16 = setcc t32, t25, seteq:ch
where t32 and t25 are now values of type v8i32.
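The promotion rule walked through above can be sketched as a small standalone model. This is only an illustration and does not use LLVM's API: VecTy, isLegal and promoteSetCCResult are hypothetical names, and the legality rule is a simplifying assumption for an AVX target without AVX-512.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for a vector value type (hypothetical; not LLVM's MVT/EVT).
struct VecTy {
  unsigned NumElts;
  unsigned EltBits;
  unsigned sizeInBits() const { return NumElts * EltBits; }
  std::string str() const {
    return "v" + std::to_string(NumElts) + "i" + std::to_string(EltBits);
  }
};

// Assumption: a type is legal when it is 128 or 256 bits wide and its
// elements are 8, 16, 32 or 64 bits. Real targets are more nuanced.
inline bool isLegal(VecTy T) {
  bool EltOk = T.EltBits == 8 || T.EltBits == 16 || T.EltBits == 32 ||
               T.EltBits == 64;
  return EltOk && (T.sizeInBits() == 128 || T.sizeInBits() == 256);
}

// Models the choice of result type for a vsetcc whose natural result is
// <N x i1>: reuse the first operand's type if it is legal, otherwise fall
// back to the 128-bit integer vector with the same number of elements.
inline VecTy promoteSetCCResult(VecTy Op0) {
  // The fallback below is only meaningful for these element counts.
  assert(Op0.NumElts == 2 || Op0.NumElts == 4 || Op0.NumElts == 8 ||
         Op0.NumElts == 16);
  if (isLegal(Op0))
    return Op0;
  return VecTy{Op0.NumElts, 128 / Op0.NumElts};
}
```

In this model, an operand of type v8i23 gives a v8i16 result, while a legal v8i32 operand gives a v8i32 result. The mismatch arises because the v8i23 operand is itself promoted to v8i32 only after the result type has already been fixed at v8i16.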
Before this patch, function LowerVSETCC would have wrongly expanded the setcc
to a single X86ISD::PCMPEQ. Surprisingly, ISel was still able to match an
instruction. In our case, ISel would have matched a VPCMPEQWrr:
t37: v8i16 = X86ISD::VPCMPEQWrr t36, t25
However, t36 and t25 are both VR256, while the result is of class
VR128. This inconsistency ended up causing the insertion of COPY instructions
like this:
%vreg7<def> = COPY %vreg3; VR128:%vreg7 VR256:%vreg3
This is an invalid full copy (not a subregister copy).
Eventually, the backend would have hit an UNREACHABLE "Cannot emit physreg copy
instruction" while attempting to expand the malformed pseudo COPY instructions.
This patch fixes the problem by adding the missing logic in LowerVSETCC to handle
the corner case of a setcc with a 128-bit return type and a 256-bit operand type.
This problem was originally reported by Dimitry as PR25080. It has been latent
for a very long time. I have added the minimal reproducer from that bugzilla
as test setcc-lowering.ll.
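The shape of the fix (compare at the operand type, then narrow to the result type) can be illustrated with a standalone lane-wise model. This is a sketch rather than the patch itself: the function names are hypothetical, plain arrays stand in for v8i32 and v8i16 values, and it assumes a true lane is represented as all-ones.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Models 'v8i32 = setcc A, B, seteq': the compare is done at the operand
// type. Assumption: a true lane is all-ones, a false lane is zero.
inline std::array<uint32_t, 8> setccEqV8i32(const std::array<uint32_t, 8> &A,
                                            const std::array<uint32_t, 8> &B) {
  std::array<uint32_t, 8> R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = (A[I] == B[I]) ? 0xFFFFFFFFu : 0u;
  return R;
}

// Models the narrowing step from v8i32 to the 128-bit result type v8i16.
inline std::array<uint16_t, 8> truncToV8i16(const std::array<uint32_t, 8> &V) {
  std::array<uint16_t, 8> R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = static_cast<uint16_t>(V[I]);
  return R;
}

// Compare wide, then truncate: the two steps the corner case needs.
inline std::array<uint16_t, 8> lowerVSetccEq(const std::array<uint32_t, 8> &A,
                                             const std::array<uint32_t, 8> &B) {
  return truncToV8i16(setccEqV8i32(A, B));
}
```

Because the result type is narrower than the operand type in this corner case, the narrowing step is a truncation; each all-ones 32-bit lane becomes an all-ones 16-bit lane, so the boolean meaning of every lane is preserved.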
Differential Revision: http://reviews.llvm.org/D13660

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@252484 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard
4 years ago
@@ -13571,6 +13571,35 @@
     return DAG.getNode(Opc, dl, VT, Op0, Op1,
                        DAG.getConstant(SSECC, dl, MVT::i8));
   }
+
+  MVT VTOp0 = Op0.getSimpleValueType();
+  assert(VTOp0 == Op1.getSimpleValueType() &&
+         "Expected operands with same type!");
+  assert(VT.getVectorNumElements() == VTOp0.getVectorNumElements() &&
+         "Invalid number of packed elements for source and destination!");
+
+  if (VT.is128BitVector() && VTOp0.is256BitVector()) {
+    // On non-AVX512 targets, a vector of MVT::i1 is promoted by the type
+    // legalizer to a wider vector type. In the case of 'vsetcc' nodes, the
+    // legalizer firstly checks if the first operand in input to the setcc has
+    // a legal type. If so, then it promotes the return type to that same type.
+    // Otherwise, the return type is promoted to the 'next legal type' which,
+    // for a vector of MVT::i1 is always a 128-bit integer vector type.
+    //
+    // We reach this code only if the following two conditions are met:
+    // 1. Both return type and operand type have been promoted to wider types
+    //    by the type legalizer.
+    // 2. The original operand type has been promoted to a 256-bit vector.
+    //
+    // Note that condition 2. only applies for AVX targets.
+    SDValue NewOp = DAG.getSetCC(dl, VTOp0, Op0, Op1, SetCCOpcode);
+    return DAG.getZExtOrTrunc(NewOp, dl, VT);
+  }
+
+  // The non-AVX512 code below works under the assumption that source and
+  // destination types are the same.
+  assert((Subtarget->hasAVX512() || (VT == VTOp0)) &&
+         "Value types for source and destination must be the same!");
 
   // Break 256-bit integer vector compare into smaller ones.
   if (VT.is256BitVector() && !Subtarget->hasInt256())
@@ -0,0 +1,29 @@
+; RUN: llc -mtriple=x86_64-unknown-unknown -mattr=+avx < %s | FileCheck %s
+
+; Verify that we don't crash during codegen due to a wrong lowering
+; of a setcc node with illegal operand types and return type.
+
+define <8 x i16> @pr25080(<8 x i32> %a) {
+; CHECK-LABEL: pr25080:
+; CHECK:       # BB#0: # %entry
+; CHECK-NEXT:    vandps {{.*}}(%rip), %ymm0, %ymm0
+; CHECK-NEXT:    vextractf128 $1, %ymm0, %xmm1
+; CHECK-NEXT:    vpxor %xmm2, %xmm2, %xmm2
+; CHECK-NEXT:    vpcmpeqd %xmm2, %xmm1, %xmm1
+; CHECK-NEXT:    vmovdqa {{.*#+}} xmm3 = [0,1,4,5,8,9,12,13,8,9,12,13,12,13,14,15]
+; CHECK-NEXT:    vpshufb %xmm3, %xmm1, %xmm1
+; CHECK-NEXT:    vpcmpeqd %xmm2, %xmm0, %xmm0
+; CHECK-NEXT:    vpshufb %xmm3, %xmm0, %xmm0
+; CHECK-NEXT:    vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
+; CHECK-NEXT:    vpor {{.*}}(%rip), %xmm0, %xmm0
+; CHECK-NEXT:    vpsllw $15, %xmm0, %xmm0
+; CHECK-NEXT:    vpsraw $15, %xmm0, %xmm0
+; CHECK-NEXT:    vzeroupper
+; CHECK-NEXT:    retq
+entry:
+  %0 = trunc <8 x i32> %a to <8 x i23>
+  %1 = icmp eq <8 x i23> %0, zeroinitializer
+  %2 = or <8 x i1> %1,
+  %3 = sext <8 x i1> %2 to <8 x i16>
+  ret <8 x i16> %3
+}
