Commit 099d38e

Fix VPERMUTE for input and returned vectors having different sizes
Example:

VPERMUTE <2 x float>, <2 x float>, (0, 1, 2, 3), <4 x float> <- return dtype

This example shuffles the contents of both <2 x float> vectors into a single <4 x float> vector. Previously, the <2 x float> operands were bitcast to <4 x float>, which triggered an internal error. This change prevents that bitcast.
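
The shuffle semantics described in the example can be modelled outside the compiler. The following is a minimal C++ sketch, not flang code: `permute` is a hypothetical helper that selects elements from the concatenation of the two operands, so the result length follows the mask rather than the operands.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of VPERMUTE (not part of flang): each mask entry is an
// index into the left-to-right concatenation <lhs, rhs>. The result has
// mask.size() elements, which may differ from lhs.size().
std::vector<float> permute(const std::vector<float> &lhs,
                           const std::vector<float> &rhs,
                           const std::vector<int> &mask) {
  // Assumption: both operands share one type, hence one length.
  assert(lhs.size() == rhs.size());
  std::vector<float> result;
  result.reserve(mask.size());
  for (int idx : mask) {
    // Every index must fall inside the concatenated vector.
    assert(idx >= 0 && static_cast<std::size_t>(idx) < 2 * lhs.size());
    std::size_t i = static_cast<std::size_t>(idx);
    result.push_back(i < lhs.size() ? lhs[i] : rhs[i - lhs.size()]);
  }
  return result;
}
```

With operands {1, 2} and {3, 4} and mask (0, 1, 2, 3), this model yields a four-element result, mirroring the <2 x float> to <4 x float> case in the commit message.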
1 parent 936fcb9 commit 099d38e

4 files changed: 30 additions, 14 deletions


tools/flang2/flang2exe/cgmain.cpp

Lines changed: 13 additions & 3 deletions
@@ -9146,7 +9146,7 @@ gen_llvm_expr(int ilix, LL_Type *expected_type)
     break;
   case IL_VPERMUTE: {
     OPERAND *op1;
-    LL_Type *vect_lltype, *int_type, *select_type;
+    LL_Type *vect_lltype, *int_type, *select_type, *op_lltype;
     DTYPE vect_dtype = ili_get_vect_dtype(ilix);
     int mask_ili, num_elem;

@@ -9159,6 +9159,12 @@ gen_llvm_expr(int ilix, LL_Type *expected_type)
     lhs_ili = ILI_OPND(ilix, 1);
     rhs_ili = ILI_OPND(ilix, 2);
     mask_ili = ILI_OPND(ilix, 3);
+    op_lltype = make_lltype_from_dtype(ili_get_vect_dtype(lhs_ili));
+
+    if ((*vect_lltype->sub_types)->data_type != (*op_lltype->sub_types)->data_type) {
+      assert(0, "VPERMUTE: result and operand dtypes must have matching base type.", 0, ERR_Severe);
+    }
+
     if (expected_type && ILI_OPC(rhs_ili) == IL_NULL &&
         ILI_OPC(lhs_ili) == IL_VCMP && !internal_masked_intrinsic) {
       num_elem = expected_type->sub_elements;
@@ -9167,12 +9173,16 @@ gen_llvm_expr(int ilix, LL_Type *expected_type)
       vect_lltype = select_type;
     } else
       select_type = NULL;
-    op1 = gen_llvm_expr(lhs_ili, vect_lltype);
+    op1 = gen_llvm_expr(lhs_ili, op_lltype);
     if (ILI_OPC(rhs_ili) == IL_NULL) /* a don't care, generate an undef */
       op1->next = make_undef_op(op1->ll_type);
     else
-      op1->next = gen_llvm_expr(rhs_ili, vect_lltype);
+      op1->next = gen_llvm_expr(rhs_ili, op_lltype);
     op1->next->next = gen_llvm_expr(mask_ili, 0);
+    if (vect_lltype->sub_elements != op1->next->next->ll_type->sub_elements) {
+      assert(0, "VPERMUTE: result and mask must have the same number of elements.",
+             vect_lltype->sub_elements, ERR_Severe);
+    }
     operand = ad_csed_instr(I_SHUFFVEC, ilix, vect_lltype, op1,
                             InstrListFlagsNull, true);
     /* This next case is where the VPERMUTE is used to expand a half-size
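
The two assertions added to cgmain.cpp amount to a pair of consistency checks: the operand and result must share an element type, and the mask must have as many elements as the result. A rough sketch of that logic follows. `VecType`, `ElemKind`, `PermuteCheck`, and `check_permute` are hypothetical names used only for illustration; flang itself works on `LL_Type` and reports failures through its own `assert` macro.

```cpp
// Hypothetical stand-in for a vector type: element kind plus lane count.
enum class ElemKind { Float, Double, Int32 };

struct VecType {
  ElemKind elem;
  unsigned lanes;
};

enum class PermuteCheck { Ok, BaseTypeMismatch, MaskSizeMismatch };

// Mirrors the order of the two checks in the diff. The operand lane count
// is deliberately NOT compared with the result lane count, since the
// commit allows them to differ.
PermuteCheck check_permute(const VecType &operand, const VecType &result,
                           unsigned mask_lanes) {
  if (operand.elem != result.elem)
    return PermuteCheck::BaseTypeMismatch;
  if (result.lanes != mask_lanes)
    return PermuteCheck::MaskSizeMismatch;
  return PermuteCheck::Ok;
}
```

Under this model, a <2 x float> operand with a <4 x float> result and a four-element mask passes, while a two-element mask or a <4 x double> result is rejected.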

tools/flang2/utils/ilitp/aarch64/ilitp.n

Lines changed: 6 additions & 4 deletions
@@ -5239,11 +5239,13 @@ Vector FMA for LLVM intrinsic - -lnk1*lnk2-lnk3, with stc the dtype
 .AT arth null lnk cse vect
 .CG notCG
 .IL VPERMUTE lnk lnk lnk stc
-Shuffle contents of vector registers. lnk1 and link2 can be same vector,
-with result in link2. stc4 is the dtype, link3 is a vector constant
+Shuffle contents of vector registers. lnk1 and lnk2 can be the same vector
+or lnk2 can be null. lnk1 dtype is used as dtype for both lnk1 and lnk2,
+unless lnk2 is null. stc is the result dtype, lnk3 is a vector constant
 representing a mask where each field represents which L-to-R element of
-concatenated <link1,link2> vector is to be placed in corresponding result
-field.
+concatenated <lnk1,lnk2> vector is to be placed in corresponding result
+field. lnk3 size must match the size of the result vector, but can be
+different than lnk1 and lnk2's size.
 .AT other null lnk vect
 .CG notCG
 .IL VBLEND lnk lnk lnk stc

tools/flang2/utils/ilitp/ppc64le/ilitp.n

Lines changed: 6 additions & 4 deletions
@@ -5219,11 +5219,13 @@ Vector FMA for LLVM intrinsic - -lnk1*lnk2-lnk3, with stc the dtype
 .AT arth null lnk cse vect
 .CG notCG
 .IL VPERMUTE lnk lnk lnk stc
-Shuffle contents of vector registers. lnk1 and link2 can be same vector,
-with result in link2. stc4 is the dtype, link3 is a vector constant
+Shuffle contents of vector registers. lnk1 and lnk2 can be the same vector
+or lnk2 can be null. lnk1 dtype is used as dtype for both lnk1 and lnk2,
+unless lnk2 is null. stc is the result dtype, lnk3 is a vector constant
 representing a mask where each field represents which L-to-R element of
-concatenated <link1,link2> vector is to be placed in corresponding result
-field.
+concatenated <lnk1,lnk2> vector is to be placed in corresponding result
+field. lnk3 size must match the size of the result vector, but can be
+different than lnk1 and lnk2's size.
 .AT other null lnk vect
 .CG notCG
 .IL VBLEND lnk lnk lnk stc

tools/flang2/utils/ilitp/x86_64/ilitp.n

Lines changed: 5 additions & 3 deletions
@@ -6239,11 +6239,13 @@ Vector FMA for LLVM intrinsic - -lnk1*lnk2-lnk3, with stc the dtype
 .AT arth null lnk cse vect
 .CG notCG
 .IL VPERMUTE lnk lnk lnk stc
-Shuffle contents of vector registers. lnk1 and lnk2 can be same vector,
-with result in lnk2. stc is the dtype, lnk3 is a vector constant
+Shuffle contents of vector registers. lnk1 and lnk2 can be the same vector
+or lnk2 can be null. lnk1 dtype is used as dtype for both lnk1 and lnk2,
+unless lnk2 is null. stc is the result dtype, lnk3 is a vector constant
 representing a mask where each field represents which L-to-R element of
 concatenated <lnk1,lnk2> vector is to be placed in corresponding result
-field.
+field. lnk3 size must match the size of the result vector, but can be
+different than lnk1 and lnk2's size.
 .AT other null lnk vect
 .CG notCG
 .IL VBLEND lnk lnk lnk stc
