Commit ae1a5dd

ustachow authored and igcbot committed
Fix invalid horizontal stride 0 for destination in predicated sub-DW operations
Fixed an invalid horizontal stride of 0 for the destination operand in emitLSCVectorLoad_subDW(). When doUniformLoad was true, the code set hStride to 0 for the destination operand of the merge-value copy, which violates Intel GPU architecture constraints and causes instruction validation errors for i8/i16 predicated load/store operations. The logic now leaves the default destination stride in place for uniform loads and sets an explicit stride only for non-uniform loads (4 when EltBytes is 1, otherwise 2), eliminating the invalid stride 0 case.
1 parent 460d83f commit ae1a5dd

2 files changed: +5 -3 lines changed


IGC/Compiler/CISACodeGen/EmitVISAPass.cpp

Lines changed: 4 additions & 2 deletions
@@ -17717,8 +17717,10 @@ void EmitPass::emitLSCVectorLoad_subDW(LSC_CACHE_OPTS CacheOpts, bool UseA32, Re
   // Copy merge value to gatherDst now, to avoid predicated copy to original
   // destination later
   if (mergeVal) {
-    uint32_t hStride = doUniformLoad ? 0 : ((EltBytes == 1) ? 4 : 2);
-    m_encoder->SetDstRegion(hStride);
+    if (!doUniformLoad) {
+      uint32_t hStride = (EltBytes == 1) ? 4 : 2;
+      m_encoder->SetDstRegion(hStride);
+    }
     m_encoder->Copy(gatherDstAlias, mergeVal);
     m_encoder->Push();
   }
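
The stride decision in this hunk can be sketched in isolation. The snippet below is only an illustration under stated assumptions: `DstStride` and `mergeCopyDstStride` are hypothetical names that do not exist in IGC, and the real code calls `m_encoder->SetDstRegion()` directly instead of returning a value. It shows the post-fix behavior: no explicit stride is requested for a uniform load, and a stride of 4 or 2 is requested otherwise, so a stride of 0 is never produced.

```cpp
#include <cstdint>

// Hypothetical result type (not part of IGC): says whether an explicit
// destination horizontal stride should be set, and which one.
struct DstStride {
    bool explicitStride;  // false: leave the encoder's default region alone
    uint32_t value;       // meaningful only when explicitStride is true
};

// Hypothetical helper mirroring the fixed logic of the hunk above.
DstStride mergeCopyDstStride(bool doUniformLoad, uint32_t eltBytes) {
    if (doUniformLoad) {
        // Uniform load: SetDstRegion is skipped, so the default stride applies.
        return DstStride{false, 0};
    }
    // Non-uniform load: stride 4 for 1-byte elements, 2 otherwise. In both
    // cases stride * element size is 4 bytes, i.e. one dword per element.
    return DstStride{true, (eltBytes == 1) ? 4u : 2u};
}
```

Returning a flag alongside the value, instead of using 0 as a sentinel, keeps "no explicit stride" distinct from the invalid stride 0 that the commit removes.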

IGC/Compiler/tests/EmitVISAPass/predicated-load-subdw.ll

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ entry:
 ; CHECK: .decl [[G_ALIAS1:.*]] v_type=G type=b num_elts=4 align=wordx32 alias=<[[GATHER1:.*]], 0>
 
 ; copy merge value. do predicated load, copy result
-; CHECK: mov (M1_NM, 1) [[G_ALIAS0]](0,0)<0> 0x0:b
+; CHECK: mov (M1_NM, 1) [[G_ALIAS0]](0,0)<1> 0x0:b
 ; CHECK: (P1) lsc_load.ugm (M1_NM, 1) [[GATHER0]]:d8u32 flat[
 ; CHECK: mov (M1_NM, 1) res0(0,0)<1> [[G_ALIAS0]](0,0)<0;1,0>
   %res0 = call i8 @llvm.genx.GenISA.PredicatedLoad.i8.p1.i8(ptr addrspace(1) %in, i64 1, i1 %p, i8 0)
