target/arm: Fix SME FMOPA (16-bit), BFMOPA
Perform the loop increment unconditionally, not nested
within the predication.
Cc: qemu-stable@nongnu.org
Fixes: 3916841ac75 ("target/arm: Implement FMOPA, FMOPS (widening)")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1985
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231117193135.1180657-1-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
diff --git a/target/arm/tcg/sme_helper.c b/target/arm/tcg/sme_helper.c
index 296826f..1ee2690 100644
--- a/target/arm/tcg/sme_helper.c
+++ b/target/arm/tcg/sme_helper.c
@@ -1037,10 +1037,9 @@
m = f16mop_adj_pair(m, pcol, 0);
*a = f16_dotadd(*a, n, m, &fpst_std, &fpst_odd);
-
- col += 4;
- pcol >>= 4;
}
+ col += 4;
+ pcol >>= 4;
} while (col & 15);
}
row += 4;
@@ -1073,10 +1072,9 @@
m = f16mop_adj_pair(m, pcol, 0);
*a = bfdotadd(*a, n, m);
-
- col += 4;
- pcol >>= 4;
}
+ col += 4;
+ pcol >>= 4;
} while (col & 15);
}
row += 4;