target/arm: Implement new v8.1M VLLDM and VLSTM encodings

v8.1M adds new encodings of VLLDM and VLSTM (where bit 7 is set).
The only difference is that:
 * the old T1 encodings UNDEF if the implementation implements 32
   Dregs (this is currently architecturally impossible for M-profile)
 * the new T2 encodings have the implementation-defined option to
   read from memory (discarding the data) or write UNKNOWN values to
   memory for the stack slots that would be D16-D31

We choose not to make those accesses, so for us the two
instructions behave identically assuming they don't UNDEF.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20201119215617.29887-21-peter.maydell@linaro.org
diff --git a/target/arm/m-nocp.decode b/target/arm/m-nocp.decode
index ccd62e8..6699626 100644
--- a/target/arm/m-nocp.decode
+++ b/target/arm/m-nocp.decode
@@ -36,7 +36,7 @@
 
 {
   # Special cases which do not take an early NOCP: VLLDM and VLSTM
-  VLLDM_VLSTM  1110 1100 001 l:1 rn:4 0000 1010 0000 0000
+  VLLDM_VLSTM  1110 1100 001 l:1 rn:4 0000 1010 op:1 000 0000
   # VSCCLRM (new in v8.1M) is similar:
   VSCCLRM      1110 1100 1.01 1111 .... 1011 imm:7 0   vd=%vd_dp size=3
   VSCCLRM      1110 1100 1.01 1111 .... 1010 imm:8     vd=%vd_sp size=2
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
index 808b407..0db9360 100644
--- a/target/arm/translate-vfp.c.inc
+++ b/target/arm/translate-vfp.c.inc
@@ -3721,6 +3721,31 @@
         !arm_dc_feature(s, ARM_FEATURE_V8)) {
         return false;
     }
+
+    if (a->op) {
+        /*
+         * T2 encoding ({D0-D31} reglist): v8.1M and up. We choose not
+         * to take the IMPDEF option to make memory accesses to the stack
+         * slots that correspond to the D16-D31 registers (discarding
+         * read data and writing UNKNOWN values), so for us the T2
+         * encoding behaves identically to the T1 encoding.
+         */
+        if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
+            return false;
+        }
+    } else {
+        /*
+         * T1 encoding ({D0-D15} reglist); undef if we have 32 Dregs.
+         * This is currently architecturally impossible, but we add the
+         * check to stay in line with the pseudocode. Note that we must
+         * emit code for the UNDEF so it takes precedence over the NOCP.
+         */
+        if (dc_isar_feature(aa32_simd_r32, s)) {
+            unallocated_encoding(s);
+            return true;
+        }
+    }
+
     /*
      * If not secure, UNDEF. We must emit code for this
      * rather than returning false so that this takes