Merge tag 'pull-target-arm-20240215' of https://git.linaro.org/people/pmaydell/qemu-arm into staging

target-arm queue:
 * hw/arm/xilinx_zynq: Wire FIQ between CPU <> GIC
 * linux-user/aarch64: Choose SYNC as the preferred MTE mode
 * Fix some errors in SVE/SME handling of MTE tags
 * hw/pci-host/raven.c: Mark raven_io_ops as implementing unaligned accesses
 * hw/block/tc58128: Don't emit deprecation warning under qtest
 * tests/qtest: Fix handling of npcm7xx and GMAC tests
 * hw/arm/virt: Wire up non-secure EL2 virtual timer IRQ
 * tests/qtest/npcm7xx_emc-test: Connect all NICs to a backend
 * Don't assert on vmload/vmsave of M-profile CPUs
 * hw/arm/smmuv3: add support for stage 1 access fault
 * hw/arm/stellaris: QOM cleanups
 * Use new CBAR encoding for all v8 CPUs, not all aarch64 CPUs
 * Improve Cortex_R52 IMPDEF sysreg modelling
 * Allow access to SPSR_hyp from hyp mode
 * New board model mps3-an536 (Cortex-R52)

# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmXOStQZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3vlDD/9HCq1B6e4hWQBlHrWG5nCK
# zBYRZ3GuuYCFB9FJt4EyXjUZ8ohDhLrpXvjhzSF6IqoPo0c0rQnfipeuj8Lu/hAV
# JrtHE4jQnE5Q9dwSz3A6oh0z5iIkFB/C1hW6fBDwvgAJUZ5xh5MBcxOvKq1s3WKv
# 3JngHC/KJrjgIKbcCV3Nd+OdyIZ7QZNXPwcBX9Zzt2eDkdEzOjcJYF4lisWdGav5
# JVXUeCXtClmFUZrxiGCLeTxb5X+TptxC+kAcPC7F5GjtVSy2800Z9sit2FTqd9Vd
# Y+rdA5IIBbZWPQ3OOHbaR69X4tWmc+BIT3nbQlESfV3odg0toQhe7aqn9UPIEU0K
# JRrzfodD7r7HK36lZm7ehmevLQnZgO6+MYL4Wrr0pUCNUxqVYlIyaqfsFSaknRg1
# 85L6agJlPYxtvrQtfhIV5m1V3IfyIiC7ECqMFe+QLdbR0ZxS3sI7sJ3O58xmcbDm
# SGVLl+xjAW3ZdgOb+k4B/BlPqasiJpuLe7So2e+cvDWN7OM0iJBxFAVz3yhJKGTP
# t9adJ1j0SI9XNrRuQkCX1T07Ciiuvr/mM4eY6YK+6TOq1zXks9st51ydbvEKdNW9
# YMFpWwUbYKKmUTEW06Xg5iNatse5kp4MUASF5BERkaGuyhRqLl/8p6jt6Q+9/D1S
# 5y8MFjUcdg8t4KcSJgdopw==
# =X+zR
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 15 Feb 2024 17:33:08 GMT
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# gpg:                 aka "Peter Maydell <peter@archaic.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* tag 'pull-target-arm-20240215' of https://git.linaro.org/people/pmaydell/qemu-arm: (35 commits)
  docs: Add documentation for the mps3-an536 board
  hw/arm/mps3r: Add remaining devices
  hw/arm/mps3r: Add GPIO, watchdog, dual-timer, I2C devices
  hw/arm/mps3r: Add UARTs
  hw/arm/mps3r: Add CPUs, GIC, and per-CPU RAM
  hw/arm/mps3r: Initial skeleton for mps3-an536 board
  hw/misc/mps2-scc: Make changes needed for AN536 FPGA image
  hw/misc/mps2-scc: Factor out which-board conditionals
  hw/misc/mps2-scc: Fix condition for CFG3 register
  target/arm: Allow access to SPSR_hyp from hyp mode
  target/arm: Add Cortex-R52 IMPDEF sysregs
  target/arm: The Cortex-R52 has a read-only CBAR
  target/arm: Use new CBAR encoding for all v8 CPUs, not all aarch64 CPUs
  hw/arm/stellaris: Add missing QOM 'SoC' parent
  hw/arm/stellaris: Add missing QOM 'machine' parent
  hw/arm/stellaris: Convert I2C controller to Resettable interface
  hw/arm/stellaris: Convert ADC controller to Resettable interface
  hw/arm/smmuv3: add support for stage 1 access fault
  tests/qtest: Fix GMAC test to run on a machine in upstream QEMU
  target/arm: Don't get MDCR_EL2 in pmu_counter_enabled() before checking ARM_FEATURE_PMU
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
diff --git a/MAINTAINERS b/MAINTAINERS
index a24c2b5..48be2ff 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -819,12 +819,13 @@
 F: hw/pci-host/designware.c
 F: include/hw/pci-host/designware.h
 
-MPS2
+MPS2 / MPS3
 M: Peter Maydell <peter.maydell@linaro.org>
 L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/arm/mps2.c
 F: hw/arm/mps2-tz.c
+F: hw/arm/mps3r.c
 F: hw/misc/mps2-*.c
 F: include/hw/misc/mps2-*.h
 F: hw/arm/armsse.c
diff --git a/configs/devices/arm-softmmu/default.mak b/configs/devices/arm-softmmu/default.mak
index 023faa2..6ee31bc 100644
--- a/configs/devices/arm-softmmu/default.mak
+++ b/configs/devices/arm-softmmu/default.mak
@@ -13,6 +13,7 @@
 # CONFIG_INTEGRATOR=n
 # CONFIG_FSL_IMX31=n
 # CONFIG_MUSICPAL=n
+# CONFIG_MPS3R=n
 # CONFIG_MUSCA=n
 # CONFIG_CHEETAH=n
 # CONFIG_SX1=n
diff --git a/docs/system/arm/mps2.rst b/docs/system/arm/mps2.rst
index 8a75beb..a305935 100644
--- a/docs/system/arm/mps2.rst
+++ b/docs/system/arm/mps2.rst
@@ -1,7 +1,7 @@
-Arm MPS2 and MPS3 boards (``mps2-an385``, ``mps2-an386``, ``mps2-an500``, ``mps2-an505``, ``mps2-an511``, ``mps2-an521``, ``mps3-an524``, ``mps3-an547``)
-=========================================================================================================================================================
+Arm MPS2 and MPS3 boards (``mps2-an385``, ``mps2-an386``, ``mps2-an500``, ``mps2-an505``, ``mps2-an511``, ``mps2-an521``, ``mps3-an524``, ``mps3-an536``, ``mps3-an547``)
+=========================================================================================================================================================================
 
-These board models all use Arm M-profile CPUs.
+These board models use Arm M-profile or R-profile CPUs.
 
 The Arm MPS2, MPS2+ and MPS3 dev boards are FPGA based (the 2+ has a
 bigger FPGA but is otherwise the same as the 2; the 3 has a bigger
@@ -13,6 +13,8 @@
 
 QEMU models the following FPGA images:
 
+FPGA images using M-profile CPUs:
+
 ``mps2-an385``
   Cortex-M3 as documented in Arm Application Note AN385
 ``mps2-an386``
@@ -30,6 +32,11 @@
 ``mps3-an547``
   Cortex-M55 on an MPS3, as documented in Arm Application Note AN547
 
+FPGA images using R-profile CPUs:
+
+``mps3-an536``
+  Dual Cortex-R52 on an MPS3, as documented in Arm Application Note AN536
+
 Differences between QEMU and real hardware:
 
 - AN385/AN386 remapping of low 16K of memory to either ZBT SSRAM1 or to
@@ -45,6 +52,30 @@
   flash, but only as simple ROM, so attempting to rewrite the flash
   from the guest will fail
 - QEMU does not model the USB controller in MPS3 boards
+- AN536 does not support runtime control of CPU reset and halt via
+  the SCC CFG_REG0 register.
+- AN536 does not support enabling or disabling the flash and ATCM
+  interfaces via the SCC CFG_REG1 register.
+- AN536 does not support setting of the initial vector table
+  base address via the SCC CFG_REG6 and CFG_REG7 register config,
+  and does not provide a mechanism for specifying these values at
+  startup, so all guest images must be built to start from TCM
+  (i.e. to expect the interrupt vector base at 0 from reset).
+- AN536 defaults to only creating a single CPU; this is the equivalent
+  of the way the real FPGA image usually runs with the second Cortex-R52
+  held in halt via the initial SCC CFG_REG0 register setting. You can
+  create the second CPU with ``-smp 2``; both CPUs will then start
+  execution immediately on startup.
+
+Note that for the AN536 the first UART is accessible only by
+CPU0, and the second UART is accessible only by CPU1. The
+first UART accessible shared between both CPUs is the third
+UART. Guest software might therefore be built to use either
+the first UART or the third UART; if you don't see any output
+from the UART you are looking at, try one of the others.
+(Even if the AN536 machine is started with a single CPU and so
+no "CPU1-only UART", the UART numbering remains the same,
+with the third UART being the first of the shared ones.)
 
 Machine-specific options
 """"""""""""""""""""""""
diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 980b14d..29abe1d 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -106,6 +106,11 @@
     select PFLASH_CFI01
     select SMC91C111
 
+config MPS3R
+    bool
+    default y
+    depends on TCG && ARM
+
 config MUSCA
     bool
     default y
diff --git a/hw/arm/meson.build b/hw/arm/meson.build
index c401779..a16d347 100644
--- a/hw/arm/meson.build
+++ b/hw/arm/meson.build
@@ -8,6 +8,7 @@
 arm_ss.add(when: 'CONFIG_INTEGRATOR', if_true: files('integratorcp.c'))
 arm_ss.add(when: 'CONFIG_MAINSTONE', if_true: files('mainstone.c'))
 arm_ss.add(when: 'CONFIG_MICROBIT', if_true: files('microbit.c'))
+arm_ss.add(when: 'CONFIG_MPS3R', if_true: files('mps3r.c'))
 arm_ss.add(when: 'CONFIG_MUSICPAL', if_true: files('musicpal.c'))
 arm_ss.add(when: 'CONFIG_NETDUINOPLUS2', if_true: files('netduinoplus2.c'))
 arm_ss.add(when: 'CONFIG_OLIMEX_STM32_H405', if_true: files('olimex-stm32-h405.c'))
diff --git a/hw/arm/mps3r.c b/hw/arm/mps3r.c
new file mode 100644
index 0000000..4d55a65
--- /dev/null
+++ b/hw/arm/mps3r.c
@@ -0,0 +1,640 @@
+/*
+ * Arm MPS3 board emulation for Cortex-R-based FPGA images.
+ * (For M-profile images see mps2.c and mps2tz.c.)
+ *
+ * Copyright (c) 2017 Linaro Limited
+ * Written by Peter Maydell
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License version 2 or
+ *  (at your option) any later version.
+ */
+
+/*
+ * The MPS3 is an FPGA based dev board. This file handles FPGA images
+ * which use the Cortex-R CPUs. We model these separately from the
+ * M-profile images, because on M-profile the FPGA image is based on
+ * a "Subsystem for Embedded" which is similar to an SoC, whereas
+ * the R-profile FPGA images don't have that abstraction layer.
+ *
+ * We model the following FPGA images here:
+ *  "mps3-an536" -- dual Cortex-R52 as documented in Arm Application Note AN536
+ *
+ * Application Note AN536:
+ * https://developer.arm.com/documentation/dai0536/latest/
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qapi/error.h"
+#include "qapi/qmp/qlist.h"
+#include "exec/address-spaces.h"
+#include "cpu.h"
+#include "sysemu/sysemu.h"
+#include "hw/boards.h"
+#include "hw/or-irq.h"
+#include "hw/qdev-clock.h"
+#include "hw/qdev-properties.h"
+#include "hw/arm/boot.h"
+#include "hw/arm/bsa.h"
+#include "hw/char/cmsdk-apb-uart.h"
+#include "hw/i2c/arm_sbcon_i2c.h"
+#include "hw/intc/arm_gicv3.h"
+#include "hw/misc/mps2-scc.h"
+#include "hw/misc/mps2-fpgaio.h"
+#include "hw/misc/unimp.h"
+#include "hw/net/lan9118.h"
+#include "hw/rtc/pl031.h"
+#include "hw/ssi/pl022.h"
+#include "hw/timer/cmsdk-apb-dualtimer.h"
+#include "hw/watchdog/cmsdk-apb-watchdog.h"
+
+/* Define the layout of RAM and ROM in a board */
+typedef struct RAMInfo {
+    const char *name;
+    hwaddr base;
+    hwaddr size;
+    int mrindex; /* index into rams[]; -1 for the system RAM block */
+    int flags;
+} RAMInfo;
+
+/*
+ * The MPS3 DDR is 3GiB, but on a 32-bit host QEMU doesn't permit
+ * emulation of that much guest RAM, so artificially make it smaller.
+ */
+#if HOST_LONG_BITS == 32
+#define MPS3_DDR_SIZE (1 * GiB)
+#else
+#define MPS3_DDR_SIZE (3 * GiB)
+#endif
+
+/*
+ * Flag values:
+ * IS_MAIN: this is the main machine RAM
+ * IS_ROM: this area is read-only
+ */
+#define IS_MAIN 1
+#define IS_ROM 2
+
+#define MPS3R_RAM_MAX 9
+#define MPS3R_CPU_MAX 2
+#define MPS3R_UART_MAX 4 /* shared UART count */
+
+#define PERIPHBASE 0xf0000000
+#define NUM_SPIS 96
+
+typedef enum MPS3RFPGAType {
+    FPGA_AN536,
+} MPS3RFPGAType;
+
+struct MPS3RMachineClass {
+    MachineClass parent;
+    MPS3RFPGAType fpga_type;
+    const RAMInfo *raminfo;
+    hwaddr loader_start;
+};
+
+struct MPS3RMachineState {
+    MachineState parent;
+    struct arm_boot_info bootinfo;
+    MemoryRegion ram[MPS3R_RAM_MAX];
+    Object *cpu[MPS3R_CPU_MAX];
+    MemoryRegion cpu_sysmem[MPS3R_CPU_MAX];
+    MemoryRegion sysmem_alias[MPS3R_CPU_MAX];
+    MemoryRegion cpu_ram[MPS3R_CPU_MAX];
+    GICv3State gic;
+    /* per-CPU UARTs followed by the shared UARTs */
+    CMSDKAPBUART uart[MPS3R_CPU_MAX + MPS3R_UART_MAX];
+    OrIRQState cpu_uart_oflow[MPS3R_CPU_MAX];
+    OrIRQState uart_oflow;
+    CMSDKAPBWatchdog watchdog;
+    CMSDKAPBDualTimer dualtimer;
+    ArmSbconI2CState i2c[5];
+    PL022State spi[3];
+    MPS2SCC scc;
+    MPS2FPGAIO fpgaio;
+    UnimplementedDeviceState i2s_audio;
+    PL031State rtc;
+    Clock *clk;
+};
+
+#define TYPE_MPS3R_MACHINE "mps3r"
+#define TYPE_MPS3R_AN536_MACHINE MACHINE_TYPE_NAME("mps3-an536")
+
+OBJECT_DECLARE_TYPE(MPS3RMachineState, MPS3RMachineClass, MPS3R_MACHINE)
+
+/*
+ * Main clock frequency CLK in Hz (50MHz). In the image there are also
+ * ACLK, MCLK, GPUCLK and PERIPHCLK at the same frequency; for our
+ * model we just roll them all into one.
+ */
+#define CLK_FRQ 50000000
+
+static const RAMInfo an536_raminfo[] = {
+    {
+        .name = "ATCM",
+        .base = 0x00000000,
+        .size = 0x00008000,
+        .mrindex = 0,
+    }, {
+        /* We model the QSPI flash as simple ROM for now */
+        .name = "QSPI",
+        .base = 0x08000000,
+        .size = 0x00800000,
+        .flags = IS_ROM,
+        .mrindex = 1,
+    }, {
+        .name = "BRAM",
+        .base = 0x10000000,
+        .size = 0x00080000,
+        .mrindex = 2,
+    }, {
+        .name = "DDR",
+        .base = 0x20000000,
+        .size = MPS3_DDR_SIZE,
+        .mrindex = -1,
+    }, {
+        .name = "ATCM0",
+        .base = 0xee000000,
+        .size = 0x00008000,
+        .mrindex = 3,
+    }, {
+        .name = "BTCM0",
+        .base = 0xee100000,
+        .size = 0x00008000,
+        .mrindex = 4,
+    }, {
+        .name = "CTCM0",
+        .base = 0xee200000,
+        .size = 0x00008000,
+        .mrindex = 5,
+    }, {
+        .name = "ATCM1",
+        .base = 0xee400000,
+        .size = 0x00008000,
+        .mrindex = 6,
+    }, {
+        .name = "BTCM1",
+        .base = 0xee500000,
+        .size = 0x00008000,
+        .mrindex = 7,
+    }, {
+        .name = "CTCM1",
+        .base = 0xee600000,
+        .size = 0x00008000,
+        .mrindex = 8,
+    }, {
+        .name = NULL,
+    }
+};
+
+static const int an536_oscclk[] = {
+    24000000, /* 24MHz reference for RTC and timers */
+    50000000, /* 50MHz ACLK */
+    50000000, /* 50MHz MCLK */
+    50000000, /* 50MHz GPUCLK */
+    24576000, /* 24.576MHz AUDCLK */
+    23750000, /* 23.75MHz HDLCDCLK */
+    100000000, /* 100MHz DDR4_REF_CLK */
+};
+
+static MemoryRegion *mr_for_raminfo(MPS3RMachineState *mms,
+                                    const RAMInfo *raminfo)
+{
+    /* Return an initialized MemoryRegion for the RAMInfo. */
+    MemoryRegion *ram;
+
+    if (raminfo->mrindex < 0) {
+        /* Means this RAMInfo is for QEMU's "system memory" */
+        MachineState *machine = MACHINE(mms);
+        assert(!(raminfo->flags & IS_ROM));
+        return machine->ram;
+    }
+
+    assert(raminfo->mrindex < MPS3R_RAM_MAX);
+    ram = &mms->ram[raminfo->mrindex];
+
+    memory_region_init_ram(ram, NULL, raminfo->name,
+                           raminfo->size, &error_fatal);
+    if (raminfo->flags & IS_ROM) {
+        memory_region_set_readonly(ram, true);
+    }
+    return ram;
+}
+
+/*
+ * There is no defined secondary boot protocol for Linux for the AN536,
+ * because real hardware has a restriction that atomic operations between
+ * the two CPUs do not function correctly, and so true SMP is not
+ * possible. Therefore for cases where the user is directly booting
+ * a kernel, we treat the system as essentially uniprocessor, and
+ * put the secondary CPU into power-off state (as if the user on the
+ * real hardware had configured the secondary to be halted via the
+ * SCC config registers).
+ *
+ * Note that the default secondary boot code would not work here anyway
+ * as it assumes a GICv2, and we have a GICv3.
+ */
+static void mps3r_write_secondary_boot(ARMCPU *cpu,
+                                       const struct arm_boot_info *info)
+{
+    /*
+     * Power the secondary CPU off. This means we don't need to write any
+     * boot code into guest memory. Note that the 'cpu' argument to this
+     * function is the primary CPU we passed to arm_load_kernel(), not
+     * the secondary. Loop around all the other CPUs, as the boot.c
+     * code does for the "disable secondaries if PSCI is enabled" case.
+     */
+    for (CPUState *cs = first_cpu; cs; cs = CPU_NEXT(cs)) {
+        if (cs != first_cpu) {
+            object_property_set_bool(OBJECT(cs), "start-powered-off", true,
+                                     &error_abort);
+        }
+    }
+}
+
+static void mps3r_secondary_cpu_reset(ARMCPU *cpu,
+                                      const struct arm_boot_info *info)
+{
+    /* We don't need to do anything here because the CPU will be off */
+}
+
+static void create_gic(MPS3RMachineState *mms, MemoryRegion *sysmem)
+{
+    MachineState *machine = MACHINE(mms);
+    DeviceState *gicdev;
+    QList *redist_region_count;
+
+    object_initialize_child(OBJECT(mms), "gic", &mms->gic, TYPE_ARM_GICV3);
+    gicdev = DEVICE(&mms->gic);
+    qdev_prop_set_uint32(gicdev, "num-cpu", machine->smp.cpus);
+    qdev_prop_set_uint32(gicdev, "num-irq", NUM_SPIS + GIC_INTERNAL);
+    redist_region_count = qlist_new();
+    qlist_append_int(redist_region_count, machine->smp.cpus);
+    qdev_prop_set_array(gicdev, "redist-region-count", redist_region_count);
+    object_property_set_link(OBJECT(&mms->gic), "sysmem",
+                             OBJECT(sysmem), &error_fatal);
+    sysbus_realize(SYS_BUS_DEVICE(&mms->gic), &error_fatal);
+    sysbus_mmio_map(SYS_BUS_DEVICE(&mms->gic), 0, PERIPHBASE);
+    sysbus_mmio_map(SYS_BUS_DEVICE(&mms->gic), 1, PERIPHBASE + 0x100000);
+    /*
+     * Wire the outputs from each CPU's generic timer and the GICv3
+     * maintenance interrupt signal to the appropriate GIC PPI inputs,
+     * and the GIC's IRQ/FIQ/VIRQ/VFIQ interrupt outputs to the CPU's inputs.
+     */
+    for (int i = 0; i < machine->smp.cpus; i++) {
+        DeviceState *cpudev = DEVICE(mms->cpu[i]);
+        SysBusDevice *gicsbd = SYS_BUS_DEVICE(&mms->gic);
+        int intidbase = NUM_SPIS + i * GIC_INTERNAL;
+        int irq;
+        /*
+         * Mapping from the output timer irq lines from the CPU to the
+         * GIC PPI inputs used for this board. This isn't a BSA board,
+         * but it uses the standard convention for the PPI numbers.
+         */
+        const int timer_irq[] = {
+            [GTIMER_PHYS] = ARCH_TIMER_NS_EL1_IRQ,
+            [GTIMER_VIRT] = ARCH_TIMER_VIRT_IRQ,
+            [GTIMER_HYP]  = ARCH_TIMER_NS_EL2_IRQ,
+        };
+
+        for (irq = 0; irq < ARRAY_SIZE(timer_irq); irq++) {
+            qdev_connect_gpio_out(cpudev, irq,
+                                  qdev_get_gpio_in(gicdev,
+                                                   intidbase + timer_irq[irq]));
+        }
+
+        qdev_connect_gpio_out_named(cpudev, "gicv3-maintenance-interrupt", 0,
+                                    qdev_get_gpio_in(gicdev,
+                                                     intidbase + ARCH_GIC_MAINT_IRQ));
+
+        qdev_connect_gpio_out_named(cpudev, "pmu-interrupt", 0,
+                                    qdev_get_gpio_in(gicdev,
+                                                     intidbase + VIRTUAL_PMU_IRQ));
+
+        sysbus_connect_irq(gicsbd, i,
+                           qdev_get_gpio_in(cpudev, ARM_CPU_IRQ));
+        sysbus_connect_irq(gicsbd, i + machine->smp.cpus,
+                           qdev_get_gpio_in(cpudev, ARM_CPU_FIQ));
+        sysbus_connect_irq(gicsbd, i + 2 * machine->smp.cpus,
+                           qdev_get_gpio_in(cpudev, ARM_CPU_VIRQ));
+        sysbus_connect_irq(gicsbd, i + 3 * machine->smp.cpus,
+                           qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
+    }
+}
+
+/*
+ * Create UART uartno, and map it into the MemoryRegion mem at address baseaddr.
+ * The qemu_irq arguments are where we connect the various IRQs from the UART.
+ */
+static void create_uart(MPS3RMachineState *mms, int uartno, MemoryRegion *mem,
+                        hwaddr baseaddr, qemu_irq txirq, qemu_irq rxirq,
+                        qemu_irq txoverirq, qemu_irq rxoverirq,
+                        qemu_irq combirq)
+{
+    g_autofree char *s = g_strdup_printf("uart%d", uartno);
+    SysBusDevice *sbd;
+
+    assert(uartno < ARRAY_SIZE(mms->uart));
+    object_initialize_child(OBJECT(mms), s, &mms->uart[uartno],
+                            TYPE_CMSDK_APB_UART);
+    qdev_prop_set_uint32(DEVICE(&mms->uart[uartno]), "pclk-frq", CLK_FRQ);
+    qdev_prop_set_chr(DEVICE(&mms->uart[uartno]), "chardev", serial_hd(uartno));
+    sbd = SYS_BUS_DEVICE(&mms->uart[uartno]);
+    sysbus_realize(sbd, &error_fatal);
+    memory_region_add_subregion(mem, baseaddr,
+                                sysbus_mmio_get_region(sbd, 0));
+    sysbus_connect_irq(sbd, 0, txirq);
+    sysbus_connect_irq(sbd, 1, rxirq);
+    sysbus_connect_irq(sbd, 2, txoverirq);
+    sysbus_connect_irq(sbd, 3, rxoverirq);
+    sysbus_connect_irq(sbd, 4, combirq);
+}
+
+static void mps3r_common_init(MachineState *machine)
+{
+    MPS3RMachineState *mms = MPS3R_MACHINE(machine);
+    MPS3RMachineClass *mmc = MPS3R_MACHINE_GET_CLASS(mms);
+    MemoryRegion *sysmem = get_system_memory();
+    DeviceState *gicdev;
+    QList *oscclk;
+
+    mms->clk = clock_new(OBJECT(machine), "CLK");
+    clock_set_hz(mms->clk, CLK_FRQ);
+
+    for (const RAMInfo *ri = mmc->raminfo; ri->name; ri++) {
+        MemoryRegion *mr = mr_for_raminfo(mms, ri);
+        memory_region_add_subregion(sysmem, ri->base, mr);
+    }
+
+    assert(machine->smp.cpus <= MPS3R_CPU_MAX);
+    for (int i = 0; i < machine->smp.cpus; i++) {
+        g_autofree char *sysmem_name = g_strdup_printf("cpu-%d-memory", i);
+        g_autofree char *ramname = g_strdup_printf("cpu-%d-memory", i);
+        g_autofree char *alias_name = g_strdup_printf("sysmem-alias-%d", i);
+
+        /*
+         * Each CPU has some private RAM/peripherals, so create the container
+         * which will house those, with the whole-machine system memory being
+         * used where there's no CPU-specific device. Note that we need the
+         * sysmem_alias aliases because we can't put one MR (the original
+         * 'sysmem') into more than one other MR.
+         */
+        memory_region_init(&mms->cpu_sysmem[i], OBJECT(machine),
+                           sysmem_name, UINT64_MAX);
+        memory_region_init_alias(&mms->sysmem_alias[i], OBJECT(machine),
+                                 alias_name, sysmem, 0, UINT64_MAX);
+        memory_region_add_subregion_overlap(&mms->cpu_sysmem[i], 0,
+                                            &mms->sysmem_alias[i], -1);
+
+        mms->cpu[i] = object_new(machine->cpu_type);
+        object_property_set_link(mms->cpu[i], "memory",
+                                 OBJECT(&mms->cpu_sysmem[i]), &error_abort);
+        object_property_set_int(mms->cpu[i], "reset-cbar",
+                                PERIPHBASE, &error_abort);
+        qdev_realize(DEVICE(mms->cpu[i]), NULL, &error_fatal);
+        object_unref(mms->cpu[i]);
+
+        /* Per-CPU RAM */
+        memory_region_init_ram(&mms->cpu_ram[i], NULL, ramname,
+                               0x1000, &error_fatal);
+        memory_region_add_subregion(&mms->cpu_sysmem[i], 0xe7c01000,
+                                    &mms->cpu_ram[i]);
+    }
+
+    create_gic(mms, sysmem);
+    gicdev = DEVICE(&mms->gic);
+
+    /*
+     * UARTs 0 and 1 are per-CPU; their interrupts are wired to
+     * the relevant CPU's PPI 0..3, aka INTID 16..19
+     */
+    for (int i = 0; i < machine->smp.cpus; i++) {
+        int intidbase = NUM_SPIS + i * GIC_INTERNAL;
+        g_autofree char *s = g_strdup_printf("cpu-uart-oflow-orgate%d", i);
+        DeviceState *orgate;
+
+        /* The two overflow IRQs from the UART are ORed together into PPI 3 */
+        object_initialize_child(OBJECT(mms), s, &mms->cpu_uart_oflow[i],
+                                TYPE_OR_IRQ);
+        orgate = DEVICE(&mms->cpu_uart_oflow[i]);
+        qdev_prop_set_uint32(orgate, "num-lines", 2);
+        qdev_realize(orgate, NULL, &error_fatal);
+        qdev_connect_gpio_out(orgate, 0,
+                              qdev_get_gpio_in(gicdev, intidbase + 19));
+
+        create_uart(mms, i, &mms->cpu_sysmem[i], 0xe7c00000,
+                    qdev_get_gpio_in(gicdev, intidbase + 17), /* tx */
+                    qdev_get_gpio_in(gicdev, intidbase + 16), /* rx */
+                    qdev_get_gpio_in(orgate, 0), /* txover */
+                    qdev_get_gpio_in(orgate, 1), /* rxover */
+                    qdev_get_gpio_in(gicdev, intidbase + 18) /* combined */);
+    }
+    /*
+     * UARTs 2 to 5 are whole-system; all overflow IRQs are ORed
+     * together into IRQ 17
+     */
+    object_initialize_child(OBJECT(mms), "uart-oflow-orgate",
+                            &mms->uart_oflow, TYPE_OR_IRQ);
+    qdev_prop_set_uint32(DEVICE(&mms->uart_oflow), "num-lines",
+                         MPS3R_UART_MAX * 2);
+    qdev_realize(DEVICE(&mms->uart_oflow), NULL, &error_fatal);
+    qdev_connect_gpio_out(DEVICE(&mms->uart_oflow), 0,
+                          qdev_get_gpio_in(gicdev, 17));
+
+    for (int i = 0; i < MPS3R_UART_MAX; i++) {
+        hwaddr baseaddr = 0xe0205000 + i * 0x1000;
+        int rxirq = 5 + i * 2, txirq = 6 + i * 2, combirq = 13 + i;
+
+        create_uart(mms, i + MPS3R_CPU_MAX, sysmem, baseaddr,
+                    qdev_get_gpio_in(gicdev, txirq),
+                    qdev_get_gpio_in(gicdev, rxirq),
+                    qdev_get_gpio_in(DEVICE(&mms->uart_oflow), i * 2),
+                    qdev_get_gpio_in(DEVICE(&mms->uart_oflow), i * 2 + 1),
+                    qdev_get_gpio_in(gicdev, combirq));
+    }
+
+    for (int i = 0; i < 4; i++) {
+        /* CMSDK GPIO controllers */
+        g_autofree char *s = g_strdup_printf("gpio%d", i);
+        create_unimplemented_device(s, 0xe0000000 + i * 0x1000, 0x1000);
+    }
+
+    object_initialize_child(OBJECT(mms), "watchdog", &mms->watchdog,
+                            TYPE_CMSDK_APB_WATCHDOG);
+    qdev_connect_clock_in(DEVICE(&mms->watchdog), "WDOGCLK", mms->clk);
+    sysbus_realize(SYS_BUS_DEVICE(&mms->watchdog), &error_fatal);
+    sysbus_connect_irq(SYS_BUS_DEVICE(&mms->watchdog), 0,
+                       qdev_get_gpio_in(gicdev, 0));
+    sysbus_mmio_map(SYS_BUS_DEVICE(&mms->watchdog), 0, 0xe0100000);
+
+    object_initialize_child(OBJECT(mms), "dualtimer", &mms->dualtimer,
+                            TYPE_CMSDK_APB_DUALTIMER);
+    qdev_connect_clock_in(DEVICE(&mms->dualtimer), "TIMCLK", mms->clk);
+    sysbus_realize(SYS_BUS_DEVICE(&mms->dualtimer), &error_fatal);
+    sysbus_connect_irq(SYS_BUS_DEVICE(&mms->dualtimer), 0,
+                       qdev_get_gpio_in(gicdev, 3));
+    sysbus_connect_irq(SYS_BUS_DEVICE(&mms->dualtimer), 1,
+                       qdev_get_gpio_in(gicdev, 1));
+    sysbus_connect_irq(SYS_BUS_DEVICE(&mms->dualtimer), 2,
+                       qdev_get_gpio_in(gicdev, 2));
+    sysbus_mmio_map(SYS_BUS_DEVICE(&mms->dualtimer), 0, 0xe0101000);
+
+    for (int i = 0; i < ARRAY_SIZE(mms->i2c); i++) {
+        static const hwaddr i2cbase[] = {0xe0102000,    /* Touch */
+                                         0xe0103000,    /* Audio */
+                                         0xe0107000,    /* Shield0 */
+                                         0xe0108000,    /* Shield1 */
+                                         0xe0109000};   /* DDR4 EEPROM */
+        g_autofree char *s = g_strdup_printf("i2c%d", i);
+
+        object_initialize_child(OBJECT(mms), s, &mms->i2c[i],
+                                TYPE_ARM_SBCON_I2C);
+        sysbus_realize(SYS_BUS_DEVICE(&mms->i2c[i]), &error_fatal);
+        sysbus_mmio_map(SYS_BUS_DEVICE(&mms->i2c[i]), 0, i2cbase[i]);
+        if (i != 2 && i != 3) {
+            /*
+             * internal-only bus: mark it full to avoid user-created
+             * i2c devices being plugged into it.
+             */
+            qbus_mark_full(qdev_get_child_bus(DEVICE(&mms->i2c[i]), "i2c"));
+        }
+    }
+
+    for (int i = 0; i < ARRAY_SIZE(mms->spi); i++) {
+        g_autofree char *s = g_strdup_printf("spi%d", i);
+        hwaddr baseaddr = 0xe0104000 + i * 0x1000;
+
+        object_initialize_child(OBJECT(mms), s, &mms->spi[i], TYPE_PL022);
+        sysbus_realize(SYS_BUS_DEVICE(&mms->spi[i]), &error_fatal);
+        sysbus_mmio_map(SYS_BUS_DEVICE(&mms->spi[i]), 0, baseaddr);
+        sysbus_connect_irq(SYS_BUS_DEVICE(&mms->spi[i]), 0,
+                           qdev_get_gpio_in(gicdev, 22 + i));
+    }
+
+    object_initialize_child(OBJECT(mms), "scc", &mms->scc, TYPE_MPS2_SCC);
+    qdev_prop_set_uint32(DEVICE(&mms->scc), "scc-cfg0", 0);
+    qdev_prop_set_uint32(DEVICE(&mms->scc), "scc-cfg4", 0x2);
+    qdev_prop_set_uint32(DEVICE(&mms->scc), "scc-aid", 0x00200008);
+    qdev_prop_set_uint32(DEVICE(&mms->scc), "scc-id", 0x41055360);
+    oscclk = qlist_new();
+    for (int i = 0; i < ARRAY_SIZE(an536_oscclk); i++) {
+        qlist_append_int(oscclk, an536_oscclk[i]);
+    }
+    qdev_prop_set_array(DEVICE(&mms->scc), "oscclk", oscclk);
+    sysbus_realize(SYS_BUS_DEVICE(&mms->scc), &error_fatal);
+    sysbus_mmio_map(SYS_BUS_DEVICE(&mms->scc), 0, 0xe0200000);
+
+    create_unimplemented_device("i2s-audio", 0xe0201000, 0x1000);
+
+    object_initialize_child(OBJECT(mms), "fpgaio", &mms->fpgaio,
+                            TYPE_MPS2_FPGAIO);
+    qdev_prop_set_uint32(DEVICE(&mms->fpgaio), "prescale-clk", an536_oscclk[1]);
+    qdev_prop_set_uint32(DEVICE(&mms->fpgaio), "num-leds", 10);
+    qdev_prop_set_bit(DEVICE(&mms->fpgaio), "has-switches", true);
+    qdev_prop_set_bit(DEVICE(&mms->fpgaio), "has-dbgctrl", false);
+    sysbus_realize(SYS_BUS_DEVICE(&mms->fpgaio), &error_fatal);
+    sysbus_mmio_map(SYS_BUS_DEVICE(&mms->fpgaio), 0, 0xe0202000);
+
+    create_unimplemented_device("clcd", 0xe0209000, 0x1000);
+
+    object_initialize_child(OBJECT(mms), "rtc", &mms->rtc, TYPE_PL031);
+    sysbus_realize(SYS_BUS_DEVICE(&mms->rtc), &error_fatal);
+    sysbus_mmio_map(SYS_BUS_DEVICE(&mms->rtc), 0, 0xe020a000);
+    sysbus_connect_irq(SYS_BUS_DEVICE(&mms->rtc), 0,
+                       qdev_get_gpio_in(gicdev, 4));
+
+    /*
+     * In hardware this is a LAN9220; the LAN9118 is software compatible
+     * except that it doesn't support the checksum-offload feature.
+     */
+    lan9118_init(0xe0300000,
+                 qdev_get_gpio_in(gicdev, 18));
+
+    create_unimplemented_device("usb", 0xe0301000, 0x1000);
+    create_unimplemented_device("qspi-write-config", 0xe0600000, 0x1000);
+
+    mms->bootinfo.ram_size = machine->ram_size;
+    mms->bootinfo.board_id = -1;
+    mms->bootinfo.loader_start = mmc->loader_start;
+    mms->bootinfo.write_secondary_boot = mps3r_write_secondary_boot;
+    mms->bootinfo.secondary_cpu_reset_hook = mps3r_secondary_cpu_reset;
+    arm_load_kernel(ARM_CPU(mms->cpu[0]), machine, &mms->bootinfo);
+}
+
+static void mps3r_set_default_ram_info(MPS3RMachineClass *mmc)
+{
+    /*
+     * Set mc->default_ram_size and default_ram_id from the
+     * information in mmc->raminfo.
+     */
+    MachineClass *mc = MACHINE_CLASS(mmc);
+    const RAMInfo *p;
+
+    for (p = mmc->raminfo; p->name; p++) {
+        if (p->mrindex < 0) {
+            /* Found the entry for "system memory" */
+            mc->default_ram_size = p->size;
+            mc->default_ram_id = p->name;
+            mmc->loader_start = p->base;
+            return;
+        }
+    }
+    g_assert_not_reached();
+}
+
+static void mps3r_class_init(ObjectClass *oc, void *data)
+{
+    MachineClass *mc = MACHINE_CLASS(oc);
+
+    mc->init = mps3r_common_init;
+}
+
+static void mps3r_an536_class_init(ObjectClass *oc, void *data)
+{
+    MachineClass *mc = MACHINE_CLASS(oc);
+    MPS3RMachineClass *mmc = MPS3R_MACHINE_CLASS(oc);
+    static const char * const valid_cpu_types[] = {
+        ARM_CPU_TYPE_NAME("cortex-r52"),
+        NULL
+    };
+
+    mc->desc = "ARM MPS3 with AN536 FPGA image for Cortex-R52";
+    /*
+     * In the real FPGA image there are always two cores, but the standard
+     * initial setting for the SCC SYSCON 0x000 register is 0x21, meaning
+     * that the second core is held in reset and halted. Many images built for
+     * the board do not expect the second core to run at startup (especially
+     * since on the real FPGA image it is not possible to use LDREX/STREX
+     * in RAM between the two cores, so a true SMP setup isn't supported).
+     *
+     * As QEMU's equivalent of this, we support both -smp 1 and -smp 2,
+     * with the default being -smp 1. This seems a more intuitive UI for
+     * QEMU users than, for instance, having a machine property to allow
+     * the user to set the initial value of the SYSCON 0x000 register.
+     */
+    mc->default_cpus = 1;
+    mc->min_cpus = 1;
+    mc->max_cpus = 2;
+    mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-r52");
+    mc->valid_cpu_types = valid_cpu_types;
+    mmc->raminfo = an536_raminfo;
+    mps3r_set_default_ram_info(mmc);
+}
+
+static const TypeInfo mps3r_machine_types[] = {
+    {
+        .name = TYPE_MPS3R_MACHINE,
+        .parent = TYPE_MACHINE,
+        .abstract = true,
+        .instance_size = sizeof(MPS3RMachineState),
+        .class_size = sizeof(MPS3RMachineClass),
+        .class_init = mps3r_class_init,
+    }, {
+        .name = TYPE_MPS3R_AN536_MACHINE,
+        .parent = TYPE_MPS3R_MACHINE,
+        .class_init = mps3r_an536_class_init,
+    },
+};
+
+DEFINE_TYPES(mps3r_machine_types);
diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c
index ff3ecde..cc68b5d 100644
--- a/hw/arm/npcm7xx.c
+++ b/hw/arm/npcm7xx.c
@@ -710,6 +710,7 @@
     for (i = 0; i < ARRAY_SIZE(s->gmac); i++) {
         SysBusDevice *sbd = SYS_BUS_DEVICE(&s->gmac[i]);
 
+        qemu_configure_nic_device(DEVICE(sbd), false, NULL);
         /*
          * The device exists regardless of whether it's connected to a QEMU
          * netdev backend. So always instantiate it even if there is no
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index f58261b..4caedb4 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -364,6 +364,17 @@
                                      pte_addr, pte, iova, gpa,
                                      block_size >> 20);
         }
+
+        /*
+         * QEMU does not currently implement HTTU, so if AFFD and PTE.AF
+         * are 0 we take an Access flag fault. (5.4. Context Descriptor)
+         * An Access flag fault takes priority over a Permission fault.
+         */
+        if (!PTE_AF(pte) && !cfg->affd) {
+            info->type = SMMU_PTW_ERR_ACCESS;
+            goto error;
+        }
+
         ap = PTE_AP(pte);
         if (is_permission_fault(ap, perm)) {
             info->type = SMMU_PTW_ERR_PERMISSION;
diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index e987bc4..e4dd11e 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -624,6 +624,7 @@
 #define CD_EPD(x, sel)   extract32((x)->word[0], (16 * (sel)) + 14, 1)
 #define CD_ENDI(x)       extract32((x)->word[0], 15, 1)
 #define CD_IPS(x)        extract32((x)->word[1], 0 , 3)
+#define CD_AFFD(x)       extract32((x)->word[1], 3 , 1)
 #define CD_TBI(x)        extract32((x)->word[1], 6 , 2)
 #define CD_HD(x)         extract32((x)->word[1], 10 , 1)
 #define CD_HA(x)         extract32((x)->word[1], 11 , 1)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index b3d8642..9eb56a7 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -684,6 +684,7 @@
     cfg->oas = MIN(oas2bits(SMMU_IDR5_OAS), cfg->oas);
     cfg->tbi = CD_TBI(cd);
     cfg->asid = CD_ASID(cd);
+    cfg->affd = CD_AFFD(cd);
 
     trace_smmuv3_decode_cd(cfg->oas);
 
diff --git a/hw/arm/stellaris.c b/hw/arm/stellaris.c
index 34c5a86..a2f998b 100644
--- a/hw/arm/stellaris.c
+++ b/hw/arm/stellaris.c
@@ -462,7 +462,10 @@
     s->sysclk = qdev_init_clock_out(DEVICE(s), "SYSCLK");
 }
 
-/* I2C controller.  */
+/*
+ * I2C controller.
+ * ??? For now we only implement the master interface.
+ */
 
 #define TYPE_STELLARIS_I2C "stellaris-i2c"
 OBJECT_DECLARE_SIMPLE_TYPE(stellaris_i2c_state, STELLARIS_I2C)
@@ -607,10 +610,17 @@
     stellaris_i2c_update(s);
 }
 
-static void stellaris_i2c_reset(stellaris_i2c_state *s)
+static void stellaris_i2c_reset_enter(Object *obj, ResetType type)
 {
+    stellaris_i2c_state *s = STELLARIS_I2C(obj);
+
     if (s->mcs & STELLARIS_I2C_MCS_BUSBSY)
         i2c_end_transfer(s->bus);
+}
+
+static void stellaris_i2c_reset_hold(Object *obj)
+{
+    stellaris_i2c_state *s = STELLARIS_I2C(obj);
 
     s->msa = 0;
     s->mcs = 0;
@@ -619,6 +629,12 @@
     s->mimr = 0;
     s->mris = 0;
     s->mcr = 0;
+}
+
+static void stellaris_i2c_reset_exit(Object *obj)
+{
+    stellaris_i2c_state *s = STELLARIS_I2C(obj);
+
     stellaris_i2c_update(s);
 }
 
@@ -658,8 +674,6 @@
     memory_region_init_io(&s->iomem, obj, &stellaris_i2c_ops, s,
                           "i2c", 0x1000);
     sysbus_init_mmio(sbd, &s->iomem);
-    /* ??? For now we only implement the master interface.  */
-    stellaris_i2c_reset(s);
 }
 
 /* Analogue to Digital Converter.  This is only partially implemented,
@@ -773,8 +787,9 @@
     }
 }
 
-static void stellaris_adc_reset(StellarisADCState *s)
+static void stellaris_adc_reset_hold(Object *obj)
 {
+    StellarisADCState *s = STELLARIS_ADC(obj);
     int n;
 
     for (n = 0; n < 4; n++) {
@@ -946,7 +961,6 @@
     memory_region_init_io(&s->iomem, obj, &stellaris_adc_ops, s,
                           "adc", 0x1000);
     sysbus_init_mmio(sbd, &s->iomem);
-    stellaris_adc_reset(s);
     qdev_init_gpio_in(dev, stellaris_adc_trigger, 1);
 }
 
@@ -1017,6 +1031,7 @@
      * 400fe000 system control
      */
 
+    Object *soc_container;
     DeviceState *gpio_dev[7], *nvic;
     qemu_irq gpio_in[7][8];
     qemu_irq gpio_out[7][8];
@@ -1038,6 +1053,9 @@
     flash_size = (((board->dc0 & 0xffff) + 1) << 1) * 1024;
     sram_size = ((board->dc0 >> 18) + 1) * 1024;
 
+    soc_container = object_new("container");
+    object_property_add_child(OBJECT(ms), "soc", soc_container);
+
     /* Flash programming is done via the SCU, so pretend it is ROM.  */
     memory_region_init_rom(flash, NULL, "stellaris.flash", flash_size,
                            &error_fatal);
@@ -1052,6 +1070,7 @@
      * need its sysclk output.
      */
     ssys_dev = qdev_new(TYPE_STELLARIS_SYS);
+    object_property_add_child(soc_container, "sys", OBJECT(ssys_dev));
 
     /*
      * Most devices come preprogrammed with a MAC address in the user data.
@@ -1078,6 +1097,7 @@
     sysbus_realize_and_unref(SYS_BUS_DEVICE(ssys_dev), &error_fatal);
 
     nvic = qdev_new(TYPE_ARMV7M);
+    object_property_add_child(soc_container, "v7m", OBJECT(nvic));
     qdev_prop_set_uint32(nvic, "num-irq", NUM_IRQ_LINES);
     qdev_prop_set_uint8(nvic, "num-prio-bits", NUM_PRIO_BITS);
     qdev_prop_set_string(nvic, "cpu-type", ms->cpu_type);
@@ -1111,6 +1131,7 @@
 
             dev = qdev_new(TYPE_STELLARIS_GPTM);
             sbd = SYS_BUS_DEVICE(dev);
+            object_property_add_child(soc_container, "gptm[*]", OBJECT(dev));
             qdev_connect_clock_in(dev, "clk",
                                   qdev_get_clock_out(ssys_dev, "SYSCLK"));
             sysbus_realize_and_unref(sbd, &error_fatal);
@@ -1124,7 +1145,7 @@
 
     if (board->dc1 & (1 << 3)) { /* watchdog present */
         dev = qdev_new(TYPE_LUMINARY_WATCHDOG);
-
+        object_property_add_child(soc_container, "wdg", OBJECT(dev));
         qdev_connect_clock_in(dev, "WDOGCLK",
                               qdev_get_clock_out(ssys_dev, "SYSCLK"));
 
@@ -1164,6 +1185,7 @@
             SysBusDevice *sbd;
 
             dev = qdev_new("pl011_luminary");
+            object_property_add_child(soc_container, "uart[*]", OBJECT(dev));
             sbd = SYS_BUS_DEVICE(dev);
             qdev_prop_set_chr(dev, "chardev", serial_hd(i));
             sysbus_realize_and_unref(sbd, &error_fatal);
@@ -1257,10 +1279,13 @@
                                    &error_fatal);
 
             ssddev = qdev_new("ssd0323");
+            object_property_add_child(OBJECT(ms), "oled", OBJECT(ssddev));
             qdev_prop_set_uint8(ssddev, "cs", 1);
             qdev_realize_and_unref(ssddev, bus, &error_fatal);
 
             gpio_d_splitter = qdev_new(TYPE_SPLIT_IRQ);
+            object_property_add_child(OBJECT(ms), "splitter",
+                                      OBJECT(gpio_d_splitter));
             qdev_prop_set_uint32(gpio_d_splitter, "num-lines", 2);
             qdev_realize_and_unref(gpio_d_splitter, NULL, &error_fatal);
             qdev_connect_gpio_out(
@@ -1281,6 +1306,7 @@
         DeviceState *enet;
 
         enet = qdev_new("stellaris_enet");
+        object_property_add_child(soc_container, "enet", OBJECT(enet));
         if (nd) {
             qdev_set_nic_properties(enet, nd);
         } else {
@@ -1300,6 +1326,7 @@
         DeviceState *gpad;
 
         gpad = qdev_new(TYPE_STELLARIS_GAMEPAD);
+        object_property_add_child(OBJECT(ms), "gamepad", OBJECT(gpad));
         for (i = 0; i < ARRAY_SIZE(gpad_keycode); i++) {
             qlist_append_int(gpad_keycode_list, gpad_keycode[i]);
         }
@@ -1396,7 +1423,11 @@
 static void stellaris_i2c_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
+    ResettableClass *rc = RESETTABLE_CLASS(klass);
 
+    rc->phases.enter = stellaris_i2c_reset_enter;
+    rc->phases.hold = stellaris_i2c_reset_hold;
+    rc->phases.exit = stellaris_i2c_reset_exit;
     dc->vmsd = &vmstate_stellaris_i2c;
 }
 
@@ -1411,7 +1442,9 @@
 static void stellaris_adc_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
+    ResettableClass *rc = RESETTABLE_CLASS(klass);
 
+    rc->phases.hold = stellaris_adc_reset_hold;
     dc->vmsd = &vmstate_stellaris_adc;
 }
 
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 48febde..8414122 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -533,8 +533,8 @@
 }
 
 /*
- * ACPI spec, Revision 5.1
- * 5.2.24 Generic Timer Description Table (GTDT)
+ * ACPI spec, Revision 6.5
+ * 5.2.25 Generic Timer Description Table (GTDT)
  */
 static void
 build_gtdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
@@ -548,7 +548,7 @@
     uint32_t irqflags = vmc->claim_edge_triggered_timers ?
         1 : /* Interrupt is Edge triggered */
         0;  /* Interrupt is Level triggered  */
-    AcpiTable table = { .sig = "GTDT", .rev = 2, .oem_id = vms->oem_id,
+    AcpiTable table = { .sig = "GTDT", .rev = 3, .oem_id = vms->oem_id,
                         .oem_table_id = vms->oem_table_id };
 
     acpi_table_begin(&table, table_data);
@@ -584,7 +584,15 @@
     build_append_int_noprefix(table_data, 0, 4);
     /* Platform Timer Offset */
     build_append_int_noprefix(table_data, 0, 4);
-
+    if (vms->ns_el2_virt_timer_irq) {
+        /* Virtual EL2 Timer GSIV */
+        build_append_int_noprefix(table_data, ARCH_TIMER_NS_EL2_VIRT_IRQ, 4);
+        /* Virtual EL2 Timer Flags */
+        build_append_int_noprefix(table_data, irqflags, 4);
+    } else {
+        build_append_int_noprefix(table_data, 0, 4);
+        build_append_int_noprefix(table_data, 0, 4);
+    }
     acpi_table_end(linker, &table);
 }
 
@@ -771,10 +779,10 @@
 static void build_fadt_rev6(GArray *table_data, BIOSLinker *linker,
                             VirtMachineState *vms, unsigned dsdt_tbl_offset)
 {
-    /* ACPI v6.0 */
+    /* ACPI v6.3 */
     AcpiFadtData fadt = {
         .rev = 6,
-        .minor_ver = 0,
+        .minor_ver = 3,
         .flags = 1 << ACPI_FADT_F_HW_REDUCED_ACPI,
         .xdsdt_tbl_offset = &dsdt_tbl_offset,
     };
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 368c2a4..0af1943 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -221,6 +221,20 @@
     qemu_fdt_setprop(ms->fdt, node, "rng-seed", seed.rng, sizeof(seed.rng));
 }
 
+/*
+ * The CPU object always exposes the NS EL2 virt timer IRQ line,
+ * but we don't want to advertise it to the guest in the dtb or ACPI
+ * table unless it's really going to do something.
+ */
+static bool ns_el2_virt_timer_present(void)
+{
+    ARMCPU *cpu = ARM_CPU(qemu_get_cpu(0));
+    CPUARMState *env = &cpu->env;
+
+    return arm_feature(env, ARM_FEATURE_AARCH64) &&
+        arm_feature(env, ARM_FEATURE_EL2) && cpu_isar_feature(aa64_vh, cpu);
+}
+
 static void create_fdt(VirtMachineState *vms)
 {
     MachineState *ms = MACHINE(vms);
@@ -338,15 +352,29 @@
                                 "arm,armv7-timer");
     }
     qemu_fdt_setprop(ms->fdt, "/timer", "always-on", NULL, 0);
-    qemu_fdt_setprop_cells(ms->fdt, "/timer", "interrupts",
-                           GIC_FDT_IRQ_TYPE_PPI,
-                           INTID_TO_PPI(ARCH_TIMER_S_EL1_IRQ), irqflags,
-                           GIC_FDT_IRQ_TYPE_PPI,
-                           INTID_TO_PPI(ARCH_TIMER_NS_EL1_IRQ), irqflags,
-                           GIC_FDT_IRQ_TYPE_PPI,
-                           INTID_TO_PPI(ARCH_TIMER_VIRT_IRQ), irqflags,
-                           GIC_FDT_IRQ_TYPE_PPI,
-                           INTID_TO_PPI(ARCH_TIMER_NS_EL2_IRQ), irqflags);
+    if (vms->ns_el2_virt_timer_irq) {
+        qemu_fdt_setprop_cells(ms->fdt, "/timer", "interrupts",
+                               GIC_FDT_IRQ_TYPE_PPI,
+                               INTID_TO_PPI(ARCH_TIMER_S_EL1_IRQ), irqflags,
+                               GIC_FDT_IRQ_TYPE_PPI,
+                               INTID_TO_PPI(ARCH_TIMER_NS_EL1_IRQ), irqflags,
+                               GIC_FDT_IRQ_TYPE_PPI,
+                               INTID_TO_PPI(ARCH_TIMER_VIRT_IRQ), irqflags,
+                               GIC_FDT_IRQ_TYPE_PPI,
+                               INTID_TO_PPI(ARCH_TIMER_NS_EL2_IRQ), irqflags,
+                               GIC_FDT_IRQ_TYPE_PPI,
+                               INTID_TO_PPI(ARCH_TIMER_NS_EL2_VIRT_IRQ), irqflags);
+    } else {
+        qemu_fdt_setprop_cells(ms->fdt, "/timer", "interrupts",
+                               GIC_FDT_IRQ_TYPE_PPI,
+                               INTID_TO_PPI(ARCH_TIMER_S_EL1_IRQ), irqflags,
+                               GIC_FDT_IRQ_TYPE_PPI,
+                               INTID_TO_PPI(ARCH_TIMER_NS_EL1_IRQ), irqflags,
+                               GIC_FDT_IRQ_TYPE_PPI,
+                               INTID_TO_PPI(ARCH_TIMER_VIRT_IRQ), irqflags,
+                               GIC_FDT_IRQ_TYPE_PPI,
+                               INTID_TO_PPI(ARCH_TIMER_NS_EL2_IRQ), irqflags);
+    }
 }
 
 static void fdt_add_cpu_nodes(const VirtMachineState *vms)
@@ -789,6 +817,7 @@
             [GTIMER_VIRT] = ARCH_TIMER_VIRT_IRQ,
             [GTIMER_HYP]  = ARCH_TIMER_NS_EL2_IRQ,
             [GTIMER_SEC]  = ARCH_TIMER_S_EL1_IRQ,
+            [GTIMER_HYPVIRT] = ARCH_TIMER_NS_EL2_VIRT_IRQ,
         };
 
         for (unsigned irq = 0; irq < ARRAY_SIZE(timer_irq); irq++) {
@@ -2222,6 +2251,11 @@
         qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
         object_unref(cpuobj);
     }
+
+    /* Now we've created the CPUs we can see if they have the hypvirt timer */
+    vms->ns_el2_virt_timer_irq = ns_el2_virt_timer_present() &&
+        !vmc->no_ns_el2_virt_timer_irq;
+
     fdt_add_timer_nodes(vms);
     fdt_add_cpu_nodes(vms);
 
@@ -3179,8 +3213,16 @@
 
 static void virt_machine_8_2_options(MachineClass *mc)
 {
+    VirtMachineClass *vmc = VIRT_MACHINE_CLASS(OBJECT_CLASS(mc));
+
     virt_machine_9_0_options(mc);
     compat_props_add(mc->compat_props, hw_compat_8_2, hw_compat_8_2_len);
+    /*
+     * Don't expose NS_EL2_VIRT timer IRQ in DTB on ACPI on 8.2 and
+     * earlier machines. (Exposing it tickles a bug in older EDK2
+     * guest BIOS binaries.)
+     */
+    vmc->no_ns_el2_virt_timer_irq = true;
 }
 DEFINE_VIRT_MACHINE(8, 2)
 
diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
index a41a118..fc3abcb 100644
--- a/hw/arm/xilinx_zynq.c
+++ b/hw/arm/xilinx_zynq.c
@@ -243,6 +243,8 @@
     sysbus_mmio_map(busdev, 0, MPCORE_PERIPHBASE);
     sysbus_connect_irq(busdev, 0,
                        qdev_get_gpio_in(DEVICE(cpu), ARM_CPU_IRQ));
+    sysbus_connect_irq(busdev, 1,
+                       qdev_get_gpio_in(DEVICE(cpu), ARM_CPU_FIQ));
 
     for (n = 0; n < 64; n++) {
         pic[n] = qdev_get_gpio_in(dev, n);
diff --git a/hw/block/tc58128.c b/hw/block/tc58128.c
index 6944cf5..0984e37 100644
--- a/hw/block/tc58128.c
+++ b/hw/block/tc58128.c
@@ -202,7 +202,9 @@
 
 int tc58128_init(struct SH7750State *s, const char *zone1, const char *zone2)
 {
-    warn_report_once("The TC58128 flash device is deprecated");
+    if (!qtest_enabled()) {
+        warn_report_once("The TC58128 flash device is deprecated");
+    }
     init_dev(&tc58128_devs[0], zone1);
     init_dev(&tc58128_devs[1], zone2);
     return sh7750_register_io_device(s, &tc58128);
diff --git a/hw/misc/mps2-scc.c b/hw/misc/mps2-scc.c
index 6cfb5ff..18be741 100644
--- a/hw/misc/mps2-scc.c
+++ b/hw/misc/mps2-scc.c
@@ -37,6 +37,7 @@
 REG32(CFG4, 0x10)
 REG32(CFG5, 0x14)
 REG32(CFG6, 0x18)
+REG32(CFG7, 0x1c)
 REG32(CFGDATA_RTN, 0xa0)
 REG32(CFGDATA_OUT, 0xa4)
 REG32(CFGCTRL, 0xa8)
@@ -59,6 +60,51 @@
     return extract32(s->id, 4, 8);
 }
 
+/* Is CFG_REG2 present? */
+static bool have_cfg2(MPS2SCC *s)
+{
+    return scc_partno(s) == 0x524 || scc_partno(s) == 0x547 ||
+        scc_partno(s) == 0x536;
+}
+
+/* Is CFG_REG3 present? */
+static bool have_cfg3(MPS2SCC *s)
+{
+    return scc_partno(s) != 0x524 && scc_partno(s) != 0x547 &&
+        scc_partno(s) != 0x536;
+}
+
+/* Is CFG_REG5 present? */
+static bool have_cfg5(MPS2SCC *s)
+{
+    return scc_partno(s) == 0x524 || scc_partno(s) == 0x547 ||
+        scc_partno(s) == 0x536;
+}
+
+/* Is CFG_REG6 present? */
+static bool have_cfg6(MPS2SCC *s)
+{
+    return scc_partno(s) == 0x524 || scc_partno(s) == 0x536;
+}
+
+/* Is CFG_REG7 present? */
+static bool have_cfg7(MPS2SCC *s)
+{
+    return scc_partno(s) == 0x536;
+}
+
+/* Does CFG_REG0 drive the 'remap' GPIO output? */
+static bool cfg0_is_remap(MPS2SCC *s)
+{
+    return scc_partno(s) != 0x536;
+}
+
+/* Is CFG_REG1 driving a set of LEDs? */
+static bool cfg1_is_leds(MPS2SCC *s)
+{
+    return scc_partno(s) != 0x536;
+}
+
 /* Handle a write via the SYS_CFG channel to the specified function/device.
  * Return false on error (reported to guest via SYS_CFGCTRL ERROR bit).
  */
@@ -111,19 +157,25 @@
         r = s->cfg1;
         break;
     case A_CFG2:
-        if (scc_partno(s) != 0x524 && scc_partno(s) != 0x547) {
-            /* CFG2 reserved on other boards */
+        if (!have_cfg2(s)) {
             goto bad_offset;
         }
         r = s->cfg2;
         break;
     case A_CFG3:
-        if (scc_partno(s) == 0x524 && scc_partno(s) == 0x547) {
-            /* CFG3 reserved on AN524 */
+        if (!have_cfg3(s)) {
             goto bad_offset;
         }
-        /* These are user-settable DIP switches on the board. We don't
+        /*
+         * These are user-settable DIP switches on the board. We don't
          * model that, so just return zeroes.
+         *
+         * TODO: for AN536 this is MCC_MSB_ADDR "additional MCC addressing
+         * bits". These change which part of the DDR4 the motherboard
+         * configuration controller can see in its memory map (see the
+         * appnote section 2.4). QEMU doesn't model the MCC at all, so these
+         * bits are not interesting to us; read-as-zero is as good as anything
+         * else.
          */
         r = 0;
         break;
@@ -131,19 +183,23 @@
         r = s->cfg4;
         break;
     case A_CFG5:
-        if (scc_partno(s) != 0x524 && scc_partno(s) != 0x547) {
-            /* CFG5 reserved on other boards */
+        if (!have_cfg5(s)) {
             goto bad_offset;
         }
         r = s->cfg5;
         break;
     case A_CFG6:
-        if (scc_partno(s) != 0x524) {
-            /* CFG6 reserved on other boards */
+        if (!have_cfg6(s)) {
             goto bad_offset;
         }
         r = s->cfg6;
         break;
+    case A_CFG7:
+        if (!have_cfg7(s)) {
+            goto bad_offset;
+        }
+        r = s->cfg7;
+        break;
     case A_CFGDATA_RTN:
         r = s->cfgdata_rtn;
         break;
@@ -191,38 +247,58 @@
          * we always reflect bit 0 in the 'remap' GPIO output line,
          * and let the board wire it up or not as it chooses.
          * TODO on some boards bit 1 is CPU_WAIT.
+         *
+         * TODO: on the AN536 this register controls reset and halt
+         * for both CPUs. For the moment we don't implement this, so the
+         * register just reads as written.
          */
         s->cfg0 = value;
-        qemu_set_irq(s->remap, s->cfg0 & 1);
+        if (cfg0_is_remap(s)) {
+            qemu_set_irq(s->remap, s->cfg0 & 1);
+        }
         break;
     case A_CFG1:
         s->cfg1 = value;
-        for (size_t i = 0; i < ARRAY_SIZE(s->led); i++) {
-            led_set_state(s->led[i], extract32(value, i, 1));
+        /*
+         * On most boards this register drives LEDs.
+         *
+         * TODO: for AN536 this controls whether flash and ATCM are
+         * enabled or disabled on reset. QEMU doesn't model this, and
+         * always wires up RAM in the ATCM area and ROM in the flash area.
+         */
+        if (cfg1_is_leds(s)) {
+            for (size_t i = 0; i < ARRAY_SIZE(s->led); i++) {
+                led_set_state(s->led[i], extract32(value, i, 1));
+            }
         }
         break;
     case A_CFG2:
-        if (scc_partno(s) != 0x524 && scc_partno(s) != 0x547) {
-            /* CFG2 reserved on other boards */
+        if (!have_cfg2(s)) {
             goto bad_offset;
         }
-        /* AN524: QSPI Select signal */
+        /* AN524, AN536: QSPI Select signal */
         s->cfg2 = value;
         break;
     case A_CFG5:
-        if (scc_partno(s) != 0x524 && scc_partno(s) != 0x547) {
-            /* CFG5 reserved on other boards */
+        if (!have_cfg5(s)) {
             goto bad_offset;
         }
-        /* AN524: ACLK frequency in Hz */
+        /* AN524, AN536: ACLK frequency in Hz */
         s->cfg5 = value;
         break;
     case A_CFG6:
-        if (scc_partno(s) != 0x524) {
-            /* CFG6 reserved on other boards */
+        if (!have_cfg6(s)) {
             goto bad_offset;
         }
         /* AN524: Clock divider for BRAM */
+        /* AN536: Core 0 vector table base address */
+        s->cfg6 = value;
+        break;
+    case A_CFG7:
+        if (!have_cfg7(s)) {
+            goto bad_offset;
+        }
+        /* AN536: Core 1 vector table base address */
         s->cfg6 = value;
         break;
     case A_CFGDATA_OUT:
@@ -336,6 +412,24 @@
     g_free(s->oscclk_reset);
 }
 
+static bool cfg7_needed(void *opaque)
+{
+    MPS2SCC *s = opaque;
+
+    return have_cfg7(s);
+}
+
+static const VMStateDescription vmstate_cfg7 = {
+    .name = "mps2-scc/cfg7",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = cfg7_needed,
+    .fields = (const VMStateField[]) {
+        VMSTATE_UINT32(cfg7, MPS2SCC),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription mps2_scc_vmstate = {
     .name = "mps2-scc",
     .version_id = 3,
@@ -355,6 +449,10 @@
         VMSTATE_VARRAY_UINT32(oscclk, MPS2SCC, num_oscclk,
                               0, vmstate_info_uint32, uint32_t),
         VMSTATE_END_OF_LIST()
+    },
+    .subsections = (const VMStateDescription * const []) {
+        &vmstate_cfg7,
+        NULL
     }
 };
 
diff --git a/hw/pci-host/raven.c b/hw/pci-host/raven.c
index c7a0a28..a7dfddd 100644
--- a/hw/pci-host/raven.c
+++ b/hw/pci-host/raven.c
@@ -200,6 +200,7 @@
     .write = raven_io_write,
     .endianness = DEVICE_LITTLE_ENDIAN,
     .impl.max_access_size = 4,
+    .impl.unaligned = true,
     .valid.unaligned = true,
 };
 
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index fd8d772..5ec2e6c 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -92,6 +92,7 @@
     bool disabled;             /* smmu is disabled */
     bool bypassed;             /* translation is bypassed */
     bool aborted;              /* translation is aborted */
+    bool affd;                 /* AF fault disable */
     uint32_t iotlb_hits;       /* counts IOTLB hits */
     uint32_t iotlb_misses;     /* counts IOTLB misses*/
     /* Used by stage-1 only. */
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index f692398..bb486d3 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -130,6 +130,7 @@
     /* Machines < 6.2 have no support for describing cpu topology to guest */
     bool no_cpu_topology;
     bool no_tcg_lpa2;
+    bool no_ns_el2_virt_timer_irq;
 };
 
 struct VirtMachineState {
@@ -173,6 +174,7 @@
     PCIBus *bus;
     char *oem_id;
     char *oem_table_id;
+    bool ns_el2_virt_timer_irq;
 };
 
 #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
diff --git a/include/hw/misc/mps2-scc.h b/include/hw/misc/mps2-scc.h
index 3b2d13a..8ff188c 100644
--- a/include/hw/misc/mps2-scc.h
+++ b/include/hw/misc/mps2-scc.h
@@ -51,6 +51,7 @@
     uint32_t cfg4;
     uint32_t cfg5;
     uint32_t cfg6;
+    uint32_t cfg7;
     uint32_t cfgdata_rtn;
     uint32_t cfgdata_out;
     uint32_t cfgctrl;
diff --git a/linux-user/aarch64/target_prctl.h b/linux-user/aarch64/target_prctl.h
index 5067e7d..aa8e203 100644
--- a/linux-user/aarch64/target_prctl.h
+++ b/linux-user/aarch64/target_prctl.h
@@ -173,21 +173,26 @@
     env->tagged_addr_enable = arg2 & PR_TAGGED_ADDR_ENABLE;
 
     if (cpu_isar_feature(aa64_mte, cpu)) {
-        switch (arg2 & PR_MTE_TCF_MASK) {
-        case PR_MTE_TCF_NONE:
-        case PR_MTE_TCF_SYNC:
-        case PR_MTE_TCF_ASYNC:
-            break;
-        default:
-            return -EINVAL;
-        }
-
         /*
          * Write PR_MTE_TCF to SCTLR_EL1[TCF0].
-         * Note that the syscall values are consistent with hw.
+         *
+         * The kernel has a per-cpu configuration for the sysadmin,
+         * /sys/devices/system/cpu/cpu<N>/mte_tcf_preferred,
+         * which qemu does not implement.
+         *
+         * Because there is no performance difference between the modes, and
+         * because SYNC is most useful for debugging MTE errors, choose SYNC
+         * as the preferred mode.  With this preference, and the way the API
+         * uses only two bits, there is no way for the program to select
+         * ASYMM mode.
          */
-        env->cp15.sctlr_el[1] =
-            deposit64(env->cp15.sctlr_el[1], 38, 2, arg2 >> PR_MTE_TCF_SHIFT);
+        unsigned tcf = 0;
+        if (arg2 & PR_MTE_TCF_SYNC) {
+            tcf = 1;
+        } else if (arg2 & PR_MTE_TCF_ASYNC) {
+            tcf = 2;
+        }
+        env->cp15.sctlr_el[1] = deposit64(env->cp15.sctlr_el[1], 38, 2, tcf);
 
         /*
          * Write PR_MTE_TAG to GCR_EL1[Exclude].
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 8c1ff16..90c4fb7 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -1187,13 +1187,21 @@
     bool enabled, prohibited = false, filtered;
     bool secure = arm_is_secure(env);
     int el = arm_current_el(env);
-    uint64_t mdcr_el2 = arm_mdcr_el2_eff(env);
-    uint8_t hpmn = mdcr_el2 & MDCR_HPMN;
+    uint64_t mdcr_el2;
+    uint8_t hpmn;
 
+    /*
+     * We might be called for M-profile cores where MDCR_EL2 doesn't
+     * exist and arm_mdcr_el2_eff() will assert, so this early-exit check
+     * must be before we read that value.
+     */
     if (!arm_feature(env, ARM_FEATURE_PMU)) {
         return false;
     }
 
+    mdcr_el2 = arm_mdcr_el2_eff(env);
+    hpmn = mdcr_el2 & MDCR_HPMN;
+
     if (!arm_feature(env, ARM_FEATURE_EL2) ||
             (counter < hpmn || counter == 31)) {
         e = env->cp15.c9_pmcr & PMCRE;
@@ -9520,7 +9528,7 @@
          * AArch64 cores we might need to add a specific feature flag
          * to indicate cores with "flavour 2" CBAR.
          */
-        if (arm_feature(env, ARM_FEATURE_AARCH64)) {
+        if (arm_feature(env, ARM_FEATURE_V8)) {
             /* 32 bit view is [31:18] 0...0 [43:32]. */
             uint32_t cbar32 = (extract64(cpu->reset_cbar, 18, 14) << 18)
                 | extract64(cpu->reset_cbar, 32, 12);
diff --git a/target/arm/internals.h b/target/arm/internals.h
index fc337fe..50bff44 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1278,7 +1278,7 @@
 FIELD(MTEDESC, TCMA,  6, 2)
 FIELD(MTEDESC, WRITE, 8, 1)
 FIELD(MTEDESC, ALIGN, 9, 3)
-FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - 12)  /* size - 1 */
+FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - SVE_MTEDESC_SHIFT - 12)  /* size - 1 */
 
 bool mte_probe(CPUARMState *env, uint32_t desc, uint64_t ptr);
 uint64_t mte_check(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t ra);
diff --git a/target/arm/tcg/cpu32.c b/target/arm/tcg/cpu32.c
index 1125305..6eb08a4 100644
--- a/target/arm/tcg/cpu32.c
+++ b/target/arm/tcg/cpu32.c
@@ -800,6 +800,111 @@
     define_arm_cp_regs(cpu, cortexr5_cp_reginfo);
 }
 
+static const ARMCPRegInfo cortex_r52_cp_reginfo[] = {
+    { .name = "CPUACTLR", .cp = 15, .opc1 = 0, .crm = 15,
+      .access = PL1_RW, .type = ARM_CP_CONST | ARM_CP_64BIT, .resetvalue = 0 },
+    { .name = "IMP_ATCMREGIONR",
+      .cp = 15, .opc1 = 0, .crn = 9, .crm = 1, .opc2 = 0,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_BTCMREGIONR",
+      .cp = 15, .opc1 = 0, .crn = 9, .crm = 1, .opc2 = 1,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_CTCMREGIONR",
+      .cp = 15, .opc1 = 0, .crn = 9, .crm = 1, .opc2 = 2,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_CSCTLR",
+      .cp = 15, .opc1 = 1, .crn = 9, .crm = 1, .opc2 = 0,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_BPCTLR",
+      .cp = 15, .opc1 = 1, .crn = 9, .crm = 1, .opc2 = 1,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_MEMPROTCLR",
+      .cp = 15, .opc1 = 1, .crn = 9, .crm = 1, .opc2 = 2,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_SLAVEPCTLR",
+      .cp = 15, .opc1 = 0, .crn = 11, .crm = 0, .opc2 = 0,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_PERIPHREGIONR",
+      .cp = 15, .opc1 = 0, .crn = 15, .crm = 0, .opc2 = 0,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_FLASHIFREGIONR",
+      .cp = 15, .opc1 = 0, .crn = 15, .crm = 0, .opc2 = 1,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_BUILDOPTR",
+      .cp = 15, .opc1 = 0, .crn = 15, .crm = 2, .opc2 = 0,
+      .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_PINOPTR",
+      .cp = 15, .opc1 = 0, .crn = 15, .crm = 2, .opc2 = 7,
+      .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_QOSR",
+      .cp = 15, .opc1 = 1, .crn = 15, .crm = 3, .opc2 = 1,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_BUSTIMEOUTR",
+      .cp = 15, .opc1 = 1, .crn = 15, .crm = 3, .opc2 = 2,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_INTMONR",
+      .cp = 15, .opc1 = 1, .crn = 15, .crm = 3, .opc2 = 4,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_ICERR0",
+      .cp = 15, .opc1 = 2, .crn = 15, .crm = 0, .opc2 = 0,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_ICERR1",
+      .cp = 15, .opc1 = 2, .crn = 15, .crm = 0, .opc2 = 1,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_DCERR0",
+      .cp = 15, .opc1 = 2, .crn = 15, .crm = 1, .opc2 = 0,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_DCERR1",
+      .cp = 15, .opc1 = 2, .crn = 15, .crm = 1, .opc2 = 1,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_TCMERR0",
+      .cp = 15, .opc1 = 2, .crn = 15, .crm = 2, .opc2 = 0,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_TCMERR1",
+      .cp = 15, .opc1 = 2, .crn = 15, .crm = 2, .opc2 = 1,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_TCMSYNDR0",
+      .cp = 15, .opc1 = 2, .crn = 15, .crm = 2, .opc2 = 2,
+      .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_TCMSYNDR1",
+      .cp = 15, .opc1 = 2, .crn = 15, .crm = 2, .opc2 = 3,
+      .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_FLASHERR0",
+      .cp = 15, .opc1 = 2, .crn = 15, .crm = 3, .opc2 = 0,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_FLASHERR1",
+      .cp = 15, .opc1 = 2, .crn = 15, .crm = 3, .opc2 = 1,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_CDBGDR0",
+      .cp = 15, .opc1 = 3, .crn = 15, .crm = 0, .opc2 = 0,
+      .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_CBDGBR1",
+      .cp = 15, .opc1 = 3, .crn = 15, .crm = 0, .opc2 = 1,
+      .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_TESTR0",
+      .cp = 15, .opc1 = 4, .crn = 15, .crm = 0, .opc2 = 0,
+      .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "IMP_TESTR1",
+      .cp = 15, .opc1 = 4, .crn = 15, .crm = 0, .opc2 = 1,
+      .access = PL1_W, .type = ARM_CP_NOP, .resetvalue = 0 },
+    { .name = "IMP_CDBGDCI",
+      .cp = 15, .opc1 = 0, .crn = 15, .crm = 15, .opc2 = 0,
+      .access = PL1_W, .type = ARM_CP_NOP, .resetvalue = 0 },
+    { .name = "IMP_CDBGDCT",
+      .cp = 15, .opc1 = 3, .crn = 15, .crm = 2, .opc2 = 0,
+      .access = PL1_W, .type = ARM_CP_NOP, .resetvalue = 0 },
+    { .name = "IMP_CDBGICT",
+      .cp = 15, .opc1 = 3, .crn = 15, .crm = 2, .opc2 = 1,
+      .access = PL1_W, .type = ARM_CP_NOP, .resetvalue = 0 },
+    { .name = "IMP_CDBGDCD",
+      .cp = 15, .opc1 = 3, .crn = 15, .crm = 4, .opc2 = 0,
+      .access = PL1_W, .type = ARM_CP_NOP, .resetvalue = 0 },
+    { .name = "IMP_CDBGICD",
+      .cp = 15, .opc1 = 3, .crn = 15, .crm = 4, .opc2 = 1,
+      .access = PL1_W, .type = ARM_CP_NOP, .resetvalue = 0 },
+};
+
+
 static void cortex_r52_initfn(Object *obj)
 {
     ARMCPU *cpu = ARM_CPU(obj);
@@ -809,6 +914,8 @@
     set_feature(&cpu->env, ARM_FEATURE_PMSA);
     set_feature(&cpu->env, ARM_FEATURE_NEON);
     set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
+    set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
+    set_feature(&cpu->env, ARM_FEATURE_AUXCR);
     cpu->midr = 0x411fd133; /* r1p3 */
     cpu->revidr = 0x00000000;
     cpu->reset_fpsid = 0x41034023;
@@ -839,6 +946,8 @@
 
     cpu->pmsav7_dregion = 16;
     cpu->pmsav8r_hdregion = 16;
+
+    define_arm_cp_regs(cpu, cortex_r52_cp_reginfo);
 }
 
 static void cortex_r5f_initfn(Object *obj)
diff --git a/target/arm/tcg/op_helper.c b/target/arm/tcg/op_helper.c
index b5ac260..c199b69 100644
--- a/target/arm/tcg/op_helper.c
+++ b/target/arm/tcg/op_helper.c
@@ -570,10 +570,24 @@
      */
     int curmode = env->uncached_cpsr & CPSR_M;
 
-    if (regno == 17) {
-        /* ELR_Hyp: a special case because access from tgtmode is OK */
-        if (curmode != ARM_CPU_MODE_HYP && curmode != ARM_CPU_MODE_MON) {
-            goto undef;
+    if (tgtmode == ARM_CPU_MODE_HYP) {
+        /*
+         * Handle Hyp target regs first because some are special cases
+         * which don't want the usual "not accessible from tgtmode" check.
+         */
+        switch (regno) {
+        case 16 ... 17: /* ELR_Hyp, SPSR_Hyp */
+            if (curmode != ARM_CPU_MODE_HYP && curmode != ARM_CPU_MODE_MON) {
+                goto undef;
+            }
+            break;
+        case 13:
+            if (curmode != ARM_CPU_MODE_MON) {
+                goto undef;
+            }
+            break;
+        default:
+            g_assert_not_reached();
         }
         return;
     }
@@ -604,13 +618,6 @@
         }
     }
 
-    if (tgtmode == ARM_CPU_MODE_HYP) {
-        /* SPSR_Hyp, r13_hyp: accessible from Monitor mode only */
-        if (curmode != ARM_CPU_MODE_MON) {
-            goto undef;
-        }
-    }
-
     return;
 
 undef:
@@ -625,7 +632,12 @@
 
     switch (regno) {
     case 16: /* SPSRs */
-        env->banked_spsr[bank_number(tgtmode)] = value;
+        if (tgtmode == (env->uncached_cpsr & CPSR_M)) {
+            /* Only happens for SPSR_Hyp access in Hyp mode */
+            env->spsr = value;
+        } else {
+            env->banked_spsr[bank_number(tgtmode)] = value;
+        }
         break;
     case 17: /* ELR_Hyp */
         env->elr_el[2] = value;
@@ -659,7 +671,12 @@
 
     switch (regno) {
     case 16: /* SPSRs */
-        return env->banked_spsr[bank_number(tgtmode)];
+        if (tgtmode == (env->uncached_cpsr & CPSR_M)) {
+            /* Only happens for SPSR_Hyp access in Hyp mode */
+            return env->spsr;
+        } else {
+            return env->banked_spsr[bank_number(tgtmode)];
+        }
     case 17: /* ELR_Hyp */
         return env->elr_el[2];
     case 13:
diff --git a/target/arm/tcg/sme_helper.c b/target/arm/tcg/sme_helper.c
index 1ee2690..904bfda 100644
--- a/target/arm/tcg/sme_helper.c
+++ b/target/arm/tcg/sme_helper.c
@@ -573,8 +573,8 @@
     desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
 
     /* Perform gross MTE suppression early. */
-    if (!tbi_check(desc, bit55) ||
-        tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+    if (!tbi_check(mtedesc, bit55) ||
+        tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
         mtedesc = 0;
     }
 
@@ -750,8 +750,8 @@
     desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
 
     /* Perform gross MTE suppression early. */
-    if (!tbi_check(desc, bit55) ||
-        tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+    if (!tbi_check(mtedesc, bit55) ||
+        tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
         mtedesc = 0;
     }
 
diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
index bce4295..6853f58 100644
--- a/target/arm/tcg/sve_helper.c
+++ b/target/arm/tcg/sve_helper.c
@@ -5800,8 +5800,8 @@
     desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
 
     /* Perform gross MTE suppression early. */
-    if (!tbi_check(desc, bit55) ||
-        tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+    if (!tbi_check(mtedesc, bit55) ||
+        tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
         mtedesc = 0;
     }
 
@@ -6156,8 +6156,8 @@
     desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
 
     /* Perform gross MTE suppression early. */
-    if (!tbi_check(desc, bit55) ||
-        tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+    if (!tbi_check(mtedesc, bit55) ||
+        tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
         mtedesc = 0;
     }
 
@@ -6410,8 +6410,8 @@
     desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
 
     /* Perform gross MTE suppression early. */
-    if (!tbi_check(desc, bit55) ||
-        tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+    if (!tbi_check(mtedesc, bit55) ||
+        tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
         mtedesc = 0;
     }
 
diff --git a/target/arm/tcg/translate-a64.h b/target/arm/tcg/translate-a64.h
index 96ba39b..7b811b8 100644
--- a/target/arm/tcg/translate-a64.h
+++ b/target/arm/tcg/translate-a64.h
@@ -28,6 +28,8 @@
 bool sve_access_check(DisasContext *s);
 bool sme_enabled_check(DisasContext *s);
 bool sme_enabled_check_with_svcr(DisasContext *s, unsigned);
+uint32_t make_svemte_desc(DisasContext *s, unsigned vsz, uint32_t nregs,
+                          uint32_t msz, bool is_write, uint32_t data);
 
 /* This function corresponds to CheckStreamingSVEEnabled. */
 static inline bool sme_sm_enabled_check(DisasContext *s)
diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index 8f0dfc8..46c7fce 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -206,7 +206,7 @@
 
     TCGv_ptr t_za, t_pg;
     TCGv_i64 addr;
-    int svl, desc = 0;
+    uint32_t desc;
     bool be = s->be_data == MO_BE;
     bool mte = s->mte_active[0];
 
@@ -224,18 +224,11 @@
     tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), a->esz);
     tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
 
-    if (mte) {
-        desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
-        desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
-        desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
-        desc = FIELD_DP32(desc, MTEDESC, WRITE, a->st);
-        desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (1 << a->esz) - 1);
-        desc <<= SVE_MTEDESC_SHIFT;
-    } else {
+    if (!mte) {
         addr = clean_data_tbi(s, addr);
     }
-    svl = streaming_vec_reg_size(s);
-    desc = simd_desc(svl, svl, desc);
+
+    desc = make_svemte_desc(s, streaming_vec_reg_size(s), 1, a->esz, a->st, 0);
 
     fns[a->esz][be][a->v][mte][a->st](tcg_env, t_za, t_pg, addr,
                                       tcg_constant_i32(desc));
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 296e7d1..ada05aa 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4437,33 +4437,47 @@
     3, 2, 1, 3
 };
 
+uint32_t make_svemte_desc(DisasContext *s, unsigned vsz, uint32_t nregs,
+                          uint32_t msz, bool is_write, uint32_t data)
+{
+    uint32_t sizem1;
+    uint32_t desc = 0;
+
+    /* Assert all of the data fits, with or without MTE enabled. */
+    assert(nregs >= 1 && nregs <= 4);
+    sizem1 = (nregs << msz) - 1;
+    assert(sizem1 <= R_MTEDESC_SIZEM1_MASK >> R_MTEDESC_SIZEM1_SHIFT);
+    assert(data < 1u << SVE_MTEDESC_SHIFT);
+
+    if (s->mte_active[0]) {
+        desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
+        desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
+        desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
+        desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
+        desc = FIELD_DP32(desc, MTEDESC, SIZEM1, sizem1);
+        desc <<= SVE_MTEDESC_SHIFT;
+    }
+    return simd_desc(vsz, vsz, desc | data);
+}
+
 static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
-                       int dtype, uint32_t mte_n, bool is_write,
+                       int dtype, uint32_t nregs, bool is_write,
                        gen_helper_gvec_mem *fn)
 {
-    unsigned vsz = vec_full_reg_size(s);
     TCGv_ptr t_pg;
-    int desc = 0;
+    uint32_t desc;
+
+    if (!s->mte_active[0]) {
+        addr = clean_data_tbi(s, addr);
+    }
 
     /*
      * For e.g. LD4, there are not enough arguments to pass all 4
      * registers as pointers, so encode the regno into the data field.
      * For consistency, do this even for LD1.
      */
-    if (s->mte_active[0]) {
-        int msz = dtype_msz(dtype);
-
-        desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
-        desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
-        desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
-        desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
-        desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (mte_n << msz) - 1);
-        desc <<= SVE_MTEDESC_SHIFT;
-    } else {
-        addr = clean_data_tbi(s, addr);
-    }
-
-    desc = simd_desc(vsz, vsz, zt | desc);
+    desc = make_svemte_desc(s, vec_full_reg_size(s), nregs,
+                            dtype_msz(dtype), is_write, zt);
     t_pg = tcg_temp_new_ptr();
 
     tcg_gen_addi_ptr(t_pg, tcg_env, pred_full_reg_offset(s, pg));
@@ -4600,7 +4614,7 @@
      * accessible via the instruction encoding.
      */
     assert(fn != NULL);
-    do_mem_zpa(s, zt, pg, addr, dtype, nreg, false, fn);
+    do_mem_zpa(s, zt, pg, addr, dtype, nreg + 1, false, fn);
 }
 
 static bool trans_LD_zprr(DisasContext *s, arg_rprr_load *a)
@@ -4847,8 +4861,13 @@
     unsigned vsz = vec_full_reg_size(s);
     TCGv_ptr t_pg;
     int poff;
+    uint32_t desc;
 
     /* Load the first quadword using the normal predicated load helpers.  */
+    if (!s->mte_active[0]) {
+        addr = clean_data_tbi(s, addr);
+    }
+
     poff = pred_full_reg_offset(s, pg);
     if (vsz > 16) {
         /*
@@ -4872,7 +4891,8 @@
 
     gen_helper_gvec_mem *fn
         = ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0];
-    fn(tcg_env, t_pg, addr, tcg_constant_i32(simd_desc(16, 16, zt)));
+    desc = make_svemte_desc(s, 16, 1, dtype_msz(dtype), false, zt);
+    fn(tcg_env, t_pg, addr, tcg_constant_i32(desc));
 
     /* Replicate that first quadword.  */
     if (vsz > 16) {
@@ -4915,6 +4935,7 @@
     unsigned vsz_r32;
     TCGv_ptr t_pg;
     int poff, doff;
+    uint32_t desc;
 
     if (vsz < 32) {
         /*
@@ -4927,6 +4948,9 @@
     }
 
     /* Load the first octaword using the normal predicated load helpers.  */
+    if (!s->mte_active[0]) {
+        addr = clean_data_tbi(s, addr);
+    }
 
     poff = pred_full_reg_offset(s, pg);
     if (vsz > 32) {
@@ -4951,7 +4975,8 @@
 
     gen_helper_gvec_mem *fn
         = ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0];
-    fn(tcg_env, t_pg, addr, tcg_constant_i32(simd_desc(32, 32, zt)));
+    desc = make_svemte_desc(s, 32, 1, dtype_msz(dtype), false, zt);
+    fn(tcg_env, t_pg, addr, tcg_constant_i32(desc));
 
     /*
      * Replicate that first octaword.
@@ -5168,14 +5193,13 @@
     if (nreg == 0) {
         /* ST1 */
         fn = fn_single[s->mte_active[0]][be][msz][esz];
-        nreg = 1;
     } else {
         /* ST2, ST3, ST4 -- msz == esz, enforced by encoding */
         assert(msz == esz);
         fn = fn_multiple[s->mte_active[0]][be][nreg - 1][msz];
     }
     assert(fn != NULL);
-    do_mem_zpa(s, zt, pg, addr, msz_dtype(s, msz), nreg, true, fn);
+    do_mem_zpa(s, zt, pg, addr, msz_dtype(s, msz), nreg + 1, true, fn);
 }
 
 static bool trans_ST_zprr(DisasContext *s, arg_rprr_store *a)
@@ -5223,25 +5247,16 @@
                        int scale, TCGv_i64 scalar, int msz, bool is_write,
                        gen_helper_gvec_mem_scatter *fn)
 {
-    unsigned vsz = vec_full_reg_size(s);
     TCGv_ptr t_zm = tcg_temp_new_ptr();
     TCGv_ptr t_pg = tcg_temp_new_ptr();
     TCGv_ptr t_zt = tcg_temp_new_ptr();
-    int desc = 0;
-
-    if (s->mte_active[0]) {
-        desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
-        desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
-        desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
-        desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
-        desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (1 << msz) - 1);
-        desc <<= SVE_MTEDESC_SHIFT;
-    }
-    desc = simd_desc(vsz, vsz, desc | scale);
+    uint32_t desc;
 
     tcg_gen_addi_ptr(t_pg, tcg_env, pred_full_reg_offset(s, pg));
     tcg_gen_addi_ptr(t_zm, tcg_env, vec_full_reg_offset(s, zm));
     tcg_gen_addi_ptr(t_zt, tcg_env, vec_full_reg_offset(s, zt));
+
+    desc = make_svemte_desc(s, vec_full_reg_size(s), 1, msz, is_write, scale);
     fn(tcg_env, t_zt, t_pg, t_zm, scalar, tcg_constant_i32(desc));
 }
 
diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c
index 5fa8249..f947c62 100644
--- a/target/arm/tcg/translate.c
+++ b/target/arm/tcg/translate.c
@@ -2822,13 +2822,20 @@
         break;
     case ARM_CPU_MODE_HYP:
         /*
-         * SPSR_hyp and r13_hyp can only be accessed from Monitor mode
-         * (and so we can forbid accesses from EL2 or below). elr_hyp
-         * can be accessed also from Hyp mode, so forbid accesses from
-         * EL0 or EL1.
+         * r13_hyp can only be accessed from Monitor mode, and so we
+         * can forbid accesses from EL2 or below.
+         * elr_hyp can be accessed also from Hyp mode, so forbid
+         * accesses from EL0 or EL1.
+         * SPSR_hyp is supposed to be in the same category as r13_hyp
+         * and UNPREDICTABLE if accessed from anything except Monitor
+         * mode. However there is some real-world code that will do
+         * it because at least some hardware happens to permit the
+         * access. (Notably a standard Cortex-R52 startup code fragment
+         * does this.) So we permit SPSR_hyp from Hyp mode also, to allow
+         * this (incorrect) guest code to run.
          */
-        if (!arm_dc_feature(s, ARM_FEATURE_EL2) || s->current_el < 2 ||
-            (s->current_el < 3 && *regno != 17)) {
+        if (!arm_dc_feature(s, ARM_FEATURE_EL2) || s->current_el < 2
+            || (s->current_el < 3 && *regno != 16 && *regno != 17)) {
             goto undef;
         }
         break;
diff --git a/tests/data/acpi/virt/FACP b/tests/data/acpi/virt/FACP
index ac05c35..da0c364 100644
--- a/tests/data/acpi/virt/FACP
+++ b/tests/data/acpi/virt/FACP
Binary files differ
diff --git a/tests/data/acpi/virt/GTDT b/tests/data/acpi/virt/GTDT
index 6f8cb9b..7f330e0 100644
--- a/tests/data/acpi/virt/GTDT
+++ b/tests/data/acpi/virt/GTDT
Binary files differ
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 39557d5..2b89e86 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -192,7 +192,8 @@
    'npcm7xx_sdhci-test',
    'npcm7xx_smbus-test',
    'npcm7xx_timer-test',
-   'npcm7xx_watchdog_timer-test'] + \
+   'npcm7xx_watchdog_timer-test',
+   'npcm_gmac-test'] + \
    (slirp.found() ? ['npcm7xx_emc-test'] : [])
 qtests_aspeed = \
   ['aspeed_hace-test',
@@ -231,7 +232,6 @@
   (config_all_devices.has_key('CONFIG_RASPI') ? ['bcm2835-dma-test'] : []) +  \
   (config_all_accel.has_key('CONFIG_TCG') and                                            \
    config_all_devices.has_key('CONFIG_TPM_TIS_I2C') ? ['tpm-tis-i2c-test'] : []) + \
-  (config_all_devices.has_key('CONFIG_NPCM7XX') ? qtests_npcm7xx : []) + \
   ['arm-cpu-features',
    'numa-test',
    'boot-serial-test',
diff --git a/tests/qtest/npcm7xx_emc-test.c b/tests/qtest/npcm7xx_emc-test.c
index f7646fa..63f6cad 100644
--- a/tests/qtest/npcm7xx_emc-test.c
+++ b/tests/qtest/npcm7xx_emc-test.c
@@ -228,7 +228,10 @@
      * KISS and use -nic. The driver accepts 'emc0' and 'emc1' as aliases
      * in the 'model' field to specify the device to match.
      */
-    g_string_append_printf(cmd_line, " -nic socket,fd=%d,model=emc%d ",
+    g_string_append_printf(cmd_line, " -nic socket,fd=%d,model=emc%d "
+                           "-nic user,model=npcm7xx-emc "
+                           "-nic user,model=npcm-gmac "
+                           "-nic user,model=npcm-gmac",
                            test_sockets[1], module_num);
 
     g_test_queue_destroy(packet_test_clear, test_sockets);
diff --git a/tests/qtest/npcm_gmac-test.c b/tests/qtest/npcm_gmac-test.c
index 9e58b15..c28b471 100644
--- a/tests/qtest/npcm_gmac-test.c
+++ b/tests/qtest/npcm_gmac-test.c
@@ -36,7 +36,7 @@
     const GMACModule *module;
 } TestData;
 
-/* Values extracted from hw/arm/npcm8xx.c */
+/* Values extracted from hw/arm/npcm7xx.c */
 static const GMACModule gmac_module_list[] = {
     {
         .irq        = 14,
@@ -46,14 +46,6 @@
         .irq        = 15,
         .base_addr  = 0xf0804000
     },
-    {
-        .irq        = 16,
-        .base_addr  = 0xf0806000
-    },
-    {
-        .irq        = 17,
-        .base_addr  = 0xf0808000
-    }
 };
 
 /* Returns the index of the GMAC module. */
@@ -182,32 +174,18 @@
     return qtest_readl(qts, mod->base_addr + regno);
 }
 
-static uint16_t pcs_read(QTestState *qts, const GMACModule *mod,
-                          NPCMRegister regno)
-{
-    uint32_t write_value = (regno & 0x3ffe00) >> 9;
-    qtest_writel(qts, PCS_BASE_ADDRESS + NPCM_PCS_IND_AC_BA, write_value);
-    uint32_t read_offset = regno & 0x1ff;
-    return qtest_readl(qts, PCS_BASE_ADDRESS + read_offset);
-}
-
 /* Check that GMAC registers are reset to default value */
 static void test_init(gconstpointer test_data)
 {
     const TestData *td = test_data;
     const GMACModule *mod = td->module;
-    QTestState *qts = qtest_init("-machine npcm845-evb");
+    QTestState *qts = qtest_init("-machine npcm750-evb");
 
 #define CHECK_REG32(regno, value) \
     do { \
         g_assert_cmphex(gmac_read(qts, mod, (regno)), ==, (value)); \
     } while (0)
 
-#define CHECK_REG_PCS(regno, value) \
-    do { \
-        g_assert_cmphex(pcs_read(qts, mod, (regno)), ==, (value)); \
-    } while (0)
-
     CHECK_REG32(NPCM_DMA_BUS_MODE, 0x00020100);
     CHECK_REG32(NPCM_DMA_XMT_POLL_DEMAND, 0);
     CHECK_REG32(NPCM_DMA_RCV_POLL_DEMAND, 0);
@@ -257,64 +235,6 @@
     CHECK_REG32(NPCM_GMAC_PTP_TAR, 0);
     CHECK_REG32(NPCM_GMAC_PTP_TTSR, 0);
 
-    /* TODO Add registers PCS */
-    if (mod->base_addr == 0xf0802000) {
-        CHECK_REG_PCS(NPCM_PCS_SR_CTL_ID1, 0x699e);
-        CHECK_REG_PCS(NPCM_PCS_SR_CTL_ID2, 0);
-        CHECK_REG_PCS(NPCM_PCS_SR_CTL_STS, 0x8000);
-
-        CHECK_REG_PCS(NPCM_PCS_SR_MII_CTRL, 0x1140);
-        CHECK_REG_PCS(NPCM_PCS_SR_MII_STS, 0x0109);
-        CHECK_REG_PCS(NPCM_PCS_SR_MII_DEV_ID1, 0x699e);
-        CHECK_REG_PCS(NPCM_PCS_SR_MII_DEV_ID2, 0x0ced0);
-        CHECK_REG_PCS(NPCM_PCS_SR_MII_AN_ADV, 0x0020);
-        CHECK_REG_PCS(NPCM_PCS_SR_MII_LP_BABL, 0);
-        CHECK_REG_PCS(NPCM_PCS_SR_MII_AN_EXPN, 0);
-        CHECK_REG_PCS(NPCM_PCS_SR_MII_EXT_STS, 0xc000);
-
-        CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_ABL, 0x0003);
-        CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_TX_MAX_DLY_LWR, 0x0038);
-        CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_TX_MAX_DLY_UPR, 0);
-        CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_TX_MIN_DLY_LWR, 0x0038);
-        CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_TX_MIN_DLY_UPR, 0);
-        CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_RX_MAX_DLY_LWR, 0x0058);
-        CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_RX_MAX_DLY_UPR, 0);
-        CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_RX_MIN_DLY_LWR, 0x0048);
-        CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_RX_MIN_DLY_UPR, 0);
-
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MMD_DIG_CTRL1, 0x2400);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_AN_CTRL, 0);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_AN_INTR_STS, 0x000a);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_TC, 0);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_DBG_CTRL, 0);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_EEE_MCTRL0, 0x899c);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_EEE_TXTIMER, 0);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_EEE_RXTIMER, 0);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_LINK_TIMER_CTRL, 0);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_EEE_MCTRL1, 0);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_DIG_STS, 0x0010);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_ICG_ERRCNT1, 0);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MISC_STS, 0);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_RX_LSTS, 0);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_TX_BSTCTRL0, 0x00a);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_TX_LVLCTRL0, 0x007f);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_TX_GENCTRL0, 0x0001);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_TX_GENCTRL1, 0);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_TX_STS, 0);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_RX_GENCTRL0, 0x0100);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_RX_GENCTRL1, 0x1100);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_RX_LOS_CTRL0, 0x000e);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_MPLL_CTRL0, 0x0100);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_MPLL_CTRL1, 0x0032);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_MPLL_STS, 0x0001);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_MISC_CTRL2, 0);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_LVL_CTRL, 0x0019);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_MISC_CTRL0, 0);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_MISC_CTRL1, 0);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_DIG_CTRL2, 0);
-        CHECK_REG_PCS(NPCM_PCS_VR_MII_DIG_ERRCNT_SEL, 0);
-    }
-
     qtest_quit(qts);
 }