doc/release-notes/skiboot-5.10-rc1.rst - skiboot - Git at Google

 .. _skiboot-5.10-rc1:

 skiboot-5.10-rc1
 ================

 skiboot v5.10-rc1 was released on Tuesday February 6th 2018. It is the first
 release candidate of skiboot 5.10, which will become the new stable release
 of skiboot following the 5.9 release, first released October 31st 2017.

 skiboot v5.10-rc1 contains all bug fixes as of :ref:`skiboot-5.9.8`
 and :ref:`skiboot-5.4.9` (the currently maintained stable releases). There
 may be more 5.9.x stable releases, it will depend on demand.

 For how the skiboot stable releases work, see :ref:`stable-rules` for details.

 The current plan is to cut the final 5.10 in February, with skiboot 5.10
 being for all POWER8 and POWER9 platforms in op-build v1.21.
 This release will be targeted to early POWER9 systems.

 Over skiboot-5.9, we have the following changes:

 New Features
 ------------
 - hdata: Parse IPL FW feature settings

   Add parsing for the firmware feature flags in the HDAT. This
   indicates the settings of various parameters which are set at IPL time
   by firmware.

 - opal/xstop: Use nvram option to enable/disable sw checkstop.

   Add a mechanism to enable/disable sw checkstop by looking at nvram option
   opal-sw-xstop=<enable/disable>.

   For now this patch disables the sw checkstop trigger unless explicitly
   enabled through nvram option 'opal-sw-xstop=enable'i for p9. This will allow
   an opportunity to get host kernel in panic path or xmon for unrecoverable
   HMIs or MCE, to be able to debug the issue effectively.

   To enable sw checkstop in opal issue following command: ::

     nvram -p ibm,skiboot --update-config opal-sw-xstop=enable

   **NOTE:** This is a workaround patch to disable sw checkstop by default to gain
   control in host kernel for better checkstop debugging. Once we have most of
   the checkstop issues stabilized/resolved, revisit this patch to enable sw
   checkstop by default.

   For p8 platform it will remain enabled by default unless explicitly disabled.

   To disable sw checkstop on p8 issue following command: ::

     nvram -p ibm,skiboot --update-config opal-sw-xstop=disable
 - hdata: Parse SPD data

     Parse SPD data and populate device tree.

     list of properties parsing from SPD: ::

       [root@ltc-wspoon dimm@d00f]# lsprop .
       memory-id        0000000c (12)      # DIMM type
       product-version  00000032 (50)      # Module Revision Code
       device_type      "memory-dimm-ddr4"
       serial-number    15d9acb6 (366587062)
       status           "okay"
       size             00004000 (16384)
       phandle          000000bd (189)
       ibm,loc-code     "UOPWR.0000000-Node0-DIMM7"
       part-number      "36ASF2G72PZ-2G6B2   "
       reg              0000d007 (53255)
       name             "dimm"
       manufacturer-id  0000802c (32812)  # Vendor ID, we can get vendor name from this ID

     Also update documentation.
 - hdata: Add memory hierarchy under xscom node

   We have memory to chip mapping but doesn't have complete memory hierarchy.
   This patch adds memory hierarchy under xscom node. This is specific to
   P9 system as these hierarchy may change between processor generation.

   It uses memory controller ID details and populates nodes like:
       xscom@<addr>/mcbist@<mcbist_id>/mcs@<mcs_id>/mca@<mca_id>/dimm@<resource_id>

   Also this patch adds few properties under dimm node.
   Finally make sure xscom nodes created before calling memory_parse().

 Fast Reboot and Quiesce
 ^^^^^^^^^^^^^^^^^^^^^^^
 We have a preliminary fast reboot implementation for POWER9 systems, which
 we look to enabling by default in the next release.

 The OPAL Quiesce calls are designed to improve reliability and debuggability
 around reboot and error conditions. See the full API documentation for details:
 :ref:`OPAL_QUIESCE`.

 - fast-reboot: bare bones fast reboot implementation for POWER9

   This is an initial fast reboot implementation for p9 which has only been
   tested on the Witherspoon platform, and without the use of NPUs, NX/VAS,
   etc.

   This has worked reasonably well so far, with no failures in about 100
   reboots. It is hidden behind the traditional fast-reboot experimental
   nvram option, until more platforms and configurations are tested.
 - fast-reboot: move boot CPU clean-up logically together with secondaries

   Move the boot CPU clean-up and state transition to active, logically
   together with secondaries. Don't release secondaries from fast reboot
   hold until everyone has cleaned up and transitioned to active.

   This is cosmetic, but it is helpful to run the fast reboot state machine
   the same way on all CPUs.
 - fast-reboot: improve failure error messages

   Change existing failure error messages to PR_NOTICE so they get
   printed to the console, and add some new ones. It's not a more
   severe class because it falls back to IPL on failure.
 - fast-reboot: quiesce opal before initiating a fast reboot

   Switch fast reboot to use quiescing rather than "wait for a while".

   If firmware can not be quiesced, then fast reboot is skipped. This
   significantly improves the robustness of fast reboot in the face of
   bugs or unexpected latencies.

   Complexity of synchronization in fast-reboot is reduced, because we
   are guaranteed to be single-threaded when quiesce succeeds, so locks
   can be removed.

   In the case that firmware can be quiesced, then it will generally
   reduce fast reboot times by nearly 200ms, because quiescing usually
   takes very little time.
 - core: Add support for quiescing OPAL

   Quiescing is ensuring all host controlled CPUs (except the current
   one) are out of OPAL and prevented from entering. This can be use in
   debug and shutdown paths, particularly with system reset sequences.

   This patch adds per-CPU entry and exit tracking for OPAL calls, and
   adds logic to "hold" or "reject" at entry time, if OPAL is quiesced.

   An OPAL call is added, to expose the functionality to Linux, where it
   can be used for shutdown, kexec, and before generating sreset IPIs for
   debugging (so the debug code does not recurse into OPAL).
 - dctl: p9 increase thread quiesce timeout

   We require all instructions to be completed before a thread is
   considered stopped, by the dctl interface. Long running instructions
   like cache misses and CI loads may take a significant amount of time
   to complete, and timeouts have been observed in stress testing.

   Increase the timeout significantly, to cover this. The workbook
   just says to poll, but we like to have timeouts to avoid getting
   stuck in firmware.


 POWER9 power saving
 ^^^^^^^^^^^^^^^^^^^

 There is much improved support for deeper sleep/idle (stop) states on POWER9.

 - OCC: Increase max pstate check on P9 to 255

   This has changed from P8, we can now have > 127 pstates.

   This was observed on Boston during WoF bring up.
 - SLW: Add idle state stop5 for DD2.0 and above

   Adding stop5 idle state with rough residency and latency numbers.
 - SLW: Add p9_stop_api calls for IMC

   Add p9_stop_api for EVENT_MASK and PDBAR scoms. These scoms are lost on
   wakeup from stop11.

 - SCOM restore for DARN and XIVE

   While waking up from stop11, we want NCU_DARN_BAR to have enable bit set.
   Without this stop_api call, the value restored is without enable bit set.
   We loose NCU_SPEC_BAR when the quad goes into stop11, stop_api will
   restore while waking up from stop11.

 - SLW: Call p9_stop_api only if deep_states are enabled

   All init time p9_stop_api calls have been isolated to slw_late_init. If
   p9_stop_api fails, then the deep states can be excluded from device tree.

   For p9_stop_api called after device-tree for cpuidle is created ,
   has_deep_states will be used to check if this call is even required.
 - Better handle errors in setting up sleep states (p9_stop_api)

   We won't put affected stop states in the device tree if the wakeup
   engine is not present or has failed.
 - SCOM Restore: Increased the EQ SCOM restore limit.

   Commit increases the SCOM restore limit from 16 to 31.
 - hw/dts: retry special wakeup operation if core still gated

   It has been observed that in some cases the special wakeup
   operation can "succeed" but the core is still in a gated/offline
   state.

   Check for this state after attempting to wakeup a core and retry
   the wakeup if necessary.
 - core/direct-controls: add function to read core gated state
 - core/direct-controls: wait for core special wkup bit cleared

   When clearing special wakeup bit on a core, wait until the
   bit is actually cleared by the hardware in the status register
   until returning success.

   This may help avoid issues with back-to-back reads where the
   special wakeup request is cleared but the firmware is still
   processing the request and the next attempt to set the bit
   reads an immediate success from the previous operation.
 - p9_stop_api: PM: Added support for version control in SCOM restore entries.

   - adds version info in SCOM restore entry header
   - adds version specific details in SCOM restore entry header
   - retains old behaviour of SGPE Hcode's base version
 - p9_stop_api: EQ SCOM Restore: Introduced version control in SCOM restore entry.

   - introduces version control in header of SCOM restore entry
   - ensures backward compatibility
   - introduces flexibility to handle any number of SCOM restore entry.

 Secure and Trusted Boot for POWER9
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 We introduce support for Secure and Trusted Boot for POWER9 systems, with equal
 functionality that we have on POWER8 systems, that is, we have the mechanisms in
 place to boot to petitboot (i.e. to BOOTKERNEL).

 See the :ref:`stb-overview` for full documentation of OPAL secure and trusted boot.

 - allow secure boot if not enforcing it

   We check the secure boot containers no matter what, only *enforcing*
   secure boot if we're booting in secure mode. This gives us an extra
   layer of checking firmware is legit even when secure mode isn't enabled,
   as well as being really useful for testing.
 - libstb/(create|print)-container: Sync with sb-signing-utils

   The sb-signing-utils project has improved upon the skeleton
   create-container tool that existed in skiboot, including
   being able to (quite easily) create *signed* images.

   This commit brings in that code (and makes it build in the
   skiboot build environment) and updates our skiboot.*.stb
   generating code to use the development keys. This means that by
   default, skiboot build process will let you build firmware that can
   do a secure boot with *development* keys.

   See :ref:`signing-firmware-code` for details on firmware signing.

   We also update print-container as well, syncing it with the
   upstream project.

   Derived from github.com:open-power/sb-signing-utils.git
   at v0.3-5-gcb111c03ad7f
   (Some discussion ongoing on the changes, another sync will come shortly)

 - doc: update libstb documentation with POWER9 changes.
   See: :ref:`stb-overview`.

   POWER9 changes reflected in the libstb:

     - bumped ibm,secureboot node to v2
     - added ibm,cvc node
     - hash-algo superseded by hw-key-hash-size

 - libstb/cvc: update memory-region to point to /reserved-memory

   The linux documentation, reserved-memory.txt, says that memory-region is
   a phandle that pairs to a children of /reserved-memory.

   This updates /ibm,secureboot/ibm,cvc/memory-region to point to
     /reserved-memory/secure-crypt-algo-code instead of
     /ibm,hostboot/reserved-memory/secure-crypt-algo-code.
 - libstb: add support for ibm,secureboot-v2

   ibm,secureboot-v2 changes:

     - The Container Verification Code is represented by the ibm,cvc node.
     - Each ibm,cvc child describes a CVC service.
     - hash-algo is superseded by hw-key-hash-size.
 - hdata/tpmrel.c: add ibm, cvc device tree node

   In P9, the Container Verification Code is stored in a hostboot reserved
   memory and the list of provided CVC services is stored in the
   TPMREL_IDATA_HASH_VERIF_OFFSETS idata array. Each CVC service has an
   offset and version.

   This adds the ibm,cvc device tree node and its documentation.
 - hdata/tpmrel.c: add firmware event log info to the tpm node

   This parses the firmware event log information from the
   secureboot_tpm_info HDAT structure and add it to the tpm device tree
   node.

   There can be multiple secureboot_tpm_info entries with each entry
   corresponding to a master processor that has a tpm device, however,
   multiple tpm is not supported.
 - hdata/spira: add ibm,secureboot node in P9

   In P9, skiboot builds the device tree from the HDAT. These are the
   "ibm,secureboot" node changes compared to P8:

     - The Container-Verification-Code (CVC), a.k.a. ROM code, is no longer
       stored in a secure ROM with static address. In P9, it is stored in a
       hostboot reserved memory and each service provided also has a version,
       not only an offset.
     - The hash-algo property is not provided via HDAT, instead it provides
       the hw-key-hash-size, which is indeed the information required by the
       CVC to verify containers.

   This parses the iplparams_sysparams HDAT structure and creates the
   "ibm,secureboot", which is bumped to "ibm,secureboot-v2".

   In "ibm,secureboot-v2":

     - hash-algo property is superseded by hw-key-hash-size.
     - container verification code is explicitly described by a child node.
       Added in a subsequent patch.

   See :ref:`device-tree/ibm,secureboot` for documentation.
 - libstb/tpm_chip.c: define pr_fmt and fix messages logged

   This defines pr_fmt and also fix messages logged:

     - EV_SEPARATOR instead of 0xFFFFFFFF
     - when an event is measured it also prints the tpm id, event type and
       event log length

   Now we can filter the messages logged by libstb and its
   sub-modules by running: ::

     grep STB /sys/firmware/opal/msglog
 - libstb/tss: update the list of event types supported

   Skiboot, precisely the tpmLogMgr, initializes the firmware event log by
   calculating its length so that a new event can be recorded without
   exceeding the log size. In order to calculate the size, it walks through
   the log until it finds a specific event type. However, if the log has
   an unknown event type, the tpmLogMgr will not be able to reach the end
   of the log.

   This updates the list of event types with all of those supported by
   hostboot. Thus, skiboot can properly calculate the event log length.
 - tpm_i2c_nuvoton: add nuvoton, npct601 to the compatible property

   The linux kernel doesn't have a driver compatible with
   "nuvoton,npct650", but it does have for "nuvoton,npct601", which should
   also be compatible with npct650.

   This adds "nuvoton,npct601" to the compatible devtree property.
 - libstb/trustedboot.c: import stb_final() from stb.c

   The stb_final() primary goal is to measure the event EV_SEPARATOR
   into PCR[0-7] when trusted boot is about to exit the boot services.

   This imports the stb_final() from stb.c into trustedboot.c, but making
   the following changes:

     - Rename it to trustedboot_exit_boot_services().
     - As specified in the TCG PC Client specification, EV_SEPARATOR events must
       be logged with the name 0xFFFFFF.
     - Remove the ROM driver clean-up call.
     - Don't allow code to be measured in skiboot after
       trustedboot_exit_boot_services() is called.
 - libstb/cvc.c: import softrom behaviour from drivers/sw_driver.c

   Softrom is used only for testing with mambo. By setting
   compatible="ibm,secureboot-v1-softrom" in the "ibm,secureboot" node,
   firmware images can be properly measured even if the
   Container-Verification-Code (CVC) is not available. In this case, the
   mbedtls_sha512() function is used to calculate the sha512 hash of the
   firmware images.

   This imports the softrom behaviour from libstb/drivers/sw_driver.c code
   into cvc.c, but now softrom is implemented as a flag. When the flag is
   set, the wrappers for the CVC services work the same way as in
   sw_driver.c.
 - libstb/trustedboot.c: import tb_measure() from stb.c

   This imports tb_measure() from stb.c, but now it calls the CVC sha512
   wrapper to calculate the sha512 hash of the firmware image provided.

   In trustedboot.c, the tb_measure() is renamed to trustedboot_measure().

   The new function, trustedboot_measure(), no longer checks if the
   container payload hash calculated at boot time matches with the hash
   found in the container header. A few reasons:

   - If the system admin wants the container header to be
     checked/validated, the secure boot jumper must be set. Otherwise,
     the container header information may not be reliable.
   - The container layout is expected to change over time. Skiboot
     would need to maintain a parser for each container layout
     change.
   - Skiboot could be checking the hash against a container version that
     is not supported by the Container-Verification-Code (CVC).

     The tb_measure() calls are updated to trustedboot_measure() in a
     subsequent patch.
 - libstb/secureboot.c: import sb_verify() from stb.c

   This imports the sb_verify() function from stb.c, but now it calls the
   CVC verify wrapper in order to verify signed firmware images. The
   hw-key-hash and hw-key-hash-size initialized in secureboot.c are passed
   to the CVC verify function wrapper.

   In secureboot.c, the sb_verify() is renamed to secureboot_verify(). The
   sb_verify() calls are updated in a subsequent patch.

 XIVE
 ----
 - xive: Don't bother cleaning up disabled EQs in reset

   Additionally, warn if we find an enabled one that isn't one
   of the firmware built-in queues.
 - xive: Warn on valid VPs found in abnormal cases

   If an allocated VP is left valid at xive_reset() or Linux tries
   to free a valid (enabled) VP block, print errors. The former happens
   occasionally if kdump'ing while KVM is running so keep it as a debug
   message. The latter is a programming error in Linux so use a an
   error log level.
 - xive: Properly reserve built-in VPs in non-group mode

   This is not normally used but if the #define is changed to
   disable block group mode we would incorrectly clear the
   buddy completely without marking the built-in VPs reserved.
 - xive: Quieten debug messages in standard builds

   This makes a bunch of messages, especially the per-CPU ones,
   only enabled in debug builds. This avoids clogging up the
   OPAL logs with XIVE related messages that have proven not
   being particularly useful for field defects.
 - xive: Implement "single escalation" feature

   This adds a new VP flag to control the new DD2.0
   "single escalation" feature.

   This feature allows us to have a single escalation
   interrupt per VP instead of one per queue.

   It works by hijacking queue 7 (which is this no longer
   usable when that is enabled) and exploiting two new
   hardware bits that will:

   - Make the normal queues (0..6) escalate unconditionally
     thus ignoring the ESe bits.
   - Route the above escalations to queue 7
   - Have queue 7 silently escalate without notification

   Thus the escalation of queue 7 becomes the one escalation
   interrupt for all the other queues.
 - xive: When disabling a VP, wipe all of its settings
 - xive: Improve cleaning up of EQs

   Factors out the function that sets an EQ back to a clean
   state and add a cleaning pass for queue left enabled
   when freeing a block of VPs.
 - xive: When disabling an EQ, wipe all of its settings

   This avoids having configuration bits left over
 - xive: Define API for single-escalation VP mode

   This mode allows all queues of a VP to use the same
   escalation interrupt, at the cost of losing priority 7.

   This adds the definition and documentation of the API,
   the implementation will come next.
 - xive: Fix ability to clear some EQ flags

   We could never clear "unconditional notify" and "escalate"
 - xive: Update inits for DD2.0

   This updates some inits based on information from the HW
   designers. This includes enabling some new DD2.0 features
   that we don't yet exploit.
 - xive: Ensure VC informational FIRs are masked

   Some HostBoot versions leave those as checkstop, they are harmless
   and can sometimes occur during normal operations.
 - xive: Fix occasional VC checkstops in xive_reset

   The current workaround for the scrub bug described in
   __xive_cache_scrub() has an issue in that it can leave
   dirty invalid entries in the cache.

   When cleaning up EQs or VPs during reset, if we then
   remove the underlying indirect page for these entries,
   the XIVE will checkstop when trying to flush them out
   of the cache.

   This replaces the existing workaround with a new pair of
   workarounds for VPs and EQs:

   - The VP one does the dummy watch on another entry than
     the one we scrubbed (which does the job of pushing old
     stores out) using an entry that is known to be backed by
     a permanent indirect page.
   - The EQ one switches to a more efficient workaround
     which consists of doing a non-side-effect ESB load from
     the EQ's ESe control bits.
 - xive: Do not return a trigger page for an escalation interrupt

   This is bogus, we don't support them. (Thankfully the callers
   didn't actually try to use this on escalation interrupts).
 - xive: Mark a freed IRQs IVE as valid and masked

   Removing the valid bit means a FIR will trip if it's accessed
   inadvertently. Under some circumstances, the XIVE will speculatively
   access an IVE for a masked interrupt and trip it. So make sure that
   freed entries are still marked valid (but masked).

 PCI
 ---

 - pci: Shared slot state synchronisation for hot reset

   When a device is shared between two PHBs, it doesn't get reset properly
   unless both PHBs issue a hot reset at "the same time".  Practically this
   means a hot reset needs to be issued on both sides, and neither should
   bring the link up until the reset on both has completed.
 - pci: Track peers of slots

   Witherspoon introduced a new concept where one physical slot is shared
   between two PHBs.  Making a slot aware of its peer enables syncing
   between them where necessary.

 PHB4
 ----
 - phb4: Change PCI MMIO timers

   Currently we have a mismatch between the NCU and PCI timers for MMIO
   accesses. The PCI timers must be lower than the NCU timers otherwise
   it may cause checkstops.

   This changes PCI timeouts controlled by skiboot to 33-50ms. It should
   be forwards and backwards compatible with expected hostboot changes to
   the NCU timer.
 - phb4: Change default GEN3 lane equalisation setting to 0x54

   Currently our GEN3 lane equalisation settings are set to 0x77. Change
   this to 0x54. This change will allow us to train at GEN3 in a shorter
   time and more consistently.

   This setting gives us a TX preset 0x4 and RX hint 0x5. This gives a
   boost in gain for high frequency signalling. It allows the most optimal
   continuous time linear equalizers (CTLE) for the remote receiver port
   and de-emphasis and pre-shoot for the remote transmitter port.

   Machine Readable Workbooks (MRW) are moving to this new value also.
 - phb4: Init changes

   These init changes for phb4 from the HW team.

   Link down are now endpoint recoverable (ERC) rather than PHB fatal
   errors.

   BLIF Completion Timeout Error now generate an interrupt rather than
   causing freeze events.
 - phb4: Fix lane equalisation setting

   Fix cut and paste from phb3. The sizes have changes now we have GEN4,
   so the check here needs to change also

   Without this we end up with the default settings (all '7') rather
   than what's in HDAT.
 - hdata: Fix copying GEN4 lane equalisation settings

   These aren't copied currently but should be.
 - phb4: Fix PE mapping of M32 BAR

   The M32 BAR is the PHB4 region used to map all the non-prefetchable
   or 32-bit device BARs. It's supposed to have its segments remapped
   via the MDT and Linux relies on that to assign them individual PE#.

   However, we weren't configuring that properly and instead used the
   mode where PE# == segment#, thus causing EEH to freeze the wrong
   device or PE#.
 - phb4: Fix lost bit in PE number on config accesses

   A PE number can be up to 9 bits, using a uint8_t won't fly..

   That was causing error on config accesses to freeze the
   wrong PE.
 - phb4: Update inits

   New init value from HW folks for the fence enable register.

   This clears bit 17 (CFG Write Error CA or UR response) and bit 22 (MMIO Write
   DAT_ERR Indication) and sets bit 21 (MMIO CFG Pending Error)

 CAPI
 ----

 - capi: Disable CAPP virtual machines

   When exercising more than one CAPI accelerators simultaneously in
   cache coherency mode, the verification team is seeing a deadlock. To
   fix this a workaround of disabling CAPP virtual machines is
   suggested. These 'virtual machines' let PSL queue multiple CAPP
   commands for servicing by CAPP there by increasing
   throughput. Below is the error scenario described by the h/w team:

   " With virtual machines enabled we had a deadlock scenario where with 2
   or more CAPI's in a system you could get in a deadlock scenario due to
   cast-outs that are required break the deadlock (evict lines that
   another CAPI is requesting) get stuck in the virtual machine queue by
   a command ahead of it that is being retried by the same scenario in
   the other CAPI. "

 - capi: Perform capp recovery sequence only when PBCQ is idle

   Presently during a CRESET the CAPP recovery sequence can be executed
   multiple times in case PBCQ on the PEC is still busy processing in/out
   bound in-flight transactions.
 - xive: Mask MMIO load/store to bad location FIR

   For opencapi, the trigger page of an interrupt is mapped to user
   space. The intent is to write the page to raise an interrupt but
   there's nothing to prevent a user process from reading it, which has
   the unfortunate consequence of checkstopping the system.

   Mask the FIR bit raised when an MMIO operation targets an invalid
   location. It's the recommendation from recent documentation and
   hostboot is expected to mask it at some point. In the meantime, let's
   play it safe.
 - phb4: Dump CAPP error registers when it asserts link down

   This patch introduces a new function phb4_dump_app_err_regs() that
   dumps CAPP error registers in case the PEC nestfir register indicates
   that the fence was due to a CAPP error (BIT-24).

   Contents of these registers are helpful in diagnosing CAPP
   issues. Registers that are dumped in phb4_dump_app_err_regs() are:

     * CAPP FIR Register
     * CAPP APC Master Error Report Register
     * CAPP Snoop Error Report Register
     * CAPP Transport Error Report Register
     * CAPP TLBI Error Report Register
     * CAPP Error Status and Control Register
 - capi: move the acknowledge of the HMI interrupt

   We need to acknowledge an eventual HMI initiated by the previous forced
   fence on the PHB to work around a non-existent PE in the phb4_creset()
   function.
   For this reason do_capp_recovery_scoms() is called now at the
   beginning of the step: PHB4_SLOT_CRESET_WAIT_CQ
 - capi: update ci store buffers and dma engines

   The number of read (APC type traffic) and mmio store (MSG type traffic)
   resources assigned to the CAPP is controlled by the CAPP control
   register.

   According to the type of CAPI cards present on the server, we have to
   configure differently the CAPP messages and the DMA read engines given
   to the CAPP for use.

 HMI
 ---
 - core/hmi: Display chip location code while displaying core FIR.
 - core/hmi: Do not display FIR details if none of the bits are set.

   So that we don't flood OPAL console logs with information that is not
   useful.
 - opal/hmi: HMI logging with location code info.

   Add few HMI debug prints with location code info few additional info.

   No functionality change.

   With this patch the log messages will look like: ::

     [210612.175196744,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000
     [210612.175200449,7] HMI: [Loc: UOPWR.1302LFA-Node0-Proc1]: P:8 C:16 T:1: TFMR(2d12000870e04020) Timer Facility Error

     [210660.259689526,7] HMI: Received HMI interrupt: HMER = 0x2040000000000000
     [210660.259695649,7] HMI: [Loc: UOPWR.1302LFA-Node0-Proc0]: P:0 C:16 T:1: Processor recovery Done.

 - core/hmi: Use pr_fmt macro for tagging log messages

   No functionality changes.
 - opal: Get chip location code

   and store it under proc_chip for quick reference during HMI handling
   code.

 Sensors
 -------
 - occ-sensors: Fix up quad/gpu location mix-up

   The GPU and QUAD sensor location types are swapped compared to what
   exists in the OCC code base which is authoritative. Fix them up.
 - sensors: occ: Skip counter type of sensors

   Don't add counter type of sensors to device-tree as they don't
   fit into hwmon sensor interface.
 - sensors: dts: Assert special wakeup on idle cores while reading temperature

   In P9, when a core enters a stop state, its clocks will be stopped
   to save power and hence we will not be able to perform a SCOM
   operation to read the DTS temperature sensor.  Hence, assert
   a special wakeup on cores that have entered a stop state in order to
   successfully complete the SCOM operation.
 - sensors: occ: Skip power sensors with zero sample value

   APSS is not available on platforms like Zaius, Romulus where OCC
   can only measure Vdd (core) and Vdn (nest) power from the AVSbus
   reading. So all the sensors for APSS channels will be populated
   with 0. Different component power sensors like system, memory
   which point to the APSS channels will also be 0.

   As per OCC team (Martha Broyles) zeroed power sensor means that the
   system doesn't have it. So this patch filters out these sensors.
 - sensors: occ: Skip GPU sensors for non-gpu systems
 - sensors: Fix dtc warning for new occ in-band sensors.

   dtc complains about missing reg property when a DT node is having a
   unit name or address but no reg property. ::

     /ibm,opal/sensors/vrm-in@c00004 has a unit name, but no reg property
     /ibm,opal/sensors/gpu-in@c0001f has a unit name, but no reg property
     /ibm,opal/sensor-groups/occ-js@1c00040 has a unit name, but no reg property

   This patch fixes these warnings for new occ in-band sensors and also for
   sensor-groups by adding necessary properties.
 - sensors: Fix dtc warning for dts sensors.

   dtc complains about missing reg property when a DT node is having a
   unit name or address but no reg property.

   Example warning for core dts sensor: ::

     /ibm,opal/sensors/core-temp@5c has a unit name, but no reg property
     /ibm,opal/sensors/core-temp@804 has a unit name, but no reg property

   This patch fixes this by adding necessary properties.
 - hw/occ: Fix psr cpu-to-gpu sensors node dtc warning.

   dtc complains about missing reg property when a DT node is having a
   unit name or address but no reg property. ::

     /ibm,opal/power-mgt/psr/cpu-to-gpu@0 has a unit name, but no reg property
     /ibm,opal/power-mgt/psr/cpu-to-gpu@100 has a unit name, but no reg property

   This patch fixes this by adding necessary properties.

 General fixes
 -------------
 - lpc: Clear pending IRQs at boot

   When we come in from hostboot the LPC master has the bus reset indicator
   set. This error isn't handled until the host kernel unmasks interrupts,
   at which point we get the following spurious error: ::

     [   20.053560375,3] LPC: Got LPC reset on chip 0x0 !
     [   20.053564560,3] LPC[000]: Unknown LPC error Error address reg: 0x00000000

   Fix this by clearing the various error bits in the LPC status register
   before we initialise the skiboot LPC bus driver.
 - hw/imc: Check ucode state before exposing units to Linux

   disable_unavailable_units() checks whether the ucode
   is in the running state before enabling the nest units
   in the device tree. From a recent debug, it is found
   that on some system boot, ucode is not loaded and
   running in all the chips in the system. And this
   caused a fail in OPAL_IMC_COUNTERS_STOP call where
   we check for ucode state on each chip. Bug here is
   that disable_unavailable_units() checks the state
   of the ucode only in boot cpu chip. Patch adds a
   condition in disable_unavailable_units() to check
   for the ucode state in all the chip before enabling
   the nest units in the device tree node.

 - hdata/vpd: Add vendor property

   ibm,vpd blob contains VN field. Use that to populate vendor property
   for various FRU's.
 - hdata/vpd: Fix DTC warnings

   All the nodes under the vpd hierarchy have a unit address (their SLCA
   index) but no reg properties. Add them and their size/address cells
   to squash the warnings.
 - HDAT/i2c: Fix SPD EEPROM compatible string

   Hostboot doesn't give us accurate information about the DIMM SPD
   devices. Hack around by assuming any EEPROM we find on the SPD I2C
   master is an SPD EEPROM.
 - hdata/i2c: Fix 512Kb EEPROM size

   There's no such thing as a 412Kb EEPROM.
 - libflash/mbox-flash: fall back to requesting lower MBOX versions from BMC

   Some BMC mbox implementations seem to sometimes mysteriously fail when trying
   to negotiate v3 when they only support v2. To work around this, we
   can fall back to requesting lower mbox protocol versions until we find
   one that works.

   In theory, this should already "just work", but we have a counter example,
   which this patch fixes.
 - IPMI: Fix platform.cec_reboot() null ptr checks

   Kudos to Hugo Landau who reported this in:
   https://github.com/open-power/skiboot/issues/142
 - hdata: Add location code property to xscom node

   This patch adds chip location code property to xscom node.
 - p8-i2c: Limit number of retry attempts

   Current we will attempt to start an I2C transaction until it succeeds.
   In the event that the OCC does not release the lock on an I2C bus this
   results in an async token being held forever and the kernel thread that
   started the transaction will block forever while waiting for an async
   completion message. Fix this by limiting the number of attempts to
   start the transaction.
 - p8-i2c: Don't write the watermark register at init

   On P9 the I2C master is shared with the OCC. Currently the watermark
   values are set once at init time which is bad for two reasons:

   a) We don't take the OCC master lock before setting it. Which
      may cause issues if the OCC is currently using the master.
   b) The OCC might change the watermark levels and we need to reset
      them.

   Change this so that we set the watermark value when a new transaction
   is started rather than at init time.
 - hdata: Rename 'fsp-ipl-side' as 'sp-ipl-side'

   as OPAL is building device tree for both FSP and BMC system.
   Also I don't see anyone using this property today. Hence renaming
   should be fine.
 - hdata/vpd: add support for parsing CPU VRML records

   Allows skiboot to parse out the processor part/serial numbers
   on OpenPOWER P9 machines.
 - core/lock: Introduce atomic cmpxchg and implement try_lock with it

   cmpxchg will be used in a subsequent change, and this reduces the
   amount of asm code.
 - direct-controls: add xscom error handling for p8

   Add xscom checks which will print something useful and return error
   back to callers (which already have error handling plumbed in).
 - direct-controls: p8 implementation of generic direct controls

   This reworks the sreset functionality that was brought over from
   fast-reboot, and fits it under the generic direct controls APIs.

   The fast reboot APIs are implemented using generic direct controls,
   which also makes them available on p9.
 - fast-reboot: allow mambo fast reboot independent of CPU type

   Don't tie mambo fast reboot to POWER8 CPU type.
 - fast-reboot: remove delay after sreset

   There is a 100ms delay when targets reach sreset which does not appear
   to have a good purpose. Remove it and therefore reduce the sreset timeout
   by the same amount.
 - fast-reboot: add more barriers around cpu state changes

   This is a bit of paranoia, but when a CPU changes state to signal it
   has reached a particular point, all previous stores should be visible.
 - fast-reboot: add sreset timeout detection and handling

   Have the initiator wait for all its sreset targets to call in, and
   time out after 200ms if they did not. Fail and revert to IPL reboot.

   Testing indicates that after successful sreset_all_others(), it
   takes less than 102ms (in hundreds of fast reboots) for secondaries
   to call in. 100 of that is due to an initial delay, but core
   un-splitting was not measured.
 - fast-reboot: make spin loops consistent and SMT friendly
 - fast-reboot: add sreset_all_others error handling

   Pass back failures from sreset_all_others, also change return codes to
   OPAL form in sreset_all_prepare to match.

   Errors will revert to the IPL path, so it's not critical to completely
   clean up everything if that would complicate things. Detecting the
   error and failing is the important thing.
 - fast-reboot: restore SMT priority on spin loop exit
 - Add documentation for ibm, firmware-versions device tree node
 - NX: Print read xscom config failures.

   Currently in NX, only write xscom config failures are tracing.
   Add trace statements for read xscom config failures too.
   No functional changes.
 - hw/nx: Fix NX BAR assignments

   The NX rng BAR is used by each core to source random numbers for the
   DARN instruction. Currently we configure each core to use the NX rng of
   the chip that it exists on. Unfortunately, the NX can be de-configured by
   hostboot and in this case we need to use the NX of a different chip.

   This patch moves the BAR assignments for the NX into the normal nx-rng
   init path. This lets us check if the normal (chip local) NX is active
   when configuring which NX a core should use so that we can fall back
   gracefully.
 - FSP-elog: Reduce verbosity of elog messages

   These messages just fill up the opal console log with useless messages
   resulting in us losing useful information.

   They have been like this since the first commit in skiboot. Make them
   trace.
 - core/bitmap: fix bitmap iteration limit corruption

   The bitmap iterators did not reduce the number of bits to scan
   when searching for the next bit, which would result in them
   overrunning their bitmap.

   These are only used in one place, in xive reset, and the effect
   is that the xive reset code will keep zeroing memory until it
   reaches a block of memory of MAX_EQ_COUNT >> 3 bits in length,
   all zeroes.
 - hw/imc: always enable "imc_nest_chip" exports property

   imc_dt_update_nest_node() adds a "imc_nest_chip" property
   to the "exports" node (under opal_node) to view nest counter
   region. This comes handy when debugging ucode runtime
   errors (like counter data update or control block update
   so on...). And current code enables the property only if
   the microcode is in running state at system boot. To aid
   the debug of ucode not running/starting issues at boot,
   enable the addition of "imc_nest_chip" property always.

 NVLINK2
 -------

 - npu2-hw-procedures.c: Correct phy lane mapping

   Each NVLINK2 device is associated with a particular group of OBUS lanes via
   a lane mask which is read from HDAT via the device-tree. However Skiboot's
   interpretation of lane mask was different to what is exported from the
   HDAT.

   Specifically the lane mask bits in the HDAT are encoded in IBM bit ordering
   for a 24-bit wide value. So for example in normal bit ordering lane-0 is
   represented by having lane-mask bit 23 set and lane-23 is represented by
   lane-mask bit 0. This patch alters the Skiboot interpretation to match what
   is passed from HDAT.

 - npu2-hw-procedures.c: Power up lanes during ntl reset

   Newer versions of Hostboot will not power up the NVLINK2 PHY lanes by
   default. The phy_reset procedure already powers up the lanes but they also
   need to be powered up in order to access the DL.

   The reset_ntl procedure is called by the device driver to bring the DL out
   of reset and get it into a working state. Therefore we also need to add
   lane and clock power up to the reset_ntl procedure.
 - npu2.c: Add PE error detection

   Invalid accesses from the GPU can cause a specific PE to be frozen by the
   NPU. Add an interrupt handler which reports the frozen PE to the operating
   system via as an EEH event.
 - npu2.c: Fix XIVE IRQ alignment
 - npu2: hw-procedures: Refactor reset_ntl procedure

   Change the implementation of reset_ntl to match the latest programming
   guide documentation.
 - npu2: hw-procedures: Add phy_rx_clock_sel()

   Change the RX clk mux control to be done by software instead of HW. This
   avoids glitches caused by changing the mux setting.
 - npu2: hw-procedures: Change phy_rx_clock_sel values

   The clock selection bits we set here are inputs to a state machine.

   DL clock select (bits 30-31)

   0b00
     lane 0 clock
   0b01
     lane 7 clock
   0b10
     grid clock
   0b11
     invalid/no-op

   To recover from a potential glitch, we need to ensure that the value we
   set forces a state change. Our current sequence is to set 0x3 followed
   by 0x1. With the above now known, that is actually a no-op followed by
   selection of lane 7. Depending on lane reversal, that selection is not a
   state change for some bricks.

   The way to force a state change in all cases is to switch to the grid
   clock, and then back to a lane.
 - npu2: hw-procedures: Manipulate IOVALID during training

   Ensure that the IOVALID bit for this brick is raised at the start of
   link training, in the reset_ntl procedure.

   Then, to protect us from a glitch when the PHY clock turns off or gets
   chopped, lower IOVALID for the duration of the phy_reset and
   phy_rx_dccal procedures.
 - npu2: hw-procedures: Add check_credits procedure

   As an immediate mitigation for a current hardware glitch, add a procedure
   that can be used to validate NTL credit values. This will be called as a
   safeguard to check that link training succeeded.

   Assert that things are exactly as we expect, because if they aren't, the
   system will experience a catastrophic failure shortly after the start of
   link traffic.
 - npu2: Print bdfn in NPU2DEV* logging macros

   Revise the NPU2DEV{DBG,INF,ERR} logging macros to include the device's
   bdfn. It's useful to know exactly which link we're referring to.

     For instance, instead of ::

       [  234.044921238,6] NPU6: Starting procedure reset_ntl
       [  234.048578101,6] NPU6: Starting procedure reset_ntl
       [  234.051049676,6] NPU6: Starting procedure reset_ntl
       [  234.053503542,6] NPU6: Starting procedure reset_ntl
       [  234.057182864,6] NPU6: Starting procedure reset_ntl
       [  234.059666137,6] NPU6: Starting procedure reset_ntl

     we'll get ::

       [  234.044921238,6] NPU6:0:0.0 Starting procedure reset_ntl
       [  234.048578101,6] NPU6:0:0.1 Starting procedure reset_ntl
       [  234.051049676,6] NPU6:0:0.2 Starting procedure reset_ntl
       [  234.053503542,6] NPU6:0:1.0 Starting procedure reset_ntl
       [  234.057182864,6] NPU6:0:1.1 Starting procedure reset_ntl
       [  234.059666137,6] NPU6:0:1.2 Starting procedure reset_ntl
 - npu2: Move to new GPU memory map

   There are three different ways we configure the MCD and memory map.

   1) Old way (current way)
        Skiboot configures the MCD and puts GPUs at 4TB and below
   2) New way with MCD
        Hostboot configures the MCD and skiboot puts GPU at 4TB and above
   3) New way without MCD
        No one configures the MCD and skiboot puts GPU at 4TB and below

   The patch keeps option 1 and adds options 2 and 3.

   The different configurations are detected using certain scoms (see
   patch).

   Option 1 will go away eventually as it's a configuration that can
   cause xstops or data integrity problems. We are keeping it around to
   support existing hostboot.

   Option 2 supports only 4 GPUs and 512GB of memory per socket.

   Option 3 supports 6 GPUs and 4TB of memory but may have some
   performance impact.
 - phys-map: Rename GPU_MEM to GPU_MEM_4T_DOWN

   This map is soon to be replaced, but we are going to keep it around
   for a little while so that we support older hostboot firmware.

 Platform Specific Fixes
 -----------------------

 Witherspoon
 ^^^^^^^^^^^
 - Witherspoon: Remove old Witherspoon platform definition

   An old Witherspoon platform definition was added to aid the transition from
   versions of Hostboot which didn't have the correct NVLINK2 HDAT information
   available and/or planar VPD. These system should now be updated so remove
   the possibly incorrect default assumption.

   This may disable NVLINK2 on old out-dated systems but it can easily be
   restored with the appropriate FW and/or VPD updates. In any case there is a
   a 50% chance the existing default behaviour was incorrect as it only
   supports 6 GPU systems. Using an incorrect platform definition leads to
   undefined behaviour which is more difficult to detect/debug than not
   creating the NVLINK2 devices so remove the possibly incorrect default
   behaviour.
 - Witherspoon: Fix VPD EEPROM type

   There are user-space tools that update the planar VPD via the sysfs
   interface. Currently we do not get correct information from hostboot
   about the exact type of the EEPROM so we need to manually fix it up
   here. This needs to be done as a platform specific fix since there is
   not standardised VPD EEPROM type.

 IBM FSP Systems
 ^^^^^^^^^^^^^^^

 - nvram: Fix 'missing' nvram on FSP systems.

   commit ba4d46fdd9eb ("console: Set log level from nvram") wants to read
   from NVRAM rather early. This works fine on BMC based systems as
   nvram_init() is actually synchronous. This is not true for FSP systems
   and it turns out that the query for the console log level simply
   queries blank nvram.

   The simple fix is to wait for the NVRAM read to complete before
   performing any query. Unfortunately it turns out that the fsp-nvram
   code does not inform the generic NVRAM layer when the read is complete,
   rather, it must be prompted to do so.

   This patch addresses both these problems. This patch adds a check before
   the first read of the NVRAM (for the console log level) that the read
   has completed. The fsp-nvram code has been updated to inform the generic
   layer as soon as the read completes.

   The old prompt to the fsp-nvram code has been removed but a check to
   ensure that the NVRAM has been loaded remains. It is conservative but
   if the NVRAM is not done loading before the host is booted it will not
   have an nvram device-tree node which means it won't be able to access
   the NVRAM at all, ever, even after the NVRAM has loaded.


 Utilities
 ----------

 - Fix xscom-utils distclean target

   In Debian/Ubuntu, the packaging system likes to have a full clean-up that
   restores the tree back to original one, so add some files to the distclean
   target.
 - Add man pages for xscom-utils and pflash

   For the need of Debian/Ubuntu packaging, I inferred some initial man
   pages from their help output.

 gard
 ^^^^
 - gard: Add tests

   I hear Stewart likes these for some reason. Dunno why.
 - gard: Add OpenBMC vPNOR support

   A big-ol-hack to add some checking for OpenBMC's vPNOR GUARD files under
   /media/pnor-prsv. This isn't ideal since it doesn't handle the create
   case well, but it's better than nothing.
 - gard: Always use MTD to access flash

   Direct mode is generally either unsafe or unsupported. We should always
   access the PNOR via an MTD device so make that the default. If someone
   really needs direct mode, then they can use pflash.
 - gard: Fix up do_create return values

   The return value of a subcommand is interpreted as a libflash error code
   when it's positive or some subcommand specific error when negative.
   Currently the create subcommand always returns zero when exiting (even
   for errors) so fix that.
 - gard: Add usage message for -p

   The -p argument only really makes sense when -f is specified. Print an
   actual error message rather than just the usage blob.
 - gard: Fix max instance count

   There's an entire byte for the instance count rather than a nibble. Only
   barf if the instance number is beyond 255 rather than 16.
 - gard: Fix up path parsing

   Currently we assume that the Unit ID can be used as an array index into
   the chip_units[] structure. There are holes in the ID space though, so
   this doesn't actually work. Fix it up by walking the array looking for
   the ID.
 - gard: Set chip generation based on PVR

   Currently we assume that this tool is being used on a P8 system by
   default and allow the user to override this behaviour using the -8 and
   -9 command line arguments. When running on the host we can use the
   PVR to guess what chip generation so do that.

   This also changes the default behaviour to assume that the host is a P9
   when running on an ARM system. This tool didn't even work when compiled
   for ARM until recently and the OpenBMC vPNOR hack that we have currently
   is broken for P9 systems that don't use vPNOR (Zaius and Romulus).
 - gard: Allow records with an ID of 0xffffffff

   We currently assume that a record with an ID of 0xffffffff is invalid.
   Apparently this is incorrect and we should display these records, so
   expand the check to compare the entire record with 0xff rather than
   just the ID.
 - gard: create: Allow creating arbitrary GARD records

   Add a new sub-command that allows us to create GARD records for
   arbitrary chip units. There isn't a whole lot of constraints on this and
   that limits how useful it can be, but it does allow a user to GARD out
   individual DIMMs, chips or cores from the BMC (or host) if needed.

   There are a few caveats though:

   1) Not everything can, or should, have a GARD record applied it to.
   2) There is no validation that the unit actually exists. Doing that
      sort of validation requires something that understands the FAPI
      targeting information (I think) and adding support for it here
      would require some knowledge from the system XML file.
   3) There's no way to get a list of paths in the system.
   4) Although we can create a GARD record at runtime it won't be applied
      until the next IPL.
 - gard: Add path parsing support

   In order to support manual GARD records we need to be able to parse the
   hardware unit path strings. This patch implements that.
 - gard: list: Improve output

   Display the full path to the GARDed hardware unit in each record rather
   than relying on the output of `gard show` and convert do_list() to use
   the iterator while we're here.
 - gard: {list, show}: Fix the Type field in the output

   The output of `gard list` has a field named "Type", however this
   doesn't actually indicate the type of the record. Rather, it
   shows the type of the path used to identify the hardware being
   GARDed. This is of pretty dubious value considering the Physical
   path seems to always be used when referring to GARDed hardware.
 - gard: Add P9 support
 - gard: Update chip unit data

   Source the list of units from the hostboot source rather than the
   previous hard coded list. The list of path element types changes
   between generations so we need to add a level of indirection to
   accommodate P9. This also changes the names used to match those
   printed by Hostboot at IPL time and paves the way to adding support
   for manual GARD record creation.
 - gard: show: Remove "Res Recovery" field

   This field has never been populated by hostboot on OpenPower systems
   so there's no real point in reporting it's contents.

 libflash / pflash
 ^^^^^^^^^^^^^^^^^

 Anybody shipping libflash or pflash to interact with POWER9 systems must
 upgrade to this version.

 - pflash: Support for volatile flag

   The volatile flag was added to the PNOR image to
   indicate partitions that are cleared during a host
   power off. Display this flag from the pflash command.
 - pflash: Support for clean_on_ecc_error flag

   Add the misc flag clear_on_ecc_error to libflash/pflash. This was
   the only missing flag. The generator of the virtual PNOR image
   relies on libflash/pflash to provide the partition information,
   so all flags are needed to build an accurate virtual PNOR partition
   table.
 - pflash: Respect write(2) return values

   The write(2) system call returns the number of bytes written, this is
   important since it is entitled to write less than what we requested.
   Currently we ignore the return value and assume it wrote everything we
   requested. While in practice this is likely to always be the case, it
   isn't actually correct.
 - external/pflash: Fix erasing within a single erase block

   It is possible to erase within a single erase block. Currently the
   pflash code assumes that if the erase starts part way into an erase
   block it is because it needs to be aligned up to the boundary with the
   next erase block.

   Doing an erase smaller than a single erase block will cause underflows
   and looping forever on erase.
 - external/pflash: Fix non-zero return code for successful read when size%256 != 0

   When performing a read the return value from pflash is non-zero, even for
   a successful read, when the size being read is not a multiple of 256.
   This is because do_read_file returns the value from the write system
   call which is then returned by pflash. When the size is a multiple of
   256 we get lucky in that this wraps around back to zero. However for any
   other value the return code is size % 256. This means even when the
   operation is successful the return code will seem to reflect an error.

   Fix this by returning zero if the entire size was read correctly,
   otherwise return the corresponding error code.
 - libflash: Fix parity calculation on ARM

   To calculate the ECC syndrome we need to calculate the parity of a 64bit
   number. On non-powerpc platforms we use the GCC builtin function
   __builtin_parityl() to do this calculation. This is broken on 32bit ARM
   where sizeof(unsigned long) is four bytes. Using __builtin_parityll()
   instead cures this.
 - libflash/mbox-flash: Add the ability to lock flash
 - libflash/mbox-flash: Understand v3
 - libflash/mbox-flash: Use BMC suggested timeout value
 - libflash/mbox-flash: Simplify message sending

   hw/lpc-mbox no longer requires that the memory associated with messages
   exist for the lifetime of the message. Once it has been sent to the BMC,
   that is bmc_mbox_enqueue() returns, lpc-mbox does not need the message
   to continue to exist. On the receiving side, lpc-mbox will ensure that a
   message exists for the receiving callback function.

   Remove all code to deal with allocating messages.
 - hw/lpc-mbox: Simplify message bookkeeping and timeouts

   Currently the hw/lpc-mbox layer keeps a pointer for the currently
   in-flight message for the duration of the mbox call. This creates
   problems when messages timeout, is that pointer still valid, what can we
   do with it. The memory is owned by the caller but if the caller has
   declared a timeout, it may have freed that memory.

   Another problem is locking. This patch also locks around sending and
   receiving to avoid races with timeouts and possible resends. There was
   some locking previously which was likely insufficient - definitely too
   hard to be sure is correct

   All this is made much easier with the previous rework which moves
   sequence number allocation and verification into lpc-mbox rather than
   the caller.
 - libflash/mbox-flash: Allow mbox-flash to tell the driver msg timeouts

   Currently when mbox-flash decides that a message times out the driver
   has no way of knowing to drop the message and will continue waiting for
   a response indefinitely preventing more messages from ever being sent.

   This is a problem if the BMC crashes or has some other issue where it
   won't ever respond to our outstanding message.

   This patch provides a method for mbox-flash to tell the driver how long
   it should wait before it no longer needs to care about the response.
 - libflash/mbox-flash: Move sequence handling to driver level
 - libflash/mbox-flash: Always close windows before opening a new window

   The MBOX protocol states that if an open window command fails then all
   open windows are closed. Currently, if an open window command fails
   mbox-flash will erroneously assume that the previously open window is
   still open.

   The solution to this is to mark all windows as closed before issuing an
   open window command and then on success we'll mark the new window as
   open.
 - libflash/mbox-flash: Add v2 error codes

 opal-prd
 ^^^^^^^^

 Anybody shipping `opal-prd` for POWER9 systems must upgrade `opal-prd` to
 this new version.

 - prd: Log unsupported message type

   Useful for debugging.

   Sample output: ::

       [29155.157050283,7] PRD: Unsupported prd message type : 0xc

 - opal-prd: occ: Add support for runtime OCC load/start in ZZ

   This patch adds support to handle OCC load/start event from FSP/PRD.
   During IPL we send a success directly to FSP without invoking any HBRT
   load routines on receiving OCC load mbox message from FSP. At runtime
   we forward this event to host opal-prd.

   This patch provides support for invoking OCC load/start HBRT routines
   like load_pm_complex() and start_pm_complex() from opal-prd.
 - opal-prd: Add support for runtime OCC reset in ZZ

   This patch handles OCC_RESET runtime events in host opal-prd and also
   provides support for calling 'hostinterface->wakeup()' which is
   required for doing the reset operation.
 - prd: Enable error logging via firmware_request interface

   In P9 HBRT sends error logs to FSP via firmware_request interface.
   This patch adds support to parse error log and send it to FSP.
 - prd: Add generic response structure inside prd_fw_msg

   This patch adds generic response structure. Also sync prd_fw_msg type
   macros with hostboot.
 - opal-prd: flush after logging to stdio in debug mode

   When in debug mode, flush after each log output. This makes it more
   likely that we'll catch failure reasons on severe errors.

 Debugging and reliability improvements
 --------------------------------------

 - lock: Add additional lock auditing code

   Keep track of lock owner name and replace lock_depth counter
   with a per-cpu list of locks held by the cpu.

   This allows us to print the actual locks held in case we hit
   the (in)famous message about opal_pollers being run with a
   lock held.

   It also allows us to warn (and drop them) if locks are still
   held when returning to the OS or completing a scheduled job.
 - Add support for new GCC 7 parametrized stack protector

   This gives us per-cpu guard values as well. For now I just
   XOR a magic constant with the CPU PIR value.
 - Mambo: run hello_world and sreset_world tests with Secure and Trusted Boot

   We *disable* the secure boot part, but we keep the verified boot
   part as we don't currently have container verification code for Mambo.

   We can run a small part of the code currently though.

 - core/flash.c: extern function to get the name of a PNOR partition

   This adds the flash_map_resource_name() to allow skiboot subsystems to
   lookup the name of a PNOR partition. Thus, we don't need to duplicate
   the same information in other places (e.g. libstb).
 - libflash/mbox-flash: only wait for MBOX_DEFAULT_POLL_MS if busy

   This makes the mbox unit test run 300x quicker and seems to
   shave about 6 seconds from boot time on Witherspoon.
 - make check: Make valgrind optional

   To (slightly) lower the barrier for contributions, we can make valgrind
   optional with just a small amount of plumbing.

   This allows make check to run successfully without valgrind.
 - libflash/test: Add tests for mbox-flash

   A first basic set of tests for mbox-flash. These tests do their testing
   by stubbing out or otherwise replacing functions not in
   libflash/mbox-flash.c. The stubbed out version of the function can then
   be used to emulate a BMC mbox daemon talking to back to the code in
   mbox-flash and it can ensure that there is some adherence to the
   protocol and that from a block-level api point of view the world appears
   sane.

   This makes these tests simple to run and they have been integrated into
   `make check`. The down side is that these tests rely on duplicated
   feature incomplete BMC daemon behaviour. Therefore these tests are a
   strong indicator of broken behaviour but a very unreliable indicator of
   correctness.

   Full integration tests with a 'real' BMC daemon are probably beyond the
   scope of this repository.
 - external/test/test.sh: fix VERSION substitution when no tags

   i.e. we get a hash rather than a version number

   This seems to be occurring in Travis if it doesn't pull a tag.
 - external/test: make stripping out version number more robust

   For some bizarre reason, Travis started failing on this
   substitution when there'd been zero code changes in this
   area... This at least papers over whatever the problem is
   for the time being.
 - io: Add load_wait() helper

   This uses the standard form twi/isync pair to ensure a load
   is consumed by the core before continuing. This can be necessary
   under some circumstances for example when having the following
   sequence:

   - Store reg A
   - Load reg A (ensure above store pushed out)
   - delay loop
   - Store reg A

   I.E., a mandatory delay between 2 stores. In theory the first store
   is only guaranteed to reach the device after the load from the same
   location has completed. However the processor will start executing
   the delay loop without waiting for the return value from the load.

   This construct enforces that the delay loop isn't executed until
   the load value has been returned.
 - chiptod: Keep boot timestamps contiguous

   Currently we reset the timebase value to (almost) zero when
   synchronising the timebase of each chip to the Chip TOD network which
   results in this: ::

     [   42.374813167,5] CPU: All 80 processors called in...
     [    2.222791151,5] FLASH: Found system flash: Macronix MXxxL51235F id:0
     [    2.222977933,5] BT: Interface initialized, IO 0x00e4

   This patch modifies the chiptod_init() process to use the current
   timebase value rather than resetting it to zero. This results in the
   timestamps remaining contiguous from the start of hostboot until
   the petikernel starts. e.g. ::

     [   70.188811484,5] CPU: All 144 processors called in...
     [   72.458004252,5] FLASH: Found system flash:  id:0
     [   72.458147358,5] BT: Interface initialized, IO 0x00e4

 - hdata/spira: Add missing newline to prlog() call

   We're missing a \n here.
 - opal/xscom: Add recovery for lost core wakeup SCOM failures.

   Due to a hardware issue where core responding to SCOM was delayed due to
   thread reconfiguration, leaves the SCOM logic in a state where the
   subsequent SCOM to that core can get errors. This is affected for Core
   PC SCOM registers in the range of 20010A80-20010ABF

   The solution is if a xscom timeout occurs to one of Core PC SCOM registers
   in the range of 20010A80-20010ABF, a clearing SCOM write is done to
   0x20010800 with data of '0x00000000' which will also get a timeout but
   clears the SCOM logic errors. After the clearing write is done the original
   SCOM operation can be retried.

   The SCOM timeout is reported as status 0x4 (Invalid address) in HMER[21-23].
 - opal/xscom: Move the delay inside xscom_reset() function.

   So caller of xscom_reset() does not have to bother about adding a delay
   separately. Instead caller can control whether to add a delay or not using
   second argument to xscom_reset().
 - timer: Stop calling list_top() racily

   This will trip the debug checks in debug builds under some circumstances
   and is actually a rather bad idea as we might look at a timer that is
   concurrently being removed and modified, and thus incorrectly assume
   there is no work to do.
 - fsp: Bail out of HIR if FSP is resetting voluntarily

   a. Surveillance response times out and OPAL triggers a HIR
   b. Before the HIR process kicks in, OPAL gets a PSI interrupt indicating link down
   c. HIR process continues and OPAL tries to write to DRCR; PSI link inactive => xstop

   OPAL should confirm that the FSP is not already in reset in the HIR path.
 - sreset_kernel: only run SMT tests due to not supporting re-entry
 - Use systemsim-p9 v1.1
 - direct-controls: enable fast reboot direct controls for mambo

   Add mambo direct controls to stop threads, which is required for
   reliable fast-reboot. Enable direct controls by default on mambo.
 - core/opal: always verify cpu->pir on entry
 - asm/head: add entry/exit calls

   Add entry and exit C functions that can do some more complex
   checks before the opal proper call. This requires saving off
   volatile registers that have arguments in them.
 - core/lock: improve bust_locks

   Prevent try_lock from modifying the lock state when bust_locks is set.
   unlock will not unlock it in that case, so locks will get taken and
   never released while bust_locks is set.
 - hw/occ: Log proper SCOM register names

   This patch fixes the logging of incorrect SCOM
   register names.
 - mambo: Add support for NUMA

   Currently the mambo scripts can do multiple chips, but only the first
   ever has memory.

   This patch adds support for having memory on each chip, with each
   appearing as a separate NUMA node. Each node gets MEM_SIZE worth of
   memory.

   It's opt-in, via ``export MAMBO_NUMA=1``.
 - external/mambo: Switch qtrace command to use plug-ins

   The plug-in seems to be the preferred way to do this now, it works
   better, and the qtracer emitter seems to generate invalid traces
   in new mambo versions.
 - asm/head: Loop after attn

   We use the attn instruction to raise an error in early boot if OPAL
   don't recognise the PVR. It's possible for hostboot to disable the
   attn instruction before entering OPAL so add an extra busy loop after
   the attn to prevent attempting to boot on an unknown processor.