| .. _skiboot-6.3-rc3: |
| |
| skiboot-6.3-rc3 |
| =============== |
| |
| skiboot v6.3-rc3 was released on Thursday May 2nd 2019. It is the third |
| release candidate of skiboot 6.3, which will become the new stable release |
| of skiboot following the 6.2 release, first released December 14th 2018. |
| |
| Skiboot 6.3 will mark the basis for op-build v2.3. I expect to tag the final |
| skiboot 6.3 in the next week (I also predicted this last time, so take my |
| predictions with a large amount of sodium). |
| |
| skiboot v6.3-rc3 contains all bug fixes as of :ref:`skiboot-6.0.19`, |
| and :ref:`skiboot-6.2.3` (the currently maintained |
| stable releases). |
| |
| For how the skiboot stable releases work, see :ref:`stable-rules` for details. |
| |
| Over :ref:`skiboot-6.3-rc2`, we have the following changes: |
| |
| |
| - Expose PNOR Flash partitions to host MTD driver via devicetree |
| |
| This makes it possible for the host to directly address each |
| partition without requiring each application to directly parse |
| the FFS headers. This has been in use for some time already to |
| allow BOOTKERNFW partition updates from the host. |
| |
| All partitions except BOOTKERNFW are marked readonly. |
| |
| The BOOTKERNFW partition is currently exclusively used by the TalosII platform |
| |
| - Write boot progress to LPC port 80h |
| |
| This is an adaptation of what we currently do for op_display() on FSP |
| machines, inventing an encoding for what we can write into the single |
| byte at LPC port 80h. |
| |
| Port 80h is often used on x86 systems to indicate boot progress/status |
| and dates back a decent amount of time. Since a byte isn't exactly very |
| expressive for everything that can go on (and wrong) during boot, it's |
| all about compromise. |
| |
| Some systems (such as Zaius/Barreleye G2) have a physical dual 7 segment |
| display that display these codes. So far, this has only been driven by |
| hostboot (see hostboot commit 90ec2e65314c). |
| |
| - Write boot progress to LPC ports 81 and 82 |
| |
| There's a thought to write more extensive boot progress codes to LPC |
| ports 81 and 82 to supplement/replace any reliance on port 80. |
| |
| We want to still emit port 80 for platforms like Zaius and Barreleye |
| that have the physical display. Ports 81 and 82 can be monitored by a |
| BMC though. |
| |
| - Copy and convert Romulus descriptors to Talos |
| |
| Talos II has some hardware differences from Romulus, therefore |
| we cannot guarantee Talos II == Romulus in skiboot. Copy and |
| slightly modify the Romulus files for Talos II. |
| |
| - npu2: Disable Probe-to-Invalid-Return-Modified-or-Owned snarfing by default |
| |
| V100 GPUs are known to violate NVLink2 protocol in some cases (one is when |
| memory was accessed by the CPU and they by GPU using so called block |
| linear mapping) and issue double probes to NPU which can cope with this |
| problem only if CONFIG_ENABLE_SNARF_CPM ("disable/enable Probe.I.MO |
| snarfing a cp_m") is not set in the CQ_SM Misc Config register #0. |
| If the bit is set (which is the case today), NPU issues the machine |
| check stop. |
| |
| The snarfing feature is designed to detect 2 probes in flight and combine |
| them into one. |
| |
| This adds a new "opal-npu2-snarf-cpm" nvram variable which controls |
| CONFIG_ENABLE_SNARF_CPM for all NVLinks to prevent the machine check |
| stop from happening. |
| |
| This disables snarfing by default as otherwise a broken GPU driver can |
| crash the entire box even when a GPU is passed through to a guest. |
| This provides a dial to allow regression tests (might be useful for |
| a bare metal). To enable snarfing, the user needs to run: :: |
| |
| sudo nvram -p ibm,skiboot --update-config opal-npu2-snarf-cpm=enable |
| |
| and reboot the host system. |
| |
| - hw/npu2: Show name of opencapi error interrupts |
| - core/pci: Use PHB io-base-location by default for PHB slots |
| |
| On witherspoon only the GPU slots and the three pluggable PCI slots |
| (SLOT0, 1, 2) have platform defined slot names. For builtin devices such |
| as the SATA controller or the PLX switch that fans out to the GPU slots |
| we have no location codes which some people consider an issue. |
| |
| This patch address the problem by making the ibm,slot-location-code for |
| the root port device default to the ibm,io-base-location-code which is |
| typically the location code for the system itself. |
| |
| e.g. :: |
| |
| pciex@600c3c0100000/ibm,loc-code |
| "UOPWR.0000000-Node0-Proc0" |
| |
| pciex@600c3c0100000/pci@0/ibm,loc-code |
| "UOPWR.0000000-Node0-Proc0" |
| |
| pciex@600c3c0100000/pci@0/usb-xhci@0/ibm,loc-code |
| "UOPWR.0000000-Node0" |
| |
| The PHB node, and the root complex nodes have a loc code of the |
| processor they are attached to, while the usb-xhci device under the |
| root port has a location code of the system itself. |
| |
| - hw/phb4: Read ibm,loc-code from PBCQ node |
| |
| On P9 the PBCQs are subdivided by stacks which implement the PCI Express |
| logic. When phb4 was forked from phb3 most of the properties that were |
| in the pbcq node moved into the stack node, but ibm,loc-code was not one |
| of them. This patch fixes the phb4 init sequence to read the base |
| location code from the PBCQ node (parent of the stack node) rather than |
| the stack node itself. |
| - hw/xscom: add missing P9P chip name |
| - asm/head: balance branches to avoid link stack predictor mispredicts |
| |
| The Linux wrapper for OPAL call and return is arranged like this: :: |
| |
| __opal_call: |
| mflr r0 |
| std r0,PPC_STK_LROFF(r1) |
| LOAD_REG_ADDR(r11, opal_return) |
| mtlr r11 |
| hrfid -> OPAL |
| |
| opal_return: |
| ld r0,PPC_STK_LROFF(r1) |
| mtlr r0 |
| blr |
| |
| When skiboot returns to Linux, it branches to LR (i.e., opal_return) |
| with a blr. This unbalances the link stack predictor and will cause |
| mispredicts back up the return stack. |
| - external/mambo: also invoke readline for the non-autorun case |
| - asm/head.S: set POWER9 radix HID bit at entry |
| |
| When running in virtual memory mode, the radix MMU hid bit should not |
| be changed, so set this in the initial boot SPR setup. |
| |
| As a side effect, fast reboot also has HID0:RADIX bit set by the |
| shared spr init, so no need for an explicit call. |
| - opal-prd: Fix memory leak in is-fsp-system check |
| - opal-prd: Check malloc return value |
| - hw/phb4: Squash the IO bridge window |
| |
| The PCI-PCI bridge spec says that bridges that implement an IO window |
| should hardcode the IO base and limit registers to zero. |
| Unfortunately, these registers only define the upper bits of the IO |
| window and the low bits are assumed to be 0 for the base and 1 for the |
| limit address. As a result, setting both to zero can be mis-interpreted |
| as a 4K IO window. |
| |
| This patch fixes the problem the same way PHB3 does. It sets the IO base |
| and limit values to 0xf000 and 0x1000 respectively which most software |
| interprets as a disabled window. |
| |
| lspci before patch: :: |
| |
| 0000:00:00.0 PCI bridge: IBM Device 04c1 (prog-if 00 [Normal decode]) |
| I/O behind bridge: 00000000-00000fff |
| |
| lspci after patch: :: |
| |
| 0000:00:00.0 PCI bridge: IBM Device 04c1 (prog-if 00 [Normal decode]) |
| I/O behind bridge: None |
| |
| - build: link with --orphan-handling=warn |
| |
| The linker can warn when the linker script does not explicitly place |
| all sections. These orphan sections are placed according to |
| heuristics, which may not always be desirable. Enable this warning. |
| - build: -fno-asynchronous-unwind-tables |
| |
| skiboot does not use unwind tables, this option saves about 100kB, |
| mostly from .text. |
| - hw/xscom: Enable sw xstop by default on p9 |
| |
| This was disabled at some point during bringup to make life easier for |
| the lab folks trying to debug NVLink issues. This hack really should |
| have never made it out into the wild though, so we now have the |
| following situation occuring in the field: |
| |
| 1) A bad happens |
| 2) The host kernel recieves an unrecoverable HMI and calls into OPAL to |
| request a platform reboot. |
| 3) OPAL rejects the reboot attempt and returns to the kernel with |
| OPAL_PARAMETER. |
| 4) Kernel panics and attempts to kexec into a kdump kernel. |
| |
| A side effect of the HMI seems to be CPUs becoming stuck which results |
| in the initialisation of the kdump kernel taking a extremely long time |
| (6+ hours). It's also been observed that after performing a dump the |
| kdump kernel then crashes itself because OPAL has ended up in a bad |
| state as a side effect of the HMI. |
| |
| All up, it's not very good so re-enable the software checkstop by |
| default. If people still want to turn it off they can using the nvram |
| override. |
| - opal/hmi: Initialize the hmi event with old value of TFMR. |
| |
| Do this before we fix TFAC errors. Otherwise the event at host console |
| shows no thread error reported in TFMR register. |
| |
| Without this patch the console event show TFMR with no thread error: |
| (DEC parity error TFMR[59] injection) :: |
| |
| [ 53.737572] Severe Hypervisor Maintenance interrupt [Recovered] |
| [ 53.737596] Error detail: Timer facility experienced an error |
| [ 53.737611] HMER: 0840000000000000 |
| [ 53.737621] TFMR: 3212000870e04000 |
| |
| After this patch it shows old TFMR value on host console: :: |
| |
| [ 2302.267271] Severe Hypervisor Maintenance interrupt [Recovered] |
| [ 2302.267305] Error detail: Timer facility experienced an error |
| [ 2302.267320] HMER: 0840000000000000 |
| [ 2302.267330] TFMR: 3212000870e14010 |