commit | 316f99bdb4e0911c2d3970a8ca23f30101dba57a | |
---|---|---|
author | Vasant Hegde <hegdevasant@linux.vnet.ibm.com> | Fri Jun 09 22:49:05 2017 +0530 |
committer | Stewart Smith <stewart@linux.vnet.ibm.com> | Wed Jun 14 15:58:19 2017 +1000 |
tree | 6ed41b1d0d7c997fcf8ee35755c66ba6a3e7ad12 | |
parent | c7e1d072cc18088a5ba12779f5f11b97bb886723 | |
FSP/CONSOLE: Workaround for unresponsive ipmi daemon

We use a TCE mapped area to write data to the console. The console header (fsp_serbuf_hdr) is modified by both FSP and OPAL: OPAL updates the next_in pointer in fsp_serbuf_hdr and FSP updates the next_out pointer.

The kernel makes the opal_console_write() OPAL call to write data to the console. OPAL writes the data to the TCE mapped area and sends an MBOX command to FSP. If the console buffer is full and we have data to write, we keep waiting until FSP reads data.

In some corner cases, where FSP is active but not responding to console MBOX messages (due to a buggy IPMI daemon) and the kernel is writing heavily to the console, the console buffer eventually becomes full. At this point OPAL starts returning OPAL_BUSY_EVENT to the kernel, and the kernel keeps retrying. This causes kernel soft lockups. In extreme cases, when every CPU is trying to write to the console, the user cannot ssh in and thinks the system has hung. If we reset FSP or restart the IPMI daemon on FSP, the system recovers and everything returns to normal.

This patch adds a workaround for the above issue by returning OPAL_HARDWARE when the console is full. A side effect of this patch is that we may end up dropping the latest console data, but it is better to drop console data than to hang the system.

An alternative approach is to drop old data from the console buffer to make space for new data. But under normal conditions only FSP can update the 'next_out' pointer, and if we touch that pointer it may introduce other race conditions. Hence we decided to just drop the new console write request.

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
(cherry picked from commit c8a7535f3539c79955645e6b3714b367a994b1e9)
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
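
The behaviour described in the commit message can be illustrated with a small ring buffer. The following is only a minimal sketch under stated assumptions, not the code from the patch: the structure, the function names, the buffer size and the return code values (everything prefixed with sketch_ or SKETCH_) are hypothetical stand-ins, and the real fsp_serbuf_hdr layout and OPAL return codes are not reproduced here. It shows a producer that advances only next_in, never touches next_out, and rejects a write with a hardware-style error when no space is left instead of asking the caller to retry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical return codes standing in for OPAL's; the values are illustrative. */
enum { SKETCH_SUCCESS = 0, SKETCH_HARDWARE = -1 };

/* Hypothetical buffer size, kept tiny so the behaviour is easy to follow. */
#define SKETCH_BUF_SIZE 16u

/* Simplified stand-in for a shared console ring buffer. */
struct sketch_serbuf {
    uint32_t next_in;   /* advanced only by the producer (OPAL in the commit) */
    uint32_t next_out;  /* advanced only by the consumer (FSP in the commit) */
    uint8_t  data[SKETCH_BUF_SIZE];
};

/* Free bytes; one slot stays empty so next_in == next_out means "empty". */
static uint32_t sketch_free_space(const struct sketch_serbuf *sb)
{
    return (sb->next_out + SKETCH_BUF_SIZE - sb->next_in - 1) % SKETCH_BUF_SIZE;
}

/*
 * Write up to *len bytes. On return *len holds the number of bytes accepted.
 * When the buffer is full the new data is dropped and SKETCH_HARDWARE is
 * returned, so the caller does not spin waiting for a stalled consumer.
 */
static int sketch_console_write(struct sketch_serbuf *sb,
                                const uint8_t *buf, size_t *len)
{
    uint32_t space;
    size_t n, i;

    assert(sb && buf && len);

    space = sketch_free_space(sb);
    if (space == 0) {
        *len = 0;
        return SKETCH_HARDWARE;
    }

    n = *len < space ? *len : space;
    for (i = 0; i < n; i++) {
        sb->data[sb->next_in] = buf[i];
        sb->next_in = (sb->next_in + 1) % SKETCH_BUF_SIZE;
    }
    *len = n;
    return SKETCH_SUCCESS;
}
```

In this sketch the producer never modifies next_out, which mirrors the reasoning in the commit message: discarding old data would require the writer to move a pointer that only the reader normally updates, so the newest data is dropped instead.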
Firmware for OpenPower systems.
Source: https://github.com/open-power/skiboot
Mailing list: skiboot@lists.ozlabs.org
Info/subscribe: https://lists.ozlabs.org/listinfo/skiboot
Archives: https://lists.ozlabs.org/pipermail/skiboot/
Patchwork: http://patchwork.ozlabs.org/project/skiboot/list/
OPAL firmware (OpenPower Abstraction Layer) comes in several parts.
A simplified flow of what happens when the power button is pressed is:
Here, the OPAL image is three parts:
They may all be part of one payload or three separate images (depending on platform).
The bootloader will kexec a host kernel (probably Linux). The host OS can make OPAL calls. The OPAL API is documented in doc/opal-api/ (there are missing parts, patches are welcome!).
See doc/overview.txt for a more in-depth overview of skiboot.
You can build on a Linux host. Modern Debian and Ubuntu are known to be suitable. Building and testing on x86 is fine. You do not need a POWER host to build and test skiboot.
You will need a C compiler for big endian ppc64. If your distro does not provide one, crosstool built compilers work well: https://www.kernel.org/pub/tools/crosstool/
You should then be able to build and run the tests with the following commands (where 4 is the number of CPU cores on your machine):
make -j4
make -j4 check
If using crosstool compilers, add /opt/cross/gcc-4.8.0-nolibc/powerpc64-linux/bin/ to your PATH.
If using packaged cross compilers on Ubuntu, you may need to set the following environment variable: CROSS=powerpc-linux-gnu-
To test in a simulator, install the IBM POWER8 Functional Simulator from: http://www-304.ibm.com/support/customercare/sas/f/pwrfs/home.html
Also see external/mambo/README.md
QEMU (as of 2.2.0) is not suitable as it does not (yet) implement the hypervisor mode of the POWER8 processor. See https://www.flamingspork.com/blog/2015/08/28/running-opal-in-qemu-the-powernv-platform/ for instructions on how to use a work-in-progress patchset to QEMU that may be suitable for some work.
To run a boot-to-bootloader test, you'll need a zImage.epapr built using the mambo_defconfig config for op-build. See https://github.com/open-power/op-build/ for build instructions. Drop zImage.epapr in the skiboot directory and the skiboot test suite will automatically pick it up.
See opal-ci/README for further testing instructions.
To test on real hardware, you will need to understand how to flash new skiboot onto your system. This will vary from platform to platform.
You may want to start with external/boot-tests/boot_test.sh as it can (provided with the correct usernames/passwords) automatically flash a new skiboot onto ASTBMC-based OpenPower machines.
All patches should be sent to the mailing list with a Linux kernel style 'Signed-off-by' line. The following git commands are your friends:
git commit -s
git format-patch
You probably want to read the Linux kernel's Documentation/SubmittingPatches, as much of it applies to skiboot.
See LICENSE