| <!DOCTYPE html> |
| |
| <html lang="en" data-content_root="../"> |
| <head> |
| <meta charset="utf-8" /> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0" /> |
| |
| <title>skiboot-6.3-rc3 — skiboot d365a01 |
| documentation</title> |
| <link rel="stylesheet" type="text/css" href="../_static/pygments.css?v=fa44fd50" /> |
| <link rel="stylesheet" type="text/css" href="../_static/classic.css?v=514cf933" /> |
| |
| <script src="../_static/documentation_options.js?v=e1fecbe9"></script> |
| <script src="../_static/doctools.js?v=888ff710"></script> |
| <script src="../_static/sphinx_highlight.js?v=dc90522c"></script> |
| |
| <link rel="index" title="Index" href="../genindex.html" /> |
| <link rel="search" title="Search" href="../search.html" /> |
| <link rel="next" title="skiboot-6.3.1" href="skiboot-6.3.1.html" /> |
| <link rel="prev" title="skiboot-6.3-rc2" href="skiboot-6.3-rc2.html" /> |
| </head><body> |
| <div class="related" role="navigation" aria-label="related navigation"> |
| <h3>Navigation</h3> |
| <ul> |
| <li class="right" style="margin-right: 10px"> |
| <a href="../genindex.html" title="General Index" |
| accesskey="I">index</a></li> |
| <li class="right" > |
| <a href="skiboot-6.3.1.html" title="skiboot-6.3.1" |
| accesskey="N">next</a> |</li> |
| <li class="right" > |
| <a href="skiboot-6.3-rc2.html" title="skiboot-6.3-rc2" |
| accesskey="P">previous</a> |</li> |
| <li class="nav-item nav-item-0"><a href="../index.html">skiboot d365a01 |
| documentation</a> »</li> |
| <li class="nav-item nav-item-1"><a href="index.html" accesskey="U">Release Notes</a> »</li> |
| <li class="nav-item nav-item-this"><a href="">skiboot-6.3-rc3</a></li> |
| </ul> |
| </div> |
| |
| <div class="document"> |
| <div class="documentwrapper"> |
| <div class="bodywrapper"> |
| <div class="body" role="main"> |
| |
| <section id="skiboot-6-3-rc3"> |
| <span id="id1"></span><h1>skiboot-6.3-rc3<a class="headerlink" href="#skiboot-6-3-rc3" title="Link to this heading">¶</a></h1> |
| <p>skiboot v6.3-rc3 was released on Thursday May 2nd 2019. It is the third |
| release candidate of skiboot 6.3, which will become the new stable release |
| of skiboot following the 6.2 release, first released December 14th 2018.</p> |
| <p>Skiboot 6.3 will mark the basis for op-build v2.3. I expect to tag the final |
| skiboot 6.3 in the next week (I also predicted this last time, so take my |
| predictions with a large amount of sodium).</p> |
| <p>skiboot v6.3-rc3 contains all bug fixes as of <a class="reference internal" href="skiboot-6.0.19.html#skiboot-6-0-19"><span class="std std-ref">skiboot-6.0.19</span></a>, |
| and <a class="reference internal" href="skiboot-6.2.3.html#skiboot-6-2-3"><span class="std std-ref">skiboot-6.2.3</span></a> (the currently maintained |
| stable releases).</p> |
| <p>For details on how the skiboot stable releases work, see <a class="reference internal" href="../process/stable-skiboot-rules.html#stable-rules"><span class="std std-ref">Skiboot stable tree rules and releases</span></a>.</p> |
| <p>Over <a class="reference internal" href="skiboot-6.3-rc2.html#skiboot-6-3-rc2"><span class="std std-ref">skiboot-6.3-rc2</span></a>, we have the following changes:</p> |
| <ul> |
| <li><p>Expose PNOR Flash partitions to host MTD driver via devicetree</p> |
| <p>This makes it possible for the host to directly address each |
| partition without requiring each application to directly parse |
| the FFS headers. This has been in use for some time already to |
| allow BOOTKERNFW partition updates from the host.</p> |
| <p>All partitions except BOOTKERNFW are marked readonly.</p> |
| <p>The BOOTKERNFW partition is currently used exclusively by the Talos II platform.</p> |
| </li> |
| <li><p>Write boot progress to LPC port 80h</p> |
| <p>This is an adaptation of what we currently do for op_display() on FSP |
| machines, inventing an encoding for what we can write into the single |
| byte at LPC port 80h.</p> |
| <p>Port 80h is often used on x86 systems to indicate boot progress/status |
| and dates back a decent amount of time. Since a byte isn’t exactly very |
| expressive for everything that can go on (and wrong) during boot, it’s |
| all about compromise.</p> |
| <p>Some systems (such as Zaius/Barreleye G2) have a physical dual 7-segment |
| display that shows these codes. So far, this has only been driven by |
| hostboot (see hostboot commit 90ec2e65314c).</p> |
| </li> |
| <li><p>Write boot progress to LPC ports 81 and 82</p> |
| <p>There’s a thought to write more extensive boot progress codes to LPC |
| ports 81 and 82 to supplement/replace any reliance on port 80.</p> |
| <p>We want to still emit port 80 for platforms like Zaius and Barreleye |
| that have the physical display. Ports 81 and 82 can be monitored by a |
| BMC though.</p> |
| </li> |
| <li><p>Copy and convert Romulus descriptors to Talos</p> |
| <p>Talos II has some hardware differences from Romulus, therefore |
| we cannot guarantee Talos II == Romulus in skiboot. Copy and |
| slightly modify the Romulus files for Talos II.</p> |
| </li> |
| <li><p>npu2: Disable Probe-to-Invalid-Return-Modified-or-Owned snarfing by default</p> |
| <p>V100 GPUs are known to violate the NVLink2 protocol in some cases (one is when |
| memory was accessed by the CPU and then by the GPU using so-called block |
| linear mapping) and issue double probes to the NPU, which can cope with this |
| problem only if CONFIG_ENABLE_SNARF_CPM (“disable/enable Probe.I.MO |
| snarfing a cp_m”) is not set in the CQ_SM Misc Config register #0. |
| If the bit is set (which is the case today), the NPU issues a machine |
| check stop.</p> |
| <p>The snarfing feature is designed to detect 2 probes in flight and combine |
| them into one.</p> |
| <p>This adds a new “opal-npu2-snarf-cpm” nvram variable which controls |
| CONFIG_ENABLE_SNARF_CPM for all NVLinks to prevent the machine check |
| stop from happening.</p> |
| <p>This disables snarfing by default, as otherwise a broken GPU driver can |
| crash the entire box even when a GPU is passed through to a guest. |
| The variable provides a dial to allow regression tests (which might be useful |
| on bare metal). To enable snarfing, the user needs to run:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">sudo</span> <span class="n">nvram</span> <span class="o">-</span><span class="n">p</span> <span class="n">ibm</span><span class="p">,</span><span class="n">skiboot</span> <span class="o">--</span><span class="n">update</span><span class="o">-</span><span class="n">config</span> <span class="n">opal</span><span class="o">-</span><span class="n">npu2</span><span class="o">-</span><span class="n">snarf</span><span class="o">-</span><span class="n">cpm</span><span class="o">=</span><span class="n">enable</span> |
| </pre></div> |
| </div> |
| <p>and reboot the host system.</p> |
| </li> |
| <li><p>hw/npu2: Show name of opencapi error interrupts</p></li> |
| <li><p>core/pci: Use PHB io-base-location by default for PHB slots</p> |
| <p>On Witherspoon, only the GPU slots and the three pluggable PCI slots |
| (SLOT0, 1, 2) have platform-defined slot names. Builtin devices, such |
| as the SATA controller or the PLX switch that fans out to the GPU slots, |
| have no location codes, which some people consider an issue.</p> |
| <p>This patch addresses the problem by making the ibm,slot-location-code for |
| the root port device default to the ibm,io-base-location-code, which is |
| typically the location code for the system itself.</p> |
| <p>e.g.</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">pciex</span><span class="o">@</span><span class="mi">600</span><span class="n">c3c0100000</span><span class="o">/</span><span class="n">ibm</span><span class="p">,</span><span class="n">loc</span><span class="o">-</span><span class="n">code</span> |
| <span class="s2">"UOPWR.0000000-Node0-Proc0"</span> |
| |
| <span class="n">pciex</span><span class="o">@</span><span class="mi">600</span><span class="n">c3c0100000</span><span class="o">/</span><span class="n">pci</span><span class="o">@</span><span class="mi">0</span><span class="o">/</span><span class="n">ibm</span><span class="p">,</span><span class="n">loc</span><span class="o">-</span><span class="n">code</span> |
| <span class="s2">"UOPWR.0000000-Node0-Proc0"</span> |
| |
| <span class="n">pciex</span><span class="o">@</span><span class="mi">600</span><span class="n">c3c0100000</span><span class="o">/</span><span class="n">pci</span><span class="o">@</span><span class="mi">0</span><span class="o">/</span><span class="n">usb</span><span class="o">-</span><span class="n">xhci</span><span class="o">@</span><span class="mi">0</span><span class="o">/</span><span class="n">ibm</span><span class="p">,</span><span class="n">loc</span><span class="o">-</span><span class="n">code</span> |
| <span class="s2">"UOPWR.0000000-Node0"</span> |
| </pre></div> |
| </div> |
| <p>The PHB node and the root complex nodes have the location code of the |
| processor they are attached to, while the usb-xhci device under the |
| root port has the location code of the system itself.</p> |
| </li> |
| <li><p>hw/phb4: Read ibm,loc-code from PBCQ node</p> |
| <p>On P9 the PBCQs are subdivided by stacks which implement the PCI Express |
| logic. When phb4 was forked from phb3 most of the properties that were |
| in the pbcq node moved into the stack node, but ibm,loc-code was not one |
| of them. This patch fixes the phb4 init sequence to read the base |
| location code from the PBCQ node (parent of the stack node) rather than |
| the stack node itself.</p> |
| </li> |
| <li><p>hw/xscom: add missing P9P chip name</p></li> |
| <li><p>asm/head: balance branches to avoid link stack predictor mispredicts</p> |
| <p>The Linux wrapper for OPAL call and return is arranged like this:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">__opal_call</span><span class="p">:</span> |
| <span class="n">mflr</span> <span class="n">r0</span> |
| <span class="n">std</span> <span class="n">r0</span><span class="p">,</span><span class="n">PPC_STK_LROFF</span><span class="p">(</span><span class="n">r1</span><span class="p">)</span> |
| <span class="n">LOAD_REG_ADDR</span><span class="p">(</span><span class="n">r11</span><span class="p">,</span> <span class="n">opal_return</span><span class="p">)</span> |
| <span class="n">mtlr</span> <span class="n">r11</span> |
| <span class="n">hrfid</span> <span class="o">-></span> <span class="n">OPAL</span> |
| |
| <span class="n">opal_return</span><span class="p">:</span> |
| <span class="n">ld</span> <span class="n">r0</span><span class="p">,</span><span class="n">PPC_STK_LROFF</span><span class="p">(</span><span class="n">r1</span><span class="p">)</span> |
| <span class="n">mtlr</span> <span class="n">r0</span> |
| <span class="n">blr</span> |
| </pre></div> |
| </div> |
| <p>When skiboot returns to Linux, it branches to LR (i.e., opal_return) |
| with a blr. This unbalances the link stack predictor and will cause |
| mispredicts back up the return stack.</p> |
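| <p>The imbalance can be illustrated with a toy model of a link stack. This is only a sketch: the structure, the function names, the addresses and the stack depth are invented for illustration and do not describe the real POWER9 predictor, and the balanced variant simply assumes that OPAL leaves through a branch that does not pop the link stack.</p> |

```c
#include <stddef.h>
#include <stdint.h>

#define LINK_STACK_DEPTH 16	/* arbitrary depth for the toy model */

struct link_stack {
	uint64_t entry[LINK_STACK_DEPTH];
	size_t top;		/* number of valid entries */
	unsigned int mispredicts;
};

/* A branch-and-link (bl) pushes its return address. */
static void ls_call(struct link_stack *ls, uint64_t return_addr)
{
	if (ls->top < LINK_STACK_DEPTH)
		ls->entry[ls->top++] = return_addr;
}

/* A blr pops the predicted target and compares it with the real one. */
static void ls_return(struct link_stack *ls, uint64_t actual_target)
{
	uint64_t predicted = 0;	/* empty stack: no useful prediction */

	if (ls->top > 0)
		predicted = ls->entry[--ls->top];
	if (predicted != actual_target)
		ls->mispredicts++;
}

/* Made-up addresses standing in for code locations. */
enum { OUTER_RET = 0x100, INNER_RET = 0x200, OPAL_RETURN = 0x300 };

/* Returns the number of mispredicts seen while unwinding one OPAL call. */
unsigned int simulate(int opal_exits_with_blr)
{
	struct link_stack ls = { .top = 0, .mispredicts = 0 };

	ls_call(&ls, OUTER_RET);	/* outer Linux function calls inner one */
	ls_call(&ls, INNER_RET);	/* inner function: bl __opal_call */
	/* __opal_call enters OPAL with hrfid, which pushes nothing. */

	if (opal_exits_with_blr)
		ls_return(&ls, OPAL_RETURN);	/* OPAL: blr to opal_return */
	/* else (assumption): OPAL exits via a branch that does not pop. */

	ls_return(&ls, INNER_RET);	/* opal_return: blr */
	ls_return(&ls, OUTER_RET);	/* inner function: blr */
	return ls.mispredicts;
}
```

| <p>In this model simulate(1) counts three mispredicts (one for the return from OPAL and one for each return further up the stack), while simulate(0) counts none.</p> |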
| </li> |
| <li><p>external/mambo: also invoke readline for the non-autorun case</p></li> |
| <li><p>asm/head.S: set POWER9 radix HID bit at entry</p> |
| <p>When running in virtual memory mode, the radix MMU HID bit should not |
| be changed, so set this in the initial boot SPR setup.</p> |
| <p>As a side effect, fast reboot also has the HID0:RADIX bit set by the |
| shared SPR init, so there is no need for an explicit call.</p> |
| </li> |
| <li><p>opal-prd: Fix memory leak in is-fsp-system check</p></li> |
| <li><p>opal-prd: Check malloc return value</p></li> |
| <li><p>hw/phb4: Squash the IO bridge window</p> |
| <p>The PCI-PCI bridge spec says that bridges that do not implement an IO window |
| should hardcode the IO base and limit registers to zero. |
| Unfortunately, these registers only define the upper bits of the IO |
| window, and the low bits are assumed to be 0 for the base and 1 for the |
| limit address. As a result, setting both to zero can be misinterpreted |
| as a 4K IO window.</p> |
| <p>This patch fixes the problem the same way PHB3 does. It sets the IO base |
| and limit values to 0xf000 and 0x1000 respectively which most software |
| interprets as a disabled window.</p> |
| <p>lspci before patch:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="mi">0000</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mf">00.0</span> <span class="n">PCI</span> <span class="n">bridge</span><span class="p">:</span> <span class="n">IBM</span> <span class="n">Device</span> <span class="mi">04</span><span class="n">c1</span> <span class="p">(</span><span class="n">prog</span><span class="o">-</span><span class="k">if</span> <span class="mi">00</span> <span class="p">[</span><span class="n">Normal</span> <span class="n">decode</span><span class="p">])</span> |
| <span class="n">I</span><span class="o">/</span><span class="n">O</span> <span class="n">behind</span> <span class="n">bridge</span><span class="p">:</span> <span class="mi">00000000</span><span class="o">-</span><span class="mi">00000</span><span class="n">fff</span> |
| </pre></div> |
| </div> |
| <p>lspci after patch:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="mi">0000</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mf">00.0</span> <span class="n">PCI</span> <span class="n">bridge</span><span class="p">:</span> <span class="n">IBM</span> <span class="n">Device</span> <span class="mi">04</span><span class="n">c1</span> <span class="p">(</span><span class="n">prog</span><span class="o">-</span><span class="k">if</span> <span class="mi">00</span> <span class="p">[</span><span class="n">Normal</span> <span class="n">decode</span><span class="p">])</span> |
| <span class="n">I</span><span class="o">/</span><span class="n">O</span> <span class="n">behind</span> <span class="n">bridge</span><span class="p">:</span> <span class="kc">None</span> |
| </pre></div> |
| </div> |
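| <p>The effect of the implied low address bits can be sketched as follows. The struct and helper names are hypothetical, and the sketch only covers the 16-bit IO window described above (it ignores the upper 16-bit base and limit registers used by bridges with 32-bit IO addressing).</p> |

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded IO window of a PCI-PCI bridge (hypothetical helper type). */
struct io_window {
	uint32_t base;
	uint32_t limit;
};

/*
 * Only the upper four address bits of base and limit are programmable.
 * The low 12 bits are implied: all zeros for the base, all ones for
 * the limit.
 */
struct io_window io_window_decode(uint16_t base, uint16_t limit)
{
	struct io_window w;

	w.base = base & 0xf000u;
	w.limit = (limit & 0xf000u) | 0x0fffu;
	return w;
}

/* A window whose base lies above its limit decodes nothing. */
bool io_window_enabled(struct io_window w)
{
	return w.base <= w.limit;
}
```

| <p>With both values at zero the sketch decodes 0x0000-0x0fff, a 4K window that looks enabled. With 0xf000 and 0x1000 the base ends up above the limit (0x1fff), which is the disabled case.</p> |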
| </li> |
| <li><p>build: link with --orphan-handling=warn</p> |
| <p>The linker can warn when the linker script does not explicitly place |
| all sections. These orphan sections are placed according to |
| heuristics, which may not always be desirable. Enable this warning.</p> |
| </li> |
| <li><p>build: -fno-asynchronous-unwind-tables</p> |
| <p>skiboot does not use unwind tables, so this option saves about 100kB, |
| mostly from .text.</p> |
| </li> |
| <li><p>hw/xscom: Enable sw xstop by default on p9</p> |
| <p>This was disabled at some point during bringup to make life easier for |
| the lab folks trying to debug NVLink issues. This hack really should |
| have never made it out into the wild though, so we now have the |
| following situation occurring in the field:</p> |
| <ol class="arabic simple"> |
| <li><p>A bad happens</p></li> |
| <li><p>The host kernel receives an unrecoverable HMI and calls into OPAL to |
| request a platform reboot.</p></li> |
| <li><p>OPAL rejects the reboot attempt and returns to the kernel with |
| OPAL_PARAMETER.</p></li> |
| <li><p>Kernel panics and attempts to kexec into a kdump kernel.</p></li> |
| </ol> |
| <p>A side effect of the HMI seems to be CPUs becoming stuck, which results |
| in the initialisation of the kdump kernel taking an extremely long time |
| (6+ hours). It’s also been observed that after performing a dump the |
| kdump kernel then crashes itself because OPAL has ended up in a bad |
| state as a side effect of the HMI.</p> |
| <p>All up, it’s not very good so re-enable the software checkstop by |
| default. If people still want to turn it off they can using the nvram |
| override.</p> |
| </li> |
| <li><p>opal/hmi: Initialize the hmi event with old value of TFMR.</p> |
| <p>Do this before we fix TFAC errors. Otherwise the event on the host console |
| shows no thread error reported in the TFMR register.</p> |
| <p>Without this patch, the console event shows TFMR with no thread error |
| (DEC parity error TFMR[59] injection):</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="p">[</span> <span class="mf">53.737572</span><span class="p">]</span> <span class="n">Severe</span> <span class="n">Hypervisor</span> <span class="n">Maintenance</span> <span class="n">interrupt</span> <span class="p">[</span><span class="n">Recovered</span><span class="p">]</span> |
| <span class="p">[</span> <span class="mf">53.737596</span><span class="p">]</span> <span class="n">Error</span> <span class="n">detail</span><span class="p">:</span> <span class="n">Timer</span> <span class="n">facility</span> <span class="n">experienced</span> <span class="n">an</span> <span class="n">error</span> |
| <span class="p">[</span> <span class="mf">53.737611</span><span class="p">]</span> <span class="n">HMER</span><span class="p">:</span> <span class="mi">0840000000000000</span> |
| <span class="p">[</span> <span class="mf">53.737621</span><span class="p">]</span> <span class="n">TFMR</span><span class="p">:</span> <span class="mf">3212000870e04000</span> |
| </pre></div> |
| </div> |
| <p>After this patch, the host console shows the old TFMR value:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="p">[</span> <span class="mf">2302.267271</span><span class="p">]</span> <span class="n">Severe</span> <span class="n">Hypervisor</span> <span class="n">Maintenance</span> <span class="n">interrupt</span> <span class="p">[</span><span class="n">Recovered</span><span class="p">]</span> |
| <span class="p">[</span> <span class="mf">2302.267305</span><span class="p">]</span> <span class="n">Error</span> <span class="n">detail</span><span class="p">:</span> <span class="n">Timer</span> <span class="n">facility</span> <span class="n">experienced</span> <span class="n">an</span> <span class="n">error</span> |
| <span class="p">[</span> <span class="mf">2302.267320</span><span class="p">]</span> <span class="n">HMER</span><span class="p">:</span> <span class="mi">0840000000000000</span> |
| <span class="p">[</span> <span class="mf">2302.267330</span><span class="p">]</span> <span class="n">TFMR</span><span class="p">:</span> <span class="mf">3212000870e14010</span> |
| </pre></div> |
| </div> |
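| <p>The two TFMR values above can be compared bit by bit. The sketch below uses IBM bit numbering (bit 0 is the most significant bit); the PPC_BIT helper is defined locally, and the constant and function names are made up for the example.</p> |

```c
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit value. */
#define PPC_BIT(bit)	(0x8000000000000000ULL >> (bit))

/* TFMR values copied from the two console logs above. */
#define TFMR_WITHOUT_PATCH	0x3212000870e04000ULL
#define TFMR_WITH_PATCH		0x3212000870e14010ULL

/* Bits that differ between two TFMR snapshots. */
uint64_t tfmr_changed_bits(uint64_t a, uint64_t b)
{
	return a ^ b;
}

/* Non-zero if the given IBM-numbered bit is set in the value. */
int tfmr_bit_is_set(uint64_t tfmr, unsigned int bit)
{
	return (tfmr & PPC_BIT(bit)) != 0;
}
```

| <p>The XOR of the two values is 0x10010, so the value logged with the patch has TFMR[59] (the injected DEC parity error) and TFMR[47] set, both of which are clear in the value logged without it. The sketch does not interpret bit 47.</p> |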
| </li> |
| </ul> |
| </section> |
| |
| |
| <div class="clearer"></div> |
| </div> |
| </div> |
| </div> |
| <div class="clearer"></div> |
| </div> |
| <div class="footer" role="contentinfo"> |
| © Copyright 2016-2017, IBM, others. |
| Created using <a href="https://www.sphinx-doc.org/">Sphinx</a> 7.2.6. |
| </div> |
| </body> |
| </html> |