| <!DOCTYPE html> |
| |
| <html lang="en" data-content_root="../"> |
| <head> |
| <meta charset="utf-8" /> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0" /> |
| |
| <title>skiboot-5.4.8 — skiboot d365a01 |
| documentation</title> |
| <link rel="stylesheet" type="text/css" href="../_static/pygments.css?v=fa44fd50" /> |
| <link rel="stylesheet" type="text/css" href="../_static/classic.css?v=514cf933" /> |
| |
| <script src="../_static/documentation_options.js?v=e1fecbe9"></script> |
| <script src="../_static/doctools.js?v=888ff710"></script> |
| <script src="../_static/sphinx_highlight.js?v=dc90522c"></script> |
| |
| <link rel="index" title="Index" href="../genindex.html" /> |
| <link rel="search" title="Search" href="../search.html" /> |
| <link rel="next" title="skiboot-5.4.9" href="skiboot-5.4.9.html" /> |
| <link rel="prev" title="skiboot-5.4.7" href="skiboot-5.4.7.html" /> |
| </head><body> |
| <div class="related" role="navigation" aria-label="related navigation"> |
| <h3>Navigation</h3> |
| <ul> |
| <li class="right" style="margin-right: 10px"> |
| <a href="../genindex.html" title="General Index" |
| accesskey="I">index</a></li> |
| <li class="right" > |
| <a href="skiboot-5.4.9.html" title="skiboot-5.4.9" |
| accesskey="N">next</a> |</li> |
| <li class="right" > |
| <a href="skiboot-5.4.7.html" title="skiboot-5.4.7" |
| accesskey="P">previous</a> |</li> |
| <li class="nav-item nav-item-0"><a href="../index.html">skiboot d365a01 |
| documentation</a> »</li> |
| <li class="nav-item nav-item-1"><a href="index.html" accesskey="U">Release Notes</a> »</li> |
| <li class="nav-item nav-item-this"><a href="">skiboot-5.4.8</a></li> |
| </ul> |
| </div> |
| |
| <div class="document"> |
| <div class="documentwrapper"> |
| <div class="bodywrapper"> |
| <div class="body" role="main"> |
| |
| <section id="skiboot-5-4-8"> |
| <span id="id1"></span><h1>skiboot-5.4.8<a class="headerlink" href="#skiboot-5-4-8" title="Link to this heading">¶</a></h1> |
| <p>skiboot-5.4.8 was released on Wednesday October 11th, 2017. It replaces |
| <a class="reference internal" href="skiboot-5.4.7.html#skiboot-5-4-7"><span class="std std-ref">skiboot-5.4.7</span></a> as the current stable release in the 5.4.x series.</p> |
| <p>Over <a class="reference internal" href="skiboot-5.4.7.html#skiboot-5-4-7"><span class="std std-ref">skiboot-5.4.7</span></a>, we have a few bug fixes for FSP platforms:</p> |
| <ul> |
| <li><p>libflash/file: Handle short read()s and write()s correctly</p> |
| <p>Currently we don’t move the buffer along after a short read() or write(), |
| nor do we request only the remaining amount.</p> |
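| <p>A minimal sketch of the usual way to handle short reads, not the actual libflash code: loop until the requested length is satisfied, advancing the buffer and asking only for what remains. The helper name <code>read_fully</code> is hypothetical; only POSIX read() is assumed.</p> |

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libflash): read up to `len` bytes,
 * continuing after short reads. Returns the number of bytes read, which
 * is smaller than `len` only at end of file, or -1 on error.
 */
static ssize_t read_fully(int fd, void *buf, size_t len)
{
	char *p = buf;
	size_t done = 0;

	while (done < len) {
		/* Advance the buffer and request only the remaining amount. */
		ssize_t rc = read(fd, p + done, len - done);

		if (rc < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry the same request */
			return -1;
		}
		if (rc == 0)
			break; /* end of file */
		done += (size_t)rc;
	}
	return (ssize_t)done;
}
```

| <p>A write-side helper would follow the same shape, with write() in place of read().</p> |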
| </li> |
| <li><p>FSP/NVRAM: Handle “get vNVRAM statistics” command</p> |
| <p>The FSP sends an MBOX command (cmd: 0xEB, subcmd: 0x05, mod: 0x00) to get vNVRAM |
| statistics. OPAL doesn’t maintain any such statistics, so we return |
| FSP_STATUS_INVALID_SUBCMD.</p> |
| <blockquote> |
| <div><p>Sample OPAL log:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="p">[</span><span class="mf">16944.384670488</span><span class="p">,</span><span class="mi">3</span><span class="p">]</span> <span class="n">FSP</span><span class="p">:</span> <span class="n">Unhandled</span> <span class="n">message</span> <span class="n">eb0500</span> |
| <span class="p">[</span><span class="mf">16944.474110465</span><span class="p">,</span><span class="mi">3</span><span class="p">]</span> <span class="n">FSP</span><span class="p">:</span> <span class="n">Unhandled</span> <span class="n">message</span> <span class="n">eb0500</span> |
| <span class="p">[</span><span class="mf">16945.111280784</span><span class="p">,</span><span class="mi">3</span><span class="p">]</span> <span class="n">FSP</span><span class="p">:</span> <span class="n">Unhandled</span> <span class="n">message</span> <span class="n">eb0500</span> |
| <span class="p">[</span><span class="mf">16945.293393485</span><span class="p">,</span><span class="mi">3</span><span class="p">]</span> <span class="n">FSP</span><span class="p">:</span> <span class="n">Unhandled</span> <span class="n">message</span> <span class="n">eb0500</span> |
| </pre></div> |
| </div> |
| </div></blockquote> |
| </li> |
| <li><p>FSP/CONSOLE: Limit number of error logging</p> |
| <p>Commit c8a7535f (FSP/CONSOLE: Workaround for unresponsive ipmi daemon, added |
| in skiboot 5.4.6 and 5.7-rc1) added error logging when the buffer is full. In some |
| corner cases the kernel may call this function multiple times, and we may end up logging |
| the same error again and again.</p> |
| <p>This patch fixes that by generating the error log only once.</p> |
| </li> |
| <li><p>FSP/CONSOLE: Fix fsp_console_write_buffer_space() call</p> |
| <p>The kernel calls fsp_console_write_buffer_space() to check console buffer space |
| availability. If there is enough buffer space to write data, the kernel then |
| calls fsp_console_write() to write the actual data.</p> |
| <p>In some extreme corner cases (like the one explained in commit c8a7535f) the |
| console becomes full and this function returns 0 to the kernel (or the space available |
| in the console buffer is less than the size of the next incoming data). The kernel keeps |
| retrying until it gets enough space, so we start seeing RCU stalls.</p> |
| <p>This patch keeps track of the previously available space. If the previous space is the |
| same as the current space, there is not enough space in the console buffer to write the incoming data. |
| This may be due to very heavy console write activity combined with a slow response from the FSP, |
| or because the FSP has stopped processing data (e.g. because the ipmi daemon died). At this |
| point we start a timer with a timeout of SER_BUFFER_OUT_TIMEOUT (10 seconds). |
| If the situation has not improved within 10 seconds, something has gone wrong, so we |
| return OPAL_RESOURCE so that the kernel can drop the console write and continue.</p> |
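| <p>The stall detection described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the skiboot implementation: the struct, enum, and function names are hypothetical, time is passed in as plain milliseconds, and only the 10 second timeout is taken from the text.</p> |

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the 10 second SER_BUFFER_OUT_TIMEOUT mentioned in the text. */
#define STALL_TIMEOUT_MS 10000

/* Hypothetical outcomes; in the text, GIVE_UP corresponds to OPAL_RESOURCE. */
enum space_result { SPACE_OK, SPACE_RETRY, SPACE_GIVE_UP };

/* Hypothetical per-console state for tracking progress. */
struct stall_tracker {
	uint32_t prev_space;   /* space seen on the previous call */
	uint64_t deadline_ms;  /* when the armed timer expires */
	bool timer_armed;
};

static enum space_result check_space(struct stall_tracker *t,
				     uint32_t space, uint64_t now_ms)
{
	if (space != t->prev_space) {
		/* The buffer changed: remember the new value, disarm the timer. */
		t->prev_space = space;
		t->timer_armed = false;
		return SPACE_OK;
	}
	if (!t->timer_armed) {
		/* First call with no progress: start the timeout. */
		t->timer_armed = true;
		t->deadline_ms = now_ms + STALL_TIMEOUT_MS;
		return SPACE_RETRY;
	}
	/* Still no progress: give up once the deadline has passed. */
	return now_ms >= t->deadline_ms ? SPACE_GIVE_UP : SPACE_RETRY;
}
```

| <p>Returning a distinct “give up” result is what lets the caller drop the write instead of retrying forever.</p> |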
| </li> |
| <li><p>FSP/CONSOLE: Close SOL session during R/R</p> |
| <p>Presently we do not close the SOL and FW console sessions during R/R. The host |
| continues to write to the SOL buffer during FSP R/R. If there is heavy console write |
| activity during FSP R/R (like running the <cite>top</cite> command inside the console), |
| then at some point the console buffer becomes full. fsp_console_write_buffer_space() |
| then returns 0 (or less than the space required to write the data) to the host. While one |
| thread is busy writing to the console, if other threads try to write data to the console |
| we may see RCU stalls (like below) in the kernel.</p> |
| <p>Kernel call trace:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="p">[</span> <span class="mf">2082.828363</span><span class="p">]</span> <span class="n">INFO</span><span class="p">:</span> <span class="n">rcu_sched</span> <span class="n">detected</span> <span class="n">stalls</span> <span class="n">on</span> <span class="n">CPUs</span><span class="o">/</span><span class="n">tasks</span><span class="p">:</span> <span class="p">{</span> <span class="mi">32</span><span class="p">}</span> <span class="p">(</span><span class="n">detected</span> <span class="n">by</span> <span class="mi">16</span><span class="p">,</span> <span class="n">t</span><span class="o">=</span><span class="mi">6002</span> <span class="n">jiffies</span><span class="p">,</span> <span class="n">g</span><span class="o">=</span><span class="mi">23154</span><span class="p">,</span> <span class="n">c</span><span class="o">=</span><span class="mi">23153</span><span class="p">,</span> <span class="n">q</span><span class="o">=</span><span class="mi">254769</span><span class="p">)</span> |
| <span class="p">[</span> <span class="mf">2082.828365</span><span class="p">]</span> <span class="n">Task</span> <span class="n">dump</span> <span class="k">for</span> <span class="n">CPU</span> <span class="mi">32</span><span class="p">:</span> |
| <span class="p">[</span> <span class="mf">2082.828368</span><span class="p">]</span> <span class="n">kworker</span><span class="o">/</span><span class="mi">32</span><span class="p">:</span><span class="mi">3</span> <span class="n">R</span> <span class="n">running</span> <span class="n">task</span> <span class="mi">0</span> <span class="mi">4637</span> <span class="mi">2</span> <span class="mh">0x00000884</span> |
| <span class="p">[</span> <span class="mf">2082.828375</span><span class="p">]</span> <span class="n">Workqueue</span><span class="p">:</span> <span class="n">events</span> <span class="n">dump_work_fn</span> |
| <span class="p">[</span> <span class="mf">2082.828376</span><span class="p">]</span> <span class="n">Call</span> <span class="n">Trace</span><span class="p">:</span> |
| <span class="p">[</span> <span class="mf">2082.828382</span><span class="p">]</span> <span class="p">[</span><span class="n">c000000f1633fa00</span><span class="p">]</span> <span class="p">[</span><span class="n">c00000000013b6b0</span><span class="p">]</span> <span class="n">console_unlock</span><span class="o">+</span><span class="mh">0x570</span><span class="o">/</span><span class="mh">0x600</span> <span class="p">(</span><span class="n">unreliable</span><span class="p">)</span> |
| <span class="p">[</span> <span class="mf">2082.828384</span><span class="p">]</span> <span class="p">[</span><span class="n">c000000f1633fae0</span><span class="p">]</span> <span class="p">[</span><span class="n">c00000000013ba34</span><span class="p">]</span> <span class="n">vprintk_emit</span><span class="o">+</span><span class="mh">0x2f4</span><span class="o">/</span><span class="mh">0x5c0</span> |
| <span class="p">[</span> <span class="mf">2082.828389</span><span class="p">]</span> <span class="p">[</span><span class="n">c000000f1633fb60</span><span class="p">]</span> <span class="p">[</span><span class="n">c00000000099e644</span><span class="p">]</span> <span class="n">printk</span><span class="o">+</span><span class="mh">0x84</span><span class="o">/</span><span class="mh">0x98</span> |
| <span class="p">[</span> <span class="mf">2082.828391</span><span class="p">]</span> <span class="p">[</span><span class="n">c000000f1633fb90</span><span class="p">]</span> <span class="p">[</span><span class="n">c0000000000851a8</span><span class="p">]</span> <span class="n">dump_work_fn</span><span class="o">+</span><span class="mh">0x238</span><span class="o">/</span><span class="mh">0x250</span> |
| <span class="p">[</span> <span class="mf">2082.828394</span><span class="p">]</span> <span class="p">[</span><span class="n">c000000f1633fc60</span><span class="p">]</span> <span class="p">[</span><span class="n">c0000000000ecb98</span><span class="p">]</span> <span class="n">process_one_work</span><span class="o">+</span><span class="mh">0x198</span><span class="o">/</span><span class="mh">0x4b0</span> |
| <span class="p">[</span> <span class="mf">2082.828396</span><span class="p">]</span> <span class="p">[</span><span class="n">c000000f1633fcf0</span><span class="p">]</span> <span class="p">[</span><span class="n">c0000000000ed3dc</span><span class="p">]</span> <span class="n">worker_thread</span><span class="o">+</span><span class="mh">0x18c</span><span class="o">/</span><span class="mh">0x5a0</span> |
| <span class="p">[</span> <span class="mf">2082.828399</span><span class="p">]</span> <span class="p">[</span><span class="n">c000000f1633fd80</span><span class="p">]</span> <span class="p">[</span><span class="n">c0000000000f4650</span><span class="p">]</span> <span class="n">kthread</span><span class="o">+</span><span class="mh">0x110</span><span class="o">/</span><span class="mh">0x130</span> |
| <span class="p">[</span> <span class="mf">2082.828403</span><span class="p">]</span> <span class="p">[</span><span class="n">c000000f1633fe30</span><span class="p">]</span> <span class="p">[</span><span class="n">c000000000009674</span><span class="p">]</span> <span class="n">ret_from_kernel_thread</span><span class="o">+</span><span class="mh">0x5c</span><span class="o">/</span><span class="mh">0x68</span> |
| </pre></div> |
| </div> |
| <p>Hence let’s close the SOL (and FW console) sessions during FSP R/R.</p> |
| </li> |
| <li><p>FSP/CONSOLE: Do not associate unavailable console</p> |
| <p>Presently OPAL sends the associate/unassociate MBOX command for every |
| FSP serial console (see the OPAL messages below). We have to check whether |
| a console is available before sending this message.</p> |
| <p>OPAL log:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="p">[</span> <span class="mf">5013.227994012</span><span class="p">,</span><span class="mi">7</span><span class="p">]</span> <span class="n">FSP</span><span class="p">:</span> <span class="n">Reassociating</span> <span class="n">HVSI</span> <span class="n">console</span> <span class="mi">1</span> |
| <span class="p">[</span> <span class="mf">5013.227997540</span><span class="p">,</span><span class="mi">7</span><span class="p">]</span> <span class="n">FSP</span><span class="p">:</span> <span class="n">Reassociating</span> <span class="n">HVSI</span> <span class="n">console</span> <span class="mi">2</span> |
| </pre></div> |
| </div> |
| </li> |
| <li><p>FSP: Disable PSI link whenever FSP tells OPAL about impending Reset/Reload</p> |
| <p>Commit 42d5d047 fixed the scenario where DPO had been initiated but the FSP went |
| into reset before the CEC power down came in. This is, however, a generic issue |
| that can happen in the normal shutdown path as well.</p> |
| <p>Hence disable the PSI link as soon as we detect an impending FSP R/R.</p> |
| </li> |
| <li><p>fsp: return OPAL_BUSY_EVENT on failure sending FSP_CMD_POWERDOWN_NORM</p> |
| <p>Also, return OPAL_BUSY_EVENT on failure sending FSP_CMD_REBOOT / DEEP_REBOOT.</p> |
| <p>We had a race condition between FSP Reset/Reload and powering down |
| the system from the host:</p> |
| <p>Roughly:</p> |
| <table class="docutils align-default"> |
| <thead> |
| <tr class="row-odd"><th class="head"><p>#</p></th> |
| <th class="head"><p>FSP</p></th> |
| <th class="head"><p>Host</p></th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr class="row-even"><td><p>1</p></td> |
| <td><p>Power on</p></td> |
| <td></td> |
| </tr> |
| <tr class="row-odd"><td><p>2</p></td> |
| <td></td> |
| <td><p>Power on</p></td> |
| </tr> |
| <tr class="row-even"><td><p>3</p></td> |
| <td><p>(inject EPOW)</p></td> |
| <td></td> |
| </tr> |
| <tr class="row-odd"><td><p>4</p></td> |
| <td><p>(trigger FSP R/R)</p></td> |
| <td></td> |
| </tr> |
| <tr class="row-even"><td><p>5</p></td> |
| <td></td> |
| <td><p>Processes EPOW event, starts shutting down</p></td> |
| </tr> |
| <tr class="row-odd"><td><p>6</p></td> |
| <td></td> |
| <td><p>calls OPAL_CEC_POWER_DOWN</p></td> |
| </tr> |
| <tr class="row-even"><td><p>7</p></td> |
| <td><p>(is still in R/R)</p></td> |
| <td></td> |
| </tr> |
| <tr class="row-odd"><td><p>8</p></td> |
| <td></td> |
| <td><p>gets OPAL_INTERNAL_ERROR, spins in opal_poll_events</p></td> |
| </tr> |
| <tr class="row-even"><td><p>9</p></td> |
| <td><p>(FSP comes back)</p></td> |
| <td></td> |
| </tr> |
| <tr class="row-odd"><td><p>10</p></td> |
| <td></td> |
| <td><p>spinning in opal_poll_events</p></td> |
| </tr> |
| <tr class="row-even"><td><p>11</p></td> |
| <td><p>(thinks host is running)</p></td> |
| <td></td> |
| </tr> |
| </tbody> |
| </table> |
| <p>The call to OPAL_CEC_POWER_DOWN is only made once, because the reset/reload |
| error path for fsp_sync_msg() returns -1, which means we give |
| the OS OPAL_INTERNAL_ERROR. That is fine, except that our own API |
| docs allow us to return OPAL_BUSY when trying again |
| later may be successful, and they are ambiguous as to whether the OS should retry |
| on OPAL_INTERNAL_ERROR.</p> |
| <p>For reference, the Linux code looks like this:</p> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">static</span> <span class="n">void</span> <span class="n">__noreturn</span> <span class="n">pnv_power_off</span><span class="p">(</span><span class="n">void</span><span class="p">)</span> |
| <span class="p">{</span> |
| <span class="n">long</span> <span class="n">rc</span> <span class="o">=</span> <span class="n">OPAL_BUSY</span><span class="p">;</span> |
| |
| <span class="n">pnv_prepare_going_down</span><span class="p">();</span> |
| |
| <span class="k">while</span> <span class="p">(</span><span class="n">rc</span> <span class="o">==</span> <span class="n">OPAL_BUSY</span> <span class="o">||</span> <span class="n">rc</span> <span class="o">==</span> <span class="n">OPAL_BUSY_EVENT</span><span class="p">)</span> <span class="p">{</span> |
| <span class="n">rc</span> <span class="o">=</span> <span class="n">opal_cec_power_down</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span> |
| <span class="k">if</span> <span class="p">(</span><span class="n">rc</span> <span class="o">==</span> <span class="n">OPAL_BUSY_EVENT</span><span class="p">)</span> |
| <span class="n">opal_poll_events</span><span class="p">(</span><span class="n">NULL</span><span class="p">);</span> |
| <span class="k">else</span> |
| <span class="n">mdelay</span><span class="p">(</span><span class="mi">10</span><span class="p">);</span> |
| <span class="p">}</span> |
| <span class="k">for</span> <span class="p">(;;)</span> |
| <span class="n">opal_poll_events</span><span class="p">(</span><span class="n">NULL</span><span class="p">);</span> |
| <span class="p">}</span> |
| </pre></div> |
| </div> |
| <p>This means that, <em>practically</em>, our only option is to return OPAL_BUSY |
| or OPAL_BUSY_EVENT.</p> |
| <p>We choose OPAL_BUSY_EVENT for FSP systems as we do want to ensure we’re |
| running pollers to communicate with the FSP and do the final bits of |
| Reset/Reload handling before we power off the system.</p> |
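| <p>To make the reasoning concrete, here is a small sketch of why the return code matters. It is illustrative only: <code>host_retries</code> restates the loop condition of pnv_power_off() quoted above, <code>power_down_rc</code> is a hypothetical stand-in for the firmware-side mapping, and the numeric values are believed to match opal-api.h but are repeated here only to keep the example self-contained.</p> |

```c
#include <stdbool.h>

/* OPAL return codes; values assumed to match opal-api.h. */
#define OPAL_SUCCESS		0
#define OPAL_BUSY		(-2)
#define OPAL_INTERNAL_ERROR	(-11)
#define OPAL_BUSY_EVENT		(-12)

/* Restates the while() condition in pnv_power_off(): only BUSY codes loop. */
static bool host_retries(long rc)
{
	return rc == OPAL_BUSY || rc == OPAL_BUSY_EVENT;
}

/*
 * Hypothetical mapping, not the skiboot code: `send_rc` is the result of
 * sending the power-down message to the FSP, 0 on success and non-zero on
 * failure (for example while the FSP is in Reset/Reload).
 */
static long power_down_rc(int send_rc)
{
	/* BUSY_EVENT makes the host run pollers and call again. */
	return send_rc == 0 ? OPAL_SUCCESS : OPAL_BUSY_EVENT;
}
```

| <p>With this mapping a failed send leads the host to poll and retry, whereas OPAL_INTERNAL_ERROR would end the retry loop after a single call.</p> |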
| </li> |
| </ul> |
| </section> |
| |
| |
| <div class="clearer"></div> |
| </div> |
| </div> |
| </div> |
| <div class="sphinxsidebar" role="navigation" aria-label="main navigation"> |
| <div class="sphinxsidebarwrapper"> |
| <div> |
| <h4>Previous topic</h4> |
| <p class="topless"><a href="skiboot-5.4.7.html" |
| title="previous chapter">skiboot-5.4.7</a></p> |
| </div> |
| <div> |
| <h4>Next topic</h4> |
| <p class="topless"><a href="skiboot-5.4.9.html" |
| title="next chapter">skiboot-5.4.9</a></p> |
| </div> |
| <div role="note" aria-label="source link"> |
| <h3>This Page</h3> |
| <ul class="this-page-menu"> |
| <li><a href="../_sources/release-notes/skiboot-5.4.8.rst.txt" |
| rel="nofollow">Show Source</a></li> |
| </ul> |
| </div> |
| <div id="searchbox" style="display: none" role="search"> |
| <h3 id="searchlabel">Quick search</h3> |
| <div class="searchformwrapper"> |
| <form class="search" action="../search.html" method="get"> |
| <input type="text" name="q" aria-labelledby="searchlabel" autocomplete="off" autocorrect="off" autocapitalize="off" spellcheck="false"/> |
| <input type="submit" value="Go" /> |
| </form> |
| </div> |
| </div> |
| <script>document.getElementById('searchbox').style.display = "block"</script> |
| </div> |
| </div> |
| <div class="clearer"></div> |
| </div> |
| <div class="footer" role="contentinfo"> |
| © Copyright 2016-2017, IBM, others. |
| Created using <a href="https://www.sphinx-doc.org/">Sphinx</a> 7.2.6. |
| </div> |
| </body> |
| </html> |