[gve] Allocate all possible event counters

The admin queue API requires us to tell the device how many event
counters we have provided via the "configure device resources" admin
queue command.  There is, of course, absolutely no documentation
indicating how many event counters actually need to be provided.

We require only two event counters: one for the transmit queue, one
for the receive queue.  (The receive queue doesn't seem to actually
make any use of its event counter, but the "create receive queue"
admin queue command will fail if it doesn't have an available event
counter to choose.)

In the absence of any documentation, we currently make the assumption
that allocating and configuring 16 counters (i.e. one whole cacheline)
will be sufficient to allow for the use of two counters.

This assumption turns out to be incorrect.  On larger instance types
(observed with a c3d-standard-16 instance in europe-west4-a), we find
that creating the transmit queue and creating the receive queue will
each fail with a probability of around 50%, returning the "failed
precondition" error code.

Experimentation suggests that even though the device has accepted our
"configure device resources" command indicating that we are providing
only 16 event counters, it will attempt to choose any of its potential
32 event counters (and will then fail since the event counter that it
unilaterally chose is outside of the agreed range).
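
As a rough illustration of why the observed failure rate is around
50%, the sketch below assumes that the device picks uniformly among
all of its potential counters.  That uniform choice is an assumption
made for illustration only, and the function names are hypothetical
rather than part of the driver.

```c
/* Number of device counter indices lying outside the range that the
 * driver configured (hypothetical helper, for illustration only).
 */
static unsigned int counters_out_of_range ( unsigned int configured,
					    unsigned int device_max ) {
	return ( ( device_max > configured ) ?
		 ( device_max - configured ) : 0 );
}

/* Percentage of queue creations expected to fail, ASSUMING that the
 * device chooses uniformly among all of its potential counters.
 */
static unsigned int failure_percent ( unsigned int configured,
				      unsigned int device_max ) {
	if ( device_max == 0 )
		return 0;
	return ( ( 100 * counters_out_of_range ( configured, device_max ) )
		 / device_max );
}
```

With 16 configured counters and 32 potential counters, this gives 50,
which is consistent with the failure rate observed above.  Once the
configured count equals the device maximum, it gives 0.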

Work around this firmware bug by always allocating the maximum number
of event counters supported by the device.  (This requires deferring
the allocation of the event counters until after issuing the "describe
device" command.)
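
The reordering can be sketched roughly as follows.  This is a minimal
standalone model and not the driver code: the structure, its fields,
and the sketch_*() functions are all hypothetical, and the "describe
device" step is reduced to storing a caller-supplied maximum.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical device state (names are illustrative only) */
struct sketch_nic {
	/* Maximum event counters, as reported by "describe device" */
	unsigned int max_events;
	/* Event counter array */
	uint32_t *events;
	/* Count to report via "configure device resources" */
	unsigned int event_count;
};

/* Stand-in for the "describe device" command */
static int sketch_describe ( struct sketch_nic *nic,
			     unsigned int reported_max ) {
	nic->max_events = reported_max;
	return 0;
}

/* Allocate every counter that the device might choose.  This must
 * run after sketch_describe(), since the maximum is unknown before.
 */
static int sketch_alloc_events ( struct sketch_nic *nic ) {
	if ( nic->max_events == 0 )
		return -1;
	nic->events = calloc ( nic->max_events,
			       sizeof ( nic->events[0] ) );
	if ( ! nic->events )
		return -1;
	nic->event_count = nic->max_events;
	return 0;
}

/* Free the event counter array */
static void sketch_free_events ( struct sketch_nic *nic ) {
	free ( nic->events );
	nic->events = NULL;
	nic->event_count = 0;
}
```

Calling sketch_alloc_events() before sketch_describe() fails in this
model, which mirrors the ordering constraint: the allocation size is
not known until the device has been described.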

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2 files changed
tree: b7ea72b0150564168a8e732dba379f2f9964cd42