Conversation
|
Example output, PF device:
|
|
Example output, VF device:
|
README.md
Outdated
| *This API is PROVISIONAL! ghw will try hard to not make breaking changes to this API, but still users are advice this new API is not | ||
| declared stable yet. We expect to declare it stable with ghw version 1.0.0* | ||
|
|
||
| SRIOV (Single-Root Input/Output Virtualization) is a class of PCI devices that ghw treats explicitely, like gpus; but unlike |
There was a problem hiding this comment.
s/treats/handles/
7e97f20 to
5e67349
Compare
|
We are converging about which information we should expose about SRIOV devices. The current set of attributes represents what other established components, for example the k8s SRIOV operator exposes . Please note that in case of ghw, the same amount of data is split between the The biggest problem however is how to represent the SRIOV devices proper. Should SRIOV be a subpackage of the |
bbe0fb9 to
8b33ae8
Compare
Had a chat with @jaypipes and the my takeaways are:
|
8b33ae8 to
8cf64a0
Compare
2f4aee4 to
b146139
Compare
|
ok, test failure is unexpected. I'll have a look ASAP. |
weird, can't reproduce inside a xenial container: |
|
same with go1.14.15 (yep, copypasta here #230 (comment)) |
9b6c782 to
f88d4a5
Compare
|
Interesting. We had apparently-random CI failures on xenial only. Now I wonder if the default behaviour of |
Should the pcidb pkg fail to find a local pci database, it optionally fetches an updated copy from the network. Even though this is unlikely to happen, and even though this is a nice feature to have, this is enabled by default and can lead to hard-to-debug issues. we seen this first with failing CI on xenial in travis (jaypipes#230 (comment)) It's possible the ultimate cause is out of date test VM, or any CI failure; but it is also true this a symptom of a deeper issue: test environments should be tightly controlled in order to be reproducible. Fetching data from the network, even though from a safe and trusted location, voids this assumption. So we need to force this off in the test path. Unfortunately this patch has its own drawback: it changes the pcidb default behaviour, so test code and production code have different preferred code paths. To really fix this behaviour completely we should probably file a pcidb issue (+ PR) to change its defaults. Signed-off-by: Francesco Romani <fromani@redhat.com>
0f62884 to
62a092d
Compare
Well this still have some merits, turns out it was in turn a symptom. The real issue -with is fix- is captured in 62a092d |
080eb0b to
1b41001
Compare
1b41001 to
1e9aa06
Compare
ea52d44 to
4715e44
Compare
4715e44 to
cf6f510
Compare
|
back to WIP: need to finish the rebase, and I'd like to land also #281 before to finish this PR. |
cf6f510 to
40ad1e1
Compare
|
All the deps of this PR have been merged! removing the WIP status |
40ad1e1 to
ce233e2
Compare
|
@jaypipes at last! all the dependencies of this PR have been merged, CI is green and it's ready for a new review round! |
|
Fixes: #92 |
|
Example output (long): |
ce233e2 to
901f511
Compare
Add support to report SRIOV devices. Much like GPU devices, support is added using a new top-level package. SRIOV devices are either Physical Functions or Virtual functions. The preferred representation for ghw is Physical Functions, whose dependent devices will be Virtual Functions; however, for the sake of practicality, the API also exposes soft references to Virtual Functions, so consumers of the API can access them directly and not navigating the parent devices. This patch also adds support in `ghwc`, to report the sriov information, and in the `snapshot` package, to make sure to capture all the files in sysfs that ghw cares about. Last but not least, lacking access to suitable non-linux systems, support is provided only on linux OS, even though the API tries hard not to be linux-specific. Resolves: jaypipes#92 Signed-off-by: Francesco Romani <fromani@redhat.com>
901f511 to
9058f61
Compare
jaypipes
left a comment
There was a problem hiding this comment.
@fromanirh, the more I look at this patch, the more I think these changes belong in the pkg/pci package and not as a new pkg/sriov package. The fact is, SR-IOV is specific to PCI Express and therefore should really be a set of additional attributes on the existing pkg/pci.Device struct.
See inline for more explanation.
|
|
||
| ### SRIOV | ||
|
|
||
| *This API is PROVISIONAL! ghw will try hard to not make breaking changes to this API, but still users are advice this new API is not |
There was a problem hiding this comment.
Recommend removing this. :) The interfaces we publish are specific to a Git tag or SHA1 per Go package versioning and semver behaviour.
There was a problem hiding this comment.
no problem! will remove.
| "virtfn*", | ||
| } | ||
|
|
||
| ignoreSet := newKeySet( |
There was a problem hiding this comment.
I don't think it's necessary to create a new keySet typedef. We can just use map[string]bool here and where you're doing the ignoreSet.Contains call below, just do:
if _, ignored := ignoreSet[globbedEntry]; ignored {
continue
}There was a problem hiding this comment.
sure thing, will simplify
| } | ||
| } | ||
|
|
||
| func findNetworks(ctx *context.Context, devPath string) []string { |
There was a problem hiding this comment.
are they network names or are they network interface names? :)
There was a problem hiding this comment.
yes, they are actually network interface names. Will clarify the function name.
| type Device struct { | ||
| Interfaces []string `json:"interfaces"` | ||
| // the PCI address where the SRIOV instance can be found | ||
| Address *pciaddr.Address `json:"address"` | ||
| PCI *pci.Device `json:"pci"` | ||
| } |
There was a problem hiding this comment.
Instead of creating a new pkg/sriov package and pkg/sriov.Device structI think we should just modify thepkg/pci.Device` struct to read SR-IOV (and network interface) information when the PCI device is a PF or VF.
We should add a new Interface struct to pkg/net:
// Interface represents a physical or virtual network interface on the host
type Interface struct {
// Name is the name of the interface on the host
Name string `json:"name"`
}We can then modify the pkg/pci.Device struct to add the following:
...
import (
...
"github.com/jaypipes/ghw/pkg/net"
...
)
...
type Device struct {
...
// NetworkInterfaces is the collection of network interfaces housed on this
// PCI device.
NetworkInterfaces []net.Interface `json:"network_interfaces,omitempty"`
}There was a problem hiding this comment.
This is an interesting idea. I'd like to process the concept a bit more to fully grok it but doesn't look bad at all. However this also mean that we aim to absorb all the PCI device type (= gpu) into the PCI package in the long run?
Or it's just SRIOV which is not special enough? (again no strong feeling, but wondering about the general direction)
| } | ||
|
|
||
| type PhysicalFunction struct { | ||
| Device |
There was a problem hiding this comment.
And here, we'd inherit from pkg/pci.Device instead.
| } | ||
|
|
||
| type VirtualFunction struct { | ||
| Device |
|
|
||
| type VirtualFunction struct { | ||
| Device | ||
| ID int `json:"id"` |
There was a problem hiding this comment.
Is this the index of the VF within the PF? If so, should we call this Index?
There was a problem hiding this comment.
Yep, it should be called probably Index and I should clarify it with a code comment.
| // Address of the (parent) Physical Function this Virtual Function pertains to. | ||
| ParentAddress *pciaddr.Address `json:"parent_address,omitempty"` |
There was a problem hiding this comment.
Should we just store a pointer to the parent pkg/pci.Device struct instead?
There was a problem hiding this comment.
Come to think of it, could we just use a single Function struct to describe all of the SR-IOV functions, both physical and virtual?
// Function describes an SR-IOV physical or virtual function. Physical functions
// will have no Parent Function struct pointer and will have one or more Function
// structs in the Functions field.
type Function struct {
Device // All Functions are PCI Devices
// Parent contains a pointer to the parent physical function.
// Will be empty when this is a physical function
Parent *Function `json:"parent,omitempty"`
// MaxVFs contains the maximum number of supported virtual
// functions for this physical function
MaxVFs int `json:"max_vfs"
// Functions contains the physical function's virtual functions
Functions []*Function `json:"functions"`
}You could create a simple IsPhysical utility method on pkg/pci.Function:
// IsPhysical returns true if the PCIe function is a physical function, false
// if it is a virtual function
func (f *Function) IsPhysical() bool {
return f.Parent == nil
}There was a problem hiding this comment.
Interesting angle, I don't have again strong feelings about this idea. With respect the current approach, we lose something, we gain something. I'll file a separate PR so we can compare the two approaches, and I'll close the one we like less (with rationale of course).
|
sorry for the delay, caused by winter holidays plus crazy january. I'll resume work here ASAP, let's try this variation! |
(at LONG last I finally got some time to resume working on this PR!) Again, I don't have strong feelings here, but this (perceived?) inconsistency makes me think. |
|
at long last, the alternate implementation is available: #315 |
|
we agreed to move forward with #315 |
Expose informations about SRIOV devices. After some design iterations, we fixed these goals:
There are few more noteworthy items in this PR:
Fixes: #92
Signed-off-by: Francesco Romani fromani@redhat.com