Merged
200 changes: 129 additions & 71 deletions transport-msg.tex
Expand Up @@ -22,12 +22,11 @@ \subsubsection{Purpose}
\item \textbf{Support multiple bus implementations:}
Systems may rely on various communication methods such as hypercalls, local
IPC, network channels, or device trees for enumerating devices. virtio-msg
defines a common transport interface suitable for any of these underlying
mechanisms.
defines a common transport interface suitable for any of these mechanisms.

\item \textbf{Reduce per-bus complexity:}
Buses can implement a fully message-based workflow (including optional
enumeration via \busref{GET_DEVICES} and hotplug via \busref{EVENT_DEVICE}
enumeration via \busref{GET_DEVICES} and hotplug via \busref{EVENT_DEVICE})
or they can discover and manage devices through
alternative means such as platform firmware data. In either case, they
forward transport messages to and from each device.
Expand Down Expand Up @@ -95,8 +94,9 @@ \subsubsection{Optional Bus Messages}
completely message-based approach to enumeration, hotplug, and bus-wide health.
However, these are \emph{not} mandatory if a bus instance already handles those
functions via firmware, device tree, or other platform features. The only strict
requirement is that the bus be able to forward device-specific \emph{transport
messages} once a device is recognized, so the virtio-msg driver can manage it.
requirement is that the bus \emph{MUST} be able to forward device-specific
\emph{transport messages} once a device is recognized, so the virtio-msg driver
can manage it.

\subsection{Basic Concepts}
\label{sec:Virtio Transport Options / Virtio Over Messages / Basic Concepts}
Expand All @@ -110,7 +110,8 @@ \subsection{Basic Concepts}
\subsubsection{Transport Revisions and Maximum Message Size}
\label{sec:Virtio Transport Options / Virtio Over Messages / Basic Concepts / Revisions}

Each \textbf{virtio-msg bus instance} advertises:
Each \textbf{virtio-msg bus instance} advertises the following to the transport
layer:
\begin{itemize}
\item A \textbf{transport revision} indicating the protocol version it
supports. This revision is separate from the overall Virtio
Expand All @@ -122,24 +123,24 @@ \subsubsection{Transport Revisions and Maximum Message Size}
\end{itemize}

These parameters \emph{MAY} vary between bus instances within the same system.
The driver obtains a bus's revision, maximum message size and list of features
through an \emph{implementation-defined} mechanism, which could be:
The bus implementation obtains a bus's revision, maximum message size and list
of features through an \emph{implementation-defined} mechanism, which could be:
\begin{itemize}
\item A device tree or firmware method providing bus configuration,
\item A message exchange during bus setup,
\item A per-bus-instance list of properties,
\item A static definition built into the driver for a known environment.
\end{itemize}

After learning these parameters, the driver \emph{MUST} respect them for all
messages involving that bus instance. For example, it \emph{MUST NOT} send a
message exceeding the \textbf{maximum message size}, and it \emph{MUST} avoid
using advanced features or messages unavailable in the bus's advertised
After learning these parameters, the transport layer \emph{MUST} respect them
for all messages involving that bus instance. For example, it \emph{MUST NOT}
send a message exceeding the \textbf{maximum message size}, and it \emph{MUST}
avoid using advanced features or messages unavailable in the bus's advertised
\textbf{transport revision}.
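
A minimal sketch of the check described above, assuming the bus parameters have already been learned by some implementation-defined means. The structure and function names (vmsg_bus_params, vmsg_may_send) are hypothetical, and treating revisions as numerically ordered is an assumption made only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the parameters a bus instance advertises. */
struct vmsg_bus_params {
    uint16_t revision;     /* advertised transport revision */
    uint16_t max_msg_size; /* advertised maximum message size, in bytes */
};

/*
 * Returns true if a message of msg_size bytes that needs at least
 * min_revision may be sent on this bus instance.
 * Assumption: a higher revision number is a superset of lower ones.
 */
static bool vmsg_may_send(const struct vmsg_bus_params *bus,
                          uint16_t msg_size, uint16_t min_revision)
{
    assert(bus != NULL);
    if (msg_size > bus->max_msg_size)
        return false; /* MUST NOT exceed the maximum message size */
    if (min_revision > bus->revision)
        return false; /* message not available in the advertised revision */
    return true;
}
```

Because the parameters may differ between bus instances, a check like this would be made per instance rather than once per system.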

\paragraph{virtio-msg revisions}

The following tables defines the currently defined virtio-msg revisions:
The following table lists the currently defined virtio-msg revisions:

\begin{tabular}{ |l|l|l|l| }
\hline
Expand All @@ -153,7 +154,7 @@ \subsubsection{Transport Revisions and Maximum Message Size}
correspond to a change in the virtio-msg revision.

The maximum message size is from the common transport level point of
view and includes the headers and payload described here. If the bus adds it
view and includes the headers and payload described here. If the bus adds its
own overhead (e.g., its own header), this is not included in the maximum message
size. The maximum useful message size is currently expected to be 274 bytes.
This value is large enough to support a GET_CONFIG or SET_CONFIG message with a
Expand Down Expand Up @@ -188,8 +189,8 @@ \subsubsection{Configuration Generation Count}
or the response to \msgref{GET_CONFIG} which also both provide the device's
current configuration count. The device may change any amount of data for one
generation count increment. If the change cannot fit in one \msgref{EVENT_CONFIG}
message, it \emph{SHOULD} be signaled to the driver via a \msgref{EVENT_CONFIG}
message with a zero data length and the new generation count.
message, the device \emph{SHOULD} send an \msgref{EVENT_CONFIG} message
with a zero data length and the new generation count to the driver.
The device \emph{MUST NOT} provide the same generation count in
multiple \msgref{EVENT_CONFIG} messages that contain non-zero length config
data. The driver includes its view of the current generation count in
Expand Down Expand Up @@ -218,7 +219,7 @@ \subsubsection{Feature Negotiation Blocks}
\msgref{SET_DRIVER_FEATURES}. Each block corresponds to up to 32 features:

\begin{itemize}
\item \textbf{Block Index}: Identifies the starting block (e.g., block 0 for
\item \textbf{Block Index}: The starting block (e.g., block 0 for
features 0--31, block 1 for features 32--63, etc.).
\item \textbf{Number of Blocks}: How many blocks the driver wishes to retrieve
or modify in a single message.
Expand All @@ -232,16 +233,31 @@ \subsubsection{Feature Negotiation Blocks}
\subsubsection{Error Signaling}
\label{sec:Virtio Transport Options / Virtio Over Messages / Basic Concepts / ErrorSignaling}

All legal transactions are defined at the transport level and responses defined.
If the transport level does something invalid or the bus has error conditions,
this \emph{SHOULD} be handled at the bus implementation level.
Errors may arise from: (a) malformed or unsupported transport messages, (b)
transmission or routing issues within a bus implementation, or (c) device-side
failures while processing a valid request. Local detection and recovery are
preferred, but a virtio-msg bus \textbf{MAY} report transmission errors to the
virtio-msg transport when it cannot deliver a request or obtain a response
within a bounded policy.

How the bus recovers from an error (e.g., by retrying, resetting
devices, or escalating to a bus-wide reset) is environment-specific, but
\emph{MUST} adhere to any mandatory behaviors (see
\ref{sec:Virtio Transport Options / Virtio Over Messages / Bus Operation}
and
\ref{sec:Virtio Transport Options / Virtio Over Messages / Device Operation}).
The following rules apply:
\begin{itemize}
\item A bus implementation \textbf{MAY} surface a transport-visible failure
(implementation-defined) after exhausting any bounded retry policy for
a transmission error.
\item Malformed headers or unsupported \field{msg_id} values \textbf{SHOULD}
be discarded; the receiver \textbf{MAY} log them and \textbf{SHOULD NOT}
generate further protocol traffic in response.
\item Event (one-way) messages \textbf{MUST NOT} elicit an error response.
\item Recovery actions (retry, selective reset, device removal) are
environment-specific but \textbf{MUST} comply with any normative reset
or status handling semantics described in
\ref{sec:Virtio Transport Options / Virtio Over Messages / Device Operation}.
\end{itemize}

This specification does not mandate a specific error reporting message for
transmission failures; it only permits a virtio-msg bus to surface such
failures to the virtio-msg transport when silent recovery is not feasible.

\subsubsection{Bus vs. Transport Messages}
\label{sec:Virtio Transport Options / Virtio Over Messages / Basic Concepts / BusVsTransport}
Expand All @@ -255,7 +271,7 @@ \subsubsection{Bus vs. Transport Messages}
or assessing bus-wide health (\busref{PING}).
These messages are \emph{optional} in environments where
device discovery or state changes occur through other means (e.g., device
tree). However, if a bus \emph{chooses} to handle those tasks via messages,
tree). However, if a bus chooses to handle those tasks via messages,
it \emph{should} implement the appropriate bus message definitions.

\item[\textbf{Transport Messages}:]
Expand All @@ -280,45 +296,85 @@ \subsubsection{Bus vs. Transport Messages}
\subsubsection{Endianness}
\label{sec:Virtio Transport Options / Virtio Over Messages / Basic Concepts / Endianness}

All encoding of values and fields defines in the virtio-msg messages \emph{MUST}
All values and fields defined in the virtio-msg messages \emph{MUST}
be encoded in little-endian.
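
One way to satisfy this rule independently of host byte order is to serialize fields byte by byte. The helper names below (put_le16, get_le16, put_le32, get_le32) are illustrative and not part of the specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 16-bit value least-significant byte first. */
static void put_le16(uint8_t *p, uint16_t v)
{
    assert(p != NULL);
    p[0] = (uint8_t)(v & 0xffu);
    p[1] = (uint8_t)(v >> 8);
}

/* Load a 16-bit little-endian value. */
static uint16_t get_le16(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Store a 32-bit value least-significant byte first. */
static void put_le32(uint8_t *p, uint32_t v)
{
    assert(p != NULL);
    for (size_t i = 0; i < 4; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/* Load a 32-bit little-endian value. */
static uint32_t get_le32(const uint8_t *p)
{
    assert(p != NULL);
    uint32_t v = 0;
    for (size_t i = 0; i < 4; i++)
        v |= (uint32_t)p[i] << (8 * i);
    return v;
}
```

Byte-wise access also avoids unaligned loads when a bus delivers the message at an arbitrary buffer offset.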

\subsubsection{Common Message Format}
\label{sec:Virtio Transport Options / Virtio Over Messages / Basic Concepts / Common Message Format}

All virtio-msg exchanges, whether \emph{bus messages} or \emph{transport messages},
begin with a shared header that indicates how the recipient should parse the
rest of the payload. This header has the following format:
All virtio-msg exchanges, whether \emph{bus messages} or
\emph{transport messages}, begin with an 8-byte header followed by an optional
payload.

The header layout is:
\begin{lstlisting}
struct virtio_msg_message {
uint8_t type;
uint8_t msg_id;
uint16_t dev_num;
uint16_t msg_size;
u8 payload[];
struct virtio_msg_header {
u8 type; /* request/response + bus/transport */
u8 msg_id; /* message id */
le16 dev_num; /* device number (0 for bus messages) */
le16 token; /* correlation identifier (0 for events) */
le16 msg_size; /* total size: header (8) + payload */
u8 payload[];
};
\end{lstlisting}

The fields in this header have the following usage:
Field semantics:
\begin{itemize}
\item \field{type}:
\begin{itemize}
\item Bit[0]: Identifies if a message is a request (0) or a response
to a request (1).
\item Bit[1]: Identifies if a message is a Transport Message (0) or a
Bus Message (1).
\item Bit[2-7] Are reserved for future use and must be zero.
\item Bit[0]: 0=request, 1=response.
\item Bit[1]: 0=Transport Message, 1=Bus Message.
\item Bits[2..7]: \textbf{MUST} be zero; receivers \textbf{MUST} ignore them.
\end{itemize}
\item \field{msg_id}:
Uniquely identifies which message definition applies (e.g., GET_DEVICES,
GET_DEVICE_FEATURES, SET_CONFIG). The specific range or enumeration of types is
defined in sections \ref{sec:Virtio Transport Options / Virtio Over Messages / Transport Messages}
and \ref{sec:Virtio Transport Options / Virtio Over Messages / Bus Messages}.
\item \field{dev_num}:
Identifies the Device Number the message is targeting or is coming from for
Transport Message and must be zero of Bus messages.
\item \field{msg_size};
Indicates the total length of the message payload including the header.
\item \field{msg_id}: Message ID identifying the message definition. Ranges
are defined in
\ref{sec:Virtio Transport Options / Virtio Over Messages / Transport Messages}
and
\ref{sec:Virtio Transport Options / Virtio Over Messages / Bus Messages}.
\item \field{dev_num}: For Transport Messages, the target device number; for
Bus Messages, it \textbf{MUST} be zero.
\item \field{token}: Non-zero for requests (which expect a response); zero
\textbf{MUST} be used only for event (one-way) messages. Responses
\textbf{MUST} echo the request's \field{token}.
\item \field{msg_size}: Total size in bytes of the complete message (header +
payload). \textbf{MUST} be \(\ge 8\) and \textbf{MUST NOT} exceed the
bus's maximum message size.
\item \field{payload}: Operation-specific data. Unused trailing bytes (if any
are introduced by bus framing) \textbf{MUST} be zero and \textbf{MUST} be
ignored by receivers.
\end{itemize}

All reserved header bits and any unspecified header values \textbf{MUST} be
sent as zero and \textbf{MUST} be ignored on receive to preserve forward
compatibility.
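
The header rules above can be sketched as an encode/decode pair, assuming the 8-byte layout with the \field{token} field. All identifiers (vmsg_hdr, vmsg_hdr_encode, vmsg_hdr_decode, the VMSG_ constants) are hypothetical, and rejecting a message whose \field{msg_size} exceeds the bytes actually received is a design choice of this sketch rather than a stated requirement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VMSG_HDR_SIZE      8u
#define VMSG_TYPE_RESPONSE 0x01u /* type bit 0: 1 = response */
#define VMSG_TYPE_BUS      0x02u /* type bit 1: 1 = bus message */
#define VMSG_TYPE_MASK     (VMSG_TYPE_RESPONSE | VMSG_TYPE_BUS)

/* Host-order view of the header fields. */
struct vmsg_hdr {
    uint8_t  type;
    uint8_t  msg_id;
    uint16_t dev_num;
    uint16_t token;
    uint16_t msg_size; /* header + payload, in bytes */
};

/* Write the header in little-endian wire order; reserved bits go out as 0. */
static void vmsg_hdr_encode(const struct vmsg_hdr *h, uint8_t out[VMSG_HDR_SIZE])
{
    assert(h != NULL && out != NULL);
    out[0] = (uint8_t)(h->type & VMSG_TYPE_MASK);
    out[1] = h->msg_id;
    out[2] = (uint8_t)(h->dev_num & 0xffu);
    out[3] = (uint8_t)(h->dev_num >> 8);
    out[4] = (uint8_t)(h->token & 0xffu);
    out[5] = (uint8_t)(h->token >> 8);
    out[6] = (uint8_t)(h->msg_size & 0xffu);
    out[7] = (uint8_t)(h->msg_size >> 8);
}

/* Parse and validate a received header; returns false if it is malformed. */
static bool vmsg_hdr_decode(const uint8_t *buf, size_t len,
                            uint16_t max_msg_size, struct vmsg_hdr *h)
{
    assert(buf != NULL && h != NULL);
    if (len < VMSG_HDR_SIZE)
        return false;
    h->type     = (uint8_t)(buf[0] & VMSG_TYPE_MASK); /* ignore reserved bits */
    h->msg_id   = buf[1];
    h->dev_num  = (uint16_t)(buf[2] | ((uint16_t)buf[3] << 8));
    h->token    = (uint16_t)(buf[4] | ((uint16_t)buf[5] << 8));
    h->msg_size = (uint16_t)(buf[6] | ((uint16_t)buf[7] << 8));
    if (h->msg_size < VMSG_HDR_SIZE || h->msg_size > max_msg_size)
        return false;
    if (h->msg_size > len)
        return false; /* sketch's choice: treat truncated input as malformed */
    return true;
}
```

A receiver that gets false from the decode step would discard the message, in line with the error signaling rules.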

\subsubsection{Message Correlation}
\label{sec:Virtio Transport Options / Virtio Over Messages / Basic Concepts / Correlation}

Messages are either requests or events. Requests require a response. Events are
one-way and do not have a response. Most requests defined today originate on the
driver side, but bus message requests such as \busref{PING} may originate on
either the driver side or the device side.

This section defines how responses are correlated to requests. An implementation
does not need to support sending multiple requests in parallel but these rules
allow for that possibility.

The \field{token} field in the message header is part of a tuple that is unique
for the interval between a request and its response.
\begin{itemize}
\item Message tuple: ( \field{dev_num}, \field{token} ).
\end{itemize}

Rules:
\begin{itemize}
\item The request originator assigns a non-zero \field{token} for every
Collaborator

I think we have something unclear here that we should define: who is generating the token and handling the correlation? All models are possible:

  1. transport generating the token and doing correlation
  2. bus generating token and doing correlation
  3. transport generating token and bus doing correlation
  4. bus generating token and transport doing correlation

In my view solutions 3 and 4 do not make sense as the one doing correlation must be in charge of generating the token.

I see some arguments to consider:

  • a bus that only handles one message at a time does not need the token -> in favor of 2
  • a bus doing the correlation would want to generate itself the token, same for transport -> against 3 and 4
  • having the token and correlation handled at transport level would allow to reuse some code and simplify the bus work -> in favor of 1
  • in case of 1, the bus would still have to handle its own correlation for bus messages -> in favor of 2
  • OS blocking semantics for drivers might be easier to handle in the transport, to prevent buses from blocking where they should not -> in favor of 1

I am in favor of 1, even if the bus would have to handle correlation for its own messages because of the OS blocking semantics arguments but definitely something open for discussion.

Collaborator Author

Today AMP does the thread resumption in the bus. Resuming the right thread means matching the token. So this is number 2 above. This is required from the nature of the interface between transport and bus: one call with pointer to tx data and (optional) pointer to rx buffer.

If we change to number 1 we would need to change the interface between transport and bus.

Of course this choice can be made per implementation. Linux could do it one way and BSD or RTOS a different way. So I would prefer to keep it out of the virtio spec.

Collaborator Author

@wmamills wmamills Nov 2, 2025

Moved this discussion to Issue #26 so we can find it easily after this PR is closed

request such that the tuple is unique for all inflight requests.
\item Event (one-way) messages \textbf{MUST} set \field{token}=0.
\item A response \textbf{MUST} echo \field{token} and \field{dev_num}.
\item Reception of unknown or already completed correlation tuples
\textbf{SHOULD} result in discarding the response without further protocol
action.
\end{itemize}
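
The correlation rules can be illustrated with a small table of in-flight requests keyed by ( \field{dev_num}, \field{token} ). This is only a sketch: the names (vmsg_tracker, vmsg_request_begin, vmsg_response_complete) and the fixed table size are hypothetical, and it deliberately takes no position on whether the transport or the bus owns the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VMSG_MAX_INFLIGHT 8 /* hypothetical limit chosen for this sketch */

struct vmsg_pending {
    bool     used;
    uint16_t dev_num;
    uint16_t token;
};

struct vmsg_tracker {
    struct vmsg_pending slot[VMSG_MAX_INFLIGHT];
    uint16_t next_token;
};

/* Look up an in-flight tuple; returns NULL if it is not pending. */
static struct vmsg_pending *vmsg_find(struct vmsg_tracker *t,
                                      uint16_t dev_num, uint16_t token)
{
    for (size_t i = 0; i < VMSG_MAX_INFLIGHT; i++) {
        struct vmsg_pending *p = &t->slot[i];
        if (p->used && p->dev_num == dev_num && p->token == token)
            return p;
    }
    return NULL;
}

/* Reserve a unique tuple for a new request; returns 0 if the table is full. */
static uint16_t vmsg_request_begin(struct vmsg_tracker *t, uint16_t dev_num)
{
    assert(t != NULL);
    for (size_t i = 0; i < VMSG_MAX_INFLIGHT; i++) {
        if (t->slot[i].used)
            continue;
        uint16_t tok;
        do {
            tok = ++t->next_token; /* wraps to 0, which is skipped */
        } while (tok == 0 || vmsg_find(t, dev_num, tok) != NULL);
        t->slot[i] = (struct vmsg_pending){ true, dev_num, tok };
        return tok;
    }
    return 0;
}

/* Match a response; false means unknown or already completed, so discard. */
static bool vmsg_response_complete(struct vmsg_tracker *t,
                                   uint16_t dev_num, uint16_t token)
{
    assert(t != NULL);
    struct vmsg_pending *p = (token != 0) ? vmsg_find(t, dev_num, token) : NULL;
    if (p == NULL)
        return false;
    p->used = false;
    return true;
}
```

An implementation that only ever has one request outstanding could reduce this to a single stored tuple.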

\subsection{Bus Operation}
Expand Down Expand Up @@ -627,7 +683,7 @@ \subsubsection{Device Notifications}
These notifications may be the result of:

\begin{itemize}
\item The same messages received in band on the message channel from the
\item The same messages received in-band on the message channel from the
device side bus.
\item Manufactured by the driver side bus based on reception of an
out-of-band (OoB) notification from the device side. Example OoB
Expand All @@ -650,7 +706,8 @@ \subsubsection{Device Notifications}
\paragraph{EVENT\_USED}
\begin{itemize}
\item Signifies that one or more buffers in a specific virtqueue have been
processed or consumed by the device.
processed or consumed by the device and have been added to the
used ring of the virtqueue.
\item The driver uses normal virtio methods (e.g., reading the "used" ring) to
identify which buffers are complete.
\item If a device does not support sending \msgref{EVENT_USED}, the driver
Expand Down Expand Up @@ -772,7 +829,7 @@ \subsubsection{Overview}
\end{itemize}

The functionality of the following messages \textbf{MUST} be provided by
in band messages, out of band event notification, or bus implementation based
in-band messages, out-of-band event notification, or bus implementation based
polling:
\begin{itemize}
\item \msgref{EVENT_AVAIL}
Expand Down Expand Up @@ -945,9 +1002,10 @@ \subsubsection{Overview}
Answer & 0 & 4 & Virtqueue index \\
& 4 & 4 & Maximum virtqueue size \\
& 8 & 4 & Current virtqueue size \\
& 12 & 8 & Descriptor address \\
& 20 & 8 & Driver address \\
& 28 & 8 & Device address \\
& 12 & 4 & Reserved (Must Be Zero - MBZ) \\
& 16 & 8 & Descriptor address \\
& 24 & 8 & Driver address \\
& 32 & 8 & Device address \\
\hline
\end{tabular}
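
As an illustration of the revised answer layout (the variant with the reserved word at offset 12), the following sketch serializes the fields at the listed offsets. The names (vq_info, vq_answer_encode, put_le) are hypothetical. With the reserved word in place, each 64-bit address starts at a multiple of 8 within the payload; that this was the motivation for the change is an inference, not something the text states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VQ_ANSWER_SIZE 40u /* 4 + 4 + 4 + 4 + 8 + 8 + 8 bytes */

/* Hypothetical host-order view of the virtqueue parameters. */
struct vq_info {
    uint32_t index;
    uint32_t max_size;
    uint32_t cur_size;
    uint64_t desc_addr;
    uint64_t driver_addr;
    uint64_t device_addr;
};

/* Store the low n bytes of v in little-endian order. */
static void put_le(uint8_t *p, uint64_t v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/* Serialize the answer payload using the offsets from the table. */
static void vq_answer_encode(const struct vq_info *q, uint8_t out[VQ_ANSWER_SIZE])
{
    assert(q != NULL && out != NULL);
    put_le(out + 0,  q->index,       4); /* virtqueue index */
    put_le(out + 4,  q->max_size,    4); /* maximum virtqueue size */
    put_le(out + 8,  q->cur_size,    4); /* current virtqueue size */
    put_le(out + 12, 0,              4); /* reserved, must be zero */
    put_le(out + 16, q->desc_addr,   8); /* descriptor address */
    put_le(out + 24, q->driver_addr, 8); /* driver address */
    put_le(out + 32, q->device_addr, 8); /* device address */
}
```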

Expand All @@ -966,9 +1024,10 @@ \subsubsection{Overview}
Request & 0 & 4 & Virtqueue index \\
& 4 & 4 & Reserved (Must Be Zero - MBZ) \\
& 8 & 4 & Current virtqueue size \\
& 12 & 8 & Descriptor address \\
& 20 & 8 & Driver address \\
& 28 & 8 & Device address \\
& 12 & 4 & Reserved (Must Be Zero - MBZ) \\
& 16 & 8 & Descriptor address \\
& 24 & 8 & Driver address \\
& 32 & 8 & Device address \\
\hline
Answer & 0 & 0 & no extra data \\
\hline
Expand Down Expand Up @@ -1056,8 +1115,8 @@ \subsubsection{Overview}
The \textbf{Next wrap} field is the MSB of the 32 bit value. The
\textbf{Next offset} field is the other 31 bits. These fields should be 0 if
the VIRTIO_F_NOTIFICATION_DATA feature has not been negotiated. If the bus
implementation is using out-of-band notifications, it should refuse to allow
this feature to be negotiated.
implementation is using out-of-band notifications, it should prevent this
feature from being negotiated.
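
The split of the 32-bit value into \textbf{Next wrap} (the MSB) and \textbf{Next offset} (the remaining 31 bits) can be sketched as follows. The function names are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NOTIF_OFFSET_MASK 0x7fffffffu /* low 31 bits: next offset */

/* Combine next offset and next wrap into the 32-bit field. */
static uint32_t notif_pack(uint32_t next_offset, bool next_wrap)
{
    assert(next_offset <= NOTIF_OFFSET_MASK); /* must fit in 31 bits */
    return next_offset | ((uint32_t)(next_wrap ? 1u : 0u) << 31);
}

/* Extract the 31-bit next offset. */
static uint32_t notif_next_offset(uint32_t v)
{
    return v & NOTIF_OFFSET_MASK;
}

/* Extract the next wrap flag from the most significant bit. */
static bool notif_next_wrap(uint32_t v)
{
    return (v >> 31) != 0;
}
```

When VIRTIO_F_NOTIFICATION_DATA has not been negotiated, both inputs would be zero and the packed value is 0.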

\msgdef{EVENT_USED}

Expand Down Expand Up @@ -1099,7 +1158,7 @@ \subsection{Bus Messages}\label{sec:Virtio Transport Options / Virtio Over Messa
\hline
\end{tabular}

Bus message IDs below 0x80 are reserved for standardizes (but optional) bus
Bus message IDs below 0x80 are reserved for standardized (but optional) bus
messages. A few are used here and more are expected in the future. Bus message
IDs below 0x40 are used for request/response messages and 0x40 and above for
event messages.
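
The ID ranges described above can be expressed as a small classifier. The enum and function names are hypothetical, and how IDs at or above 0x80 are used is not covered by this paragraph, so the sketch only labels them as outside the standardized range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BUS_MSG_EVENT_BASE 0x40u /* standardized events start here */
#define BUS_MSG_STD_LIMIT  0x80u /* IDs below this are standardized */

enum bus_msg_class {
    BUS_MSG_STD_REQUEST,  /* 0x00..0x3f: request/response messages */
    BUS_MSG_STD_EVENT,    /* 0x40..0x7f: event messages */
    BUS_MSG_NOT_STANDARD  /* 0x80..0xff: outside the standardized range */
};

/* Classify a bus message ID by the range it falls in. */
static enum bus_msg_class bus_msg_classify(uint8_t msg_id)
{
    if (msg_id >= BUS_MSG_STD_LIMIT)
        return BUS_MSG_NOT_STANDARD;
    return (msg_id < BUS_MSG_EVENT_BASE) ? BUS_MSG_STD_REQUEST
                                         : BUS_MSG_STD_EVENT;
}

/* For a standardized ID, report whether it is a one-way event. */
static bool bus_msg_is_event(uint8_t msg_id)
{
    assert(msg_id < BUS_MSG_STD_LIMIT); /* caller must pass a standardized ID */
    return msg_id >= BUS_MSG_EVENT_BASE;
}
```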
Expand Down Expand Up @@ -1265,26 +1324,25 @@ \subsubsection{Optional Requirements}
and \busref{EVENT_DEVICE} for discovering and managing devices in a
message-driven manner. However, this is not mandatory if other enumeration
methods (e.g., device tree, ACPI, hypervisor firmware) are used.
\item If a bus \emph{chooses} to implement these messages, it \textbf{MUST} do
\item If a bus chooses to implement these messages, it \textbf{MUST} do
so in compliance with their defined formats and semantics (see
\ref{sec:Virtio Transport Options / Virtio Over Messages / Bus Messages}).
\end{itemize}

\paragraph{Optional Bus-Level Messages}
\begin{itemize}
\item \busref{PING} for keepalive or health checks is also \emph{MAY}
implement. If used, both sides \textbf{MUST} echo the 32-bit data field
precisely.
\item \busref{PING} \emph{MAY} be implemented for keepalive or health checks.
If used, both sides \textbf{MUST} echo the 32-bit data field precisely.
\end{itemize}

\paragraph{Runtime Notifications}
\begin{itemize}
\item A device or the driver side bus \emph{MUST} send \msgref{EVENT_CONFIG}
to inform the driver of configuration or device status changes.
\item A device or the driver side bus \emph{MUST} \msgref{EVENT_USED}
\item A device or the driver side bus \emph{MUST} send \msgref{EVENT_USED}
to inform the driver of (likely) buffer completions.
\item A driver \emph{MUST} send \msgref{EVENT_AVAIL} to notify the device that
new buffers are posted.
new buffers are available.
\end{itemize}

\subsubsection{Compliance for Different Environments}
Expand Down