Conversation
cloudManualExternal.md
Outdated
|
|
||
| ### The Instance Type Syncer | ||
|
|
||
| Brev runs a **continuous synchronization process** that periodically queries your API to understand what compute is available. This isn't a one-time import—it's an ongoing reconciliation. |
There was a problem hiding this comment.
Can we remove the LLM-speak?
This isn't a one-time import—it's an ongoing reconciliation.
cloudManualExternal.md
Outdated
| │ │ | ||
| └──────────────────────────────────────────────────────────────────────────────────┘ | ||
|
|
||
| --- |
There was a problem hiding this comment.
This line look superfluous.
cloudManualExternal.md
Outdated
| │ │ ┌─────────┐ ┌───────────┐ ┌─────────▼───┐ ┌───────────┐ ┌──────────────┐ │ │ | ||
| │ │ │ A │ │ │ B │ │ C │ │ D │ │ E │ │ │ | ||
| │ │ │ Provider│ │ Provider │ │ Provider │ │ Provider │ │ Provider │ │ │ | ||
| │ │ └────┬────┘ └─────┬─────┘ └──────┬──────┘ └─────┬─────┘ └──────┬───────┘ │ │ |
There was a problem hiding this comment.
Some visual quirks here (the arrow above "C", the extra "|" in the "A" box, the line under "E").
cloudManualExternal.md
Outdated
| ┌─────────────────────────────────────────────────────────────────────────────────┐ | ||
| │ Brev Control Plane │ | ||
| │ ┌───────────────────────────────────────────────────────────────────────────┐ │ | ||
| │ │ Syncer Layer │ │ |
There was a problem hiding this comment.
What about the rest of the integration (direct API calls, not just syncing)?
cloudManualExternal.md
Outdated
| Brev runs a **continuous synchronization process** that periodically queries your API to understand what compute is available. This isn't a one-time import—it's an ongoing reconciliation. | ||
|
|
||
| **Sync Behavior:** | ||
| - Polls your instance type listing API at regular intervals (typically every 1-5 minutes) |
There was a problem hiding this comment.
The polling is up to the user -- there is a default, but we should probably state that the poll rate is up to the implementer.
cloudManualExternal.md
Outdated
| Brev may set tags/labels on instances for identification: | ||
|
|
||
| | Tag | Value | Purpose | | ||
| |-----|-------|---------| | ||
| | `brev-instance-id` | Brev's internal ID | Cross-reference | | ||
| | `Name` | User-specified | Display name | |
There was a problem hiding this comment.
This is up to the implementer.
cloudManualExternal.md
Outdated
| - Updating tags on running instances | ||
| - Querying instances by tag (helpful but not required) | ||
|
|
||
| If you don't support tags, we track the mapping on our side. |
cloudManualExternal.md
Outdated
| | **Out of Stock** | No capacity in region | Return specific error code | | ||
| | **Quota Exceeded** | Hit account limit | Return quota error | | ||
| | **Invalid Request** | Bad instance type | Return validation error | | ||
| | **Auth Failed** | Bad API key | Return 401/403 | | ||
| | **Internal Error** | Your system issue | Return 500 with details | |
There was a problem hiding this comment.
We should link to the actual consts here.
cloudManualExternal.md
Outdated
| ### Out of Stock Handling | ||
|
|
||
| "Out of stock" is common with GPUs. Ideal handling: | ||
| 1. Your API returns a clear "no capacity" error |
There was a problem hiding this comment.
We have a very specific error we need to do so. We should indicate that here.
cloudManualExternal.md
Outdated
| | Field | Format | Example | | ||
| |-------|--------|---------| | ||
| | **Hourly price** | Cents (integer) | `3200` = $32.00/hr | | ||
| | **Currency** | USD assumed | - | |
cloudManualExternal.md
Outdated
|
|
||
| ## 10. Instance Metadata and Tags | ||
|
|
||
| Brev uses tags to track and correlate instances. Your API must support setting tags at creation and reading them back. |
There was a problem hiding this comment.
line 761 says the Tag Capability is optional, but we require it here. Either move it to required or clarify that tagging and this tagging are different?
There was a problem hiding this comment.
Yep made some clarification. Great Catch
|
|
||
| **Option 2: Define Your Own Format** | ||
|
|
||
| If you prefer a different ID format, set `ID` directly: |
There was a problem hiding this comment.
point to ValidateStableInstanceTypeIDs to get confidence in your type creation
|
|
||
| > **Warning**: This is the most common cause of integration failures. Instance types may sync successfully but instances fail to provision or appear "orphaned." | ||
|
|
||
| When Brev provisions an instance, it looks up the corresponding instance type using the instance's `InstanceTypeID`. **These IDs must match exactly.** |
There was a problem hiding this comment.
Point to making sure ValidateCreateInstance is run in test suite to gain confidence. There we validate that the types on instance and instance type are the same
cloudManualExternal.md
Outdated
|
|
||
| ### Why GPU Normalization Matters | ||
|
|
||
| Users search for GPUs by model. They want "H100" not "NVIDIA H100 80GB HBM3 SXM5 Accelerator". Your provider implementation must normalize GPU data into the SDK's structured `GPU` type. |
There was a problem hiding this comment.
We need to actually define the normalization more explicitly. We say that it matters and to normalize but we don't define it beyond base model uppercase. Base model is obvious for some but is unclear for others like A100. We should consider providing in code mapping of all the names or a stronger formalization of gpu name = NVIDIA GPU family point to doc.
| ### Key Points | ||
|
|
||
| - `Name`: base model only (`"H100"` not `"NVIDIA H100 80GB"`) | ||
| - `NetworkDetails`: `"SXM"`, `"SXM4"`, `"SXM5"`, or `"PCIe"` |
There was a problem hiding this comment.
would be useful to also formalize this string
cloudManualExternal.md
Outdated
| 1. Accept this public key in your create instance API | ||
| 2. Install it in the VM's default user `~/.ssh/authorized_keys` before the instance becomes accessible | ||
|
|
||
| Brev generates a unique SSH key pair for each instance. The control plane retains the private key and uses it to connect after creation. |
There was a problem hiding this comment.
this is not the case, it is per user pub key
There was a problem hiding this comment.
how about we say that brev manages the ssh key per instance
|
|
||
| ### Start Instance (Optional) | ||
|
|
||
| **Capability:** `CapabilityStopStartInstance` |
|
|
||
| Brev checks capabilities before calling optional methods. If you don't declare a capability, Brev won't attempt that operation. | ||
|
|
||
| > **Note on `CapabilityTags`:** This capability is optional, but `RefID` and `CloudCredRefID` data is **required** regardless. If your API doesn't support tags, you must use an alternative mechanism to store and retrieve this data. See [Section 10: Instance Metadata and Tags](#10-instance-metadata-and-tags) for details and examples. |
There was a problem hiding this comment.
This is super important. I think its kind of irrelevant to Tags, we need to put it in more places(in comments on struct) that the instance refid and cloudcredrefid is mandatory. You just have to find a way to return it
| ```go | ||
| // Core pattern | ||
| commands := []string{ | ||
| "ufw --force reset", |
There was a problem hiding this comment.
This needs to be updated with latest commands. lets point to the new tests that capture this requirement. Or we should just extract this logic into a library (nebius and shadeform) and tell folks to use it if its a user managed firewall.
|
|
||
| ### Required Instance Data | ||
|
|
||
| These values **MUST** be stored with the instance and returned in `GetInstance`/`ListInstances`: |
There was a problem hiding this comment.
Let point to the tests that check this I believe its ValidateCreateInstance
| **Example (Lambda Labs without tags):** | ||
| ```go | ||
| // At creation - encode CloudCredRefID in instance name | ||
| name := fmt.Sprintf("%s--%s", c.GetReferenceID(), time.Now().UTC().Format(timeFormat)) |
There was a problem hiding this comment.
This is helpful. We can point to lambda labs example here. The datetime might throw people off. Thats because lambdalabs api does not have create time on its instance struct which is also helpful for us to store.
9a36471 to
eb26ffd
Compare
Documentation for third-party cloud providers integrating with Brev's
control plane. Covers the complete integration requirements including
API patterns, instance type modeling, credential configuration, SSH key
injection, provisioning lifecycle, and network/firewall requirements.