Architecture
Hedgehog Open Network Fabric provides physical networking for compute and AI (GPU) clouds.
The Compute Cluster
The Compute Cluster contains your servers and processing units (GPUs, FPGAs, etc.). These devices connect to top-of-rack switches over high-speed Ethernet (10-800GbE). Compute Clusters can run any operating system and any application stack.
The Fabric Cluster
The Fabric Cluster is composed of Hedgehog SONiC network devices, including Ethernet switches, processing nodes, and service nodes. It provides all of the services needed to support one or more Compute Clusters.
Fabric Components
Open Network Fabric combines Ethernet switches, traditional x86/x64 servers, and acceleration devices such as DPUs and SmartNICs.
Ethernet Switch Fabric
Switch nodes are Ethernet switches running the Hedgehog SONiC NOS. They operate at 10-400 Gbps to serve the most demanding AI and ML applications.
Control Nodes
Control Nodes are lightweight compute devices responsible for running the Kubernetes control plane. They are created by the Fabric Designer and are typically connected to the management network to assist with ZTP/ONIE and attestation.
DPU/IPU/SmartNIC
These PCI cards are installed in Processing Nodes and provide acceleration resources to the Fabric Cluster. The services they accelerate typically include load balancers, firewalls, VPNs, and more.
Processing Nodes
Standard servers can be connected to the Fabric Cluster to provide additional network services, for example edge gateways, API gateways, and more.
Fabric Control Plane
The Kubernetes control plane distributes network services and configurations to each managed device. With this design, cloud infrastructure teams can update the network as easily as any cloud native application.
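The distribution idea described above can be sketched as a small reconcile loop: compare the configuration each device should have with what it currently has, and push only the differences. This is a minimal illustration in plain Python, not Hedgehog's implementation; the `Device` class, the `reconcile` function, and the shape of the config dictionary are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """A managed fabric device (hypothetical model, not a Hedgehog type)."""
    name: str
    applied: dict = field(default_factory=dict)  # config currently on the device


def reconcile(desired: dict, devices: list) -> list:
    """Push the desired config to every device whose applied config differs.

    Returns the names of the devices that were updated.
    """
    updated = []
    for device in devices:
        if device.applied != desired:
            device.applied = dict(desired)  # copy so devices do not share state
            updated.append(device.name)
    return updated


desired = {"vpc1": "192.169.0.0/16"}
devices = [Device("switch1"), Device("switch2", applied=dict(desired))]

first = reconcile(desired, devices)   # only switch1 is out of date
second = reconcile(desired, devices)  # nothing left to do: the loop is idempotent
print(first, second)
```

Running the loop twice shows the property that makes this style of control plane easy to operate: once every device matches the desired state, a repeated pass changes nothing.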
Extensible Fabric Operator
Hedgehog provides an abstraction for simplified operations and services. It comprises an operational model and a services model, which span four areas:
SOFTWARE: Initialization, Distribution, Resilience, Scaling, Smart Updates
CONFIG: Config Distribution, Failsafe/Fusing, Exceptions
NETWORK: Network Definition, Policy, Security Services
OPERATIONS: Ops Distribution, Observability, Debug Services, Legacy Integration
Infrastructure as Code
These models can be deployed and modified with Kubernetes CRDs. VPC-like logical models can quickly be created and deployed with all the necessary services.
Fabric Topology (YAML)
apiVersion: fabric.githedgehog.com/v1alpha1
kind: VPC
metadata:
  name: vpc1
spec:
  ipAddressBlock: 192.169.0.0/16
---
apiVersion: fabric.githedgehog.com/v1alpha1
kind: ServerPort
metadata:
  name: server1-port1
spec:
  unbundled:
    - nicName: eth0
      nicPortIndex: 1
      nicIndex: 0
      neighbor:
        - switch:
            - name: switch1
              port: port1
---
apiVersion: fabric.githedgehog.com/v1alpha1
kind: VPCMember
metadata:
  labels:
    fabric.githedgehog.com/server: server1
    fabric.githedgehog.com/vpc: vpc1
  name: vpc1-server1
...
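Because these objects are ordinary Kubernetes resources, they can also be generated by a script instead of written by hand. The sketch below builds the VPC and VPCMember objects from the example using only the Python standard library and prints them as JSON, which Kubernetes tooling accepts alongside YAML. The helper names `vpc_manifest` and `vpc_member_manifest` are hypothetical, and the VPCMember `spec` is left out on purpose because the example elides it.

```python
import ipaddress
import json

API_VERSION = "fabric.githedgehog.com/v1alpha1"  # taken from the example above


def vpc_manifest(name: str, cidr: str) -> dict:
    """Build a VPC object shaped like the example; rejects malformed CIDR blocks."""
    ipaddress.ip_network(cidr)  # raises ValueError if cidr is not a valid network
    return {
        "apiVersion": API_VERSION,
        "kind": "VPC",
        "metadata": {"name": name},
        "spec": {"ipAddressBlock": cidr},
    }


def vpc_member_manifest(vpc: str, server: str) -> dict:
    """Build a VPCMember using only the metadata shown in the example."""
    return {
        "apiVersion": API_VERSION,
        "kind": "VPCMember",
        "metadata": {
            "name": f"{vpc}-{server}",
            "labels": {
                "fabric.githedgehog.com/server": server,
                "fabric.githedgehog.com/vpc": vpc,
            },
        },
    }


manifests = [
    vpc_manifest("vpc1", "192.169.0.0/16"),
    vpc_member_manifest("vpc1", "server1"),
]
print(json.dumps(manifests, indent=2))
```

Validating the address block at generation time catches a mistyped CIDR before the object ever reaches the cluster.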
Programmability
A Hedgehog network is a real Kubernetes cluster, which means you can use all of your favorite CI/CD methods to operate it.
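One way a CI/CD pipeline might take advantage of this is to lint fabric manifests before they are applied. The following sketch is a hypothetical pre-merge check, not a Hedgehog tool: it assumes the manifests have already been parsed into dictionaries and flags any VPCMember whose `fabric.githedgehog.com/vpc` label names a VPC that is not defined in the same change. Whether the fabric itself performs a similar validation is not stated here.

```python
VPC_LABEL = "fabric.githedgehog.com/vpc"  # label key from the VPCMember example


def dangling_members(manifests: list) -> list:
    """Return names of VPCMember objects whose VPC label names no defined VPC."""
    vpcs = {m["metadata"]["name"] for m in manifests if m.get("kind") == "VPC"}
    return [
        m["metadata"]["name"]
        for m in manifests
        if m.get("kind") == "VPCMember"
        and m["metadata"].get("labels", {}).get(VPC_LABEL) not in vpcs
    ]


manifests = [
    {"kind": "VPC", "metadata": {"name": "vpc1"}},
    {"kind": "VPCMember",
     "metadata": {"name": "vpc1-server1", "labels": {VPC_LABEL: "vpc1"}}},
    {"kind": "VPCMember",
     "metadata": {"name": "vpc2-server1", "labels": {VPC_LABEL: "vpc2"}}},
]

problems = dangling_members(manifests)
print(problems)  # a CI job would fail the pipeline if this list is non-empty
```

A check like this runs in seconds on every pull request, so a broken reference is rejected at review time instead of surfacing on the live network.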