Architecture

Hedgehog Open Network Fabric provides physical networking for compute and AI (GPU) clouds.

Certified Platforms

The Compute Cluster

The Compute Cluster contains your servers and processing units (GPUs, FPGAs, etc.). These devices connect to top-of-rack switches over high-speed Ethernet (10-800GbE). A Compute Cluster can run any operating system and any application stack.

The Fabric Cluster

The Fabric Cluster is composed of Hedgehog SONiC network devices, including Ethernet switches, processing nodes, and service nodes. It provides all of the services needed to support one or more Compute Clusters.

Fabric Components

Open Network Fabric combines Ethernet switches, traditional x86/x64 servers, and acceleration devices such as DPUs and SmartNICs.

Ethernet Switch Fabric

Switch nodes are Ethernet switches running the Hedgehog SONiC NOS. They provide 10-400 Gbps connectivity for the most demanding AI and ML applications.

Control Nodes

Control Nodes are lightweight compute devices responsible for running the Kubernetes control plane. They are created by the Fabric Designer and are typically connected to the management network to assist with ZTP/ONIE and attestation.

DPU/IPU/SmartNIC

These PCI cards are installed in Processing Nodes and provide acceleration resources to the Fabric Cluster. Services that run on them typically include load balancers, firewalls, VPNs, and more.

Processing Nodes

Standard servers can be connected to the Fabric Cluster to provide additional network services, such as edge gateways and API gateways.

Fabric Control Plane

The Kubernetes control plane distributes network services and configurations to each managed device. With this design, cloud infrastructure teams can update the network as easily as any cloud-native application.

Extensible Fabric Operator

Hedgehog provides an abstraction for simplified operations and services, organized as an operational model and a services model.

Software: Initialization, Distribution, Resilience, Scaling, Smart Updates

Config: Config Distribution, Failsafe/Fusing, Exceptions

Network: Network Definition, Policy, Security Services

Operations: Ops Distribution, Observability, Debug Services, Legacy Integration

Infrastructure as Code

These models can be deployed and modified with Kubernetes CRDs. VPC-like logical models can be created quickly and deployed with all the necessary services.

Fabric Topology (YAML)

apiVersion: fabric.githedgehog.com/v1alpha1
kind: VPC
metadata:
  name: vpc1
spec:
  ipAddressBlock: 192.169.0.0/16
---
apiVersion: fabric.githedgehog.com/v1alpha1
kind: ServerPort
metadata:
  name: server1-port1
spec:
  unbundled:
  - nicName: eth0
    nicPortIndex: 1
    nicIndex: 0
    neighbor:
    - switch:
      - name: switch1
        port: port1
---
apiVersion: fabric.githedgehog.com/v1alpha1
kind: VPCMember
metadata:
  labels:
    fabric.githedgehog.com/server: server1
    fabric.githedgehog.com/vpc: vpc1
  name: vpc1-server1

...

Programmability

A Hedgehog network is a real Kubernetes cluster, which means you can use all of your favorite CI/CD methods to operate it.
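
Because the fabric is described by Kubernetes resources, a network change can be a small manifest kept in version control and rolled out by a pipeline. The sketch below shows what such a change might look like: attaching a second server to the existing vpc1. It reuses only the VPCMember kind and label keys from the Fabric Topology example; the names server2 and vpc1-server2 are hypothetical, and it assumes server2's port has already been described with its own ServerPort resource. The actual schema may require additional fields.

```yaml
# Hypothetical change: add server2 as a member of vpc1.
# Kind, apiVersion, and label keys are taken from the example above;
# "server2" and "vpc1-server2" are illustrative names only.
apiVersion: fabric.githedgehog.com/v1alpha1
kind: VPCMember
metadata:
  labels:
    fabric.githedgehog.com/server: server2
    fabric.githedgehog.com/vpc: vpc1
  name: vpc1-server2
```

A manifest like this could be reviewed as an ordinary change and applied with whatever Kubernetes deployment tooling the team already uses.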