HashiCorp Nomad - A Simple and Efficient Orchestrator
Introduction
HashiCorp Nomad is a flexible workload orchestrator that positions itself as a simpler alternative to Kubernetes for many use cases. Unlike K8s, Nomad can orchestrate not only containers, but also native applications, VMs and batch tasks.
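To make that multi-workload claim concrete, here is a minimal sketch of a job that runs a plain host binary with the exec driver, with no container runtime involved (the binary path and arguments are placeholders):
job "hello-binary" {
  datacenters = ["dc1"]
  type        = "service"

  group "app" {
    count = 1

    task "run" {
      # "exec" launches a host binary in an isolated environment; no container image needed
      driver = "exec"

      config {
        command = "/usr/local/bin/my-app"   # placeholder path to a native binary
        args    = ["--port", "8080"]
      }

      resources {
        cpu    = 100
        memory = 64
      }
    }
  }
}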
Nomad Philosophy
"Simple, flexible and production-ready out of the box - without the complexity of Kubernetes."
Why Nomad rather than Kubernetes?
Advantages of Nomad
- Installation: single binary, no dependencies
- Configuration: simple, readable HCL files
- Learning curve: far more approachable
- Maintenance: fewer components to manage
- Multi-workload: containers, binaries, VMs, Java, etc.
- Multi-platform: Linux, Windows, macOS
- Multi-cloud: AWS, Azure, GCP, on-premise
- Hybrid: mix of containers and VMs in the same cluster
- Lightweight: low memory/CPU footprint
- Fast: quicker deployments than K8s
- Efficient: intelligent bin packing of resources
- Scalable: up to 10k nodes with ease
- Stable: fewer breaking changes
- Secure: native mutual TLS, built-in ACLs
- Observable: native metrics and logs
- Resilient: auto-healing and rolling updates
Detailed comparison
| Criterion | Nomad | Kubernetes | Docker Swarm |
|---|---|---|---|
| Complexity | ⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐ |
| Flexibility | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ |
| Ecosystem | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ |
| Performance | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ |
| Multi-workload | ⭐⭐⭐⭐⭐ | ⭐⭐ | ⭐ |
| Learning curve | ⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐ |
Nomad Architecture
graph TB
subgraph "Nomad Cluster"
subgraph "Server Nodes (3+)"
Leader["Leader Server<br/>🎯 Scheduler"]
Server2["Server 2<br/>🔄 Follower"]
Server3["Server 3<br/>🔄 Follower"]
Leader -.->|Raft Consensus| Server2
Leader -.->|Raft Consensus| Server3
end
subgraph "Client Nodes"
Client1["Client Node 1<br/>🖥️ Worker"]
Client2["Client Node 2<br/>🖥️ Worker"]
Client3["Client Node 3<br/>🖥️ Worker"]
end
subgraph "Jobs & Allocations"
Job1["Web App Job<br/>📦 3 replicas"]
Job2["API Job<br/>🔌 2 replicas"]
Job3["Worker Job<br/>⚙️ 5 replicas"]
end
end
subgraph "HashiCorp Stack Integration"
Consul["Consul<br/>🔍 Service Discovery"]
Vault["Vault<br/>🔐 Secrets Management"]
Terraform["Terraform<br/>🏗️ Infrastructure"]
end
%% Connections
Leader --> Client1
Leader --> Client2
Leader --> Client3
Client1 --> Job1
Client2 --> Job2
Client3 --> Job3
%% Stack integration
Leader -.->|Service Registration| Consul
Client1 -.->|Secret Injection| Vault
Terraform -.->|Provision| Leader
classDef server fill:#4f46e5,stroke:#3730a3,color:#fff
classDef client fill:#10b981,stroke:#047857,color:#fff
classDef job fill:#f59e0b,stroke:#d97706,color:#000
classDef stack fill:#ef4444,stroke:#dc2626,color:#fff
class Leader,Server2,Server3 server
class Client1,Client2,Client3 client
class Job1,Job2,Job3 job
class Consul,Vault,Terraform stack
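The "Jobs & Allocations" box in the diagram reflects Nomad's object model: a job declares groups, a group declares tasks, and the scheduler turns each group instance into an allocation placed on a client node. A skeletal job spec makes that hierarchy explicit:
job "example" {            # job: the declared desired state
  datacenters = ["dc1"]

  group "web" {            # group: tasks co-scheduled on the same client
    count = 3              # the scheduler creates 3 allocations for this group

    task "server" {        # task: the unit of work executed by a driver
      driver = "docker"
      config {
        image = "nginx:alpine"
      }
    }
  }
}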
Installation & Configuration
Installing Nomad
# /etc/systemd/system/nomad.service
[Unit]
Description=Nomad
Documentation=https://www.nomadproject.io/
Wants=network-online.target
After=network-online.target
ConditionFileNotEmpty=/etc/nomad.d/nomad.hcl
[Service]
Type=notify
User=nomad
Group=nomad
ExecStart=/usr/local/bin/nomad agent -config=/etc/nomad.d/
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
Server Configuration
# Nomad server configuration
datacenter = "dc1"
data_dir = "/opt/nomad/data"
log_level = "INFO"
node_name = "nomad-server-1"
bind_addr = "0.0.0.0"
server {
enabled = true
bootstrap_expect = 3 # Number of servers in the cluster
# Encrypt gossip communications
encrypt = "cg8StVXbQJ0gPvMd9o7yrg=="
# Server join configuration
server_join {
retry_join = ["nomad-server-1:4648", "nomad-server-2:4648", "nomad-server-3:4648"]
}
}
# ACLs (optional but recommended)
acl {
enabled = true
}
# TLS Configuration (production)
tls {
http = true
rpc = true
ca_file = "/etc/nomad.d/certs/ca.pem"
cert_file = "/etc/nomad.d/certs/server.pem"
key_file = "/etc/nomad.d/certs/server-key.pem"
verify_server_hostname = true
verify_https_client = true
}
# Consul integration
consul {
address = "127.0.0.1:8500"
# Auto-advertise services
auto_advertise = true
server_auto_join = true
client_auto_join = true
# Service tags
tags = ["nomad", "server"]
}
# Metrics
telemetry {
collection_interval = "10s"
disable_hostname = true
prometheus_metrics = true
publish_allocation_metrics = true
publish_node_metrics = true
}
# Web UI
ui {
enabled = true
# Consul/Vault integration in UI
consul {
ui_url = "http://consul.service.consul:8500/ui"
}
vault {
ui_url = "http://vault.service.consul:8200/ui"
}
}
Client Configuration
# Nomad client configuration
datacenter = "dc1"
data_dir = "/opt/nomad/data"
log_level = "INFO"
node_name = "nomad-client-1"
bind_addr = "0.0.0.0"
client {
enabled = true
# Server addresses to join
servers = ["nomad-server-1:4647", "nomad-server-2:4647", "nomad-server-3:4647"]
# Node configuration
node_class = "compute"
# Metadata for job constraints
meta {
"type" = "compute"
"zone" = "us-west-1a"
"instance_type" = "m5.large"
}
# Resource configuration
reserved {
cpu = 500 # MHz reserved for system
memory = 512 # MB reserved for system
disk = 1024 # MB reserved for system
}
# Network configuration
network_interface = "eth0"
# Host volumes
host_volume "docker-sock" {
path = "/var/run/docker.sock"
read_only = true
}
host_volume "logs" {
path = "/var/log"
read_only = false
}
}
# Plugin configuration
plugin "docker" {
config {
allow_privileged = false
allow_caps = ["audit_write", "chown", "dac_override"]
# Image and container garbage collection
gc {
image = true
image_delay = "10m"
container = true
}
# Volume mounts
volumes {
enabled = true
}
}
}
plugin "raw_exec" {
config {
enabled = false # Disabled by default for security
}
}
# Consul integration
consul {
address = "127.0.0.1:8500"
auto_advertise = true
client_auto_join = true
tags = ["nomad", "client"]
}
# Vault integration
vault {
enabled = true
address = "http://vault.service.consul:8200"
# Task identity
task_token_ttl = "1h"
create_from_role = "nomad-cluster"
}
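The host volumes declared above ("docker-sock", "logs") only become usable once a job claims them. A minimal sketch of the matching volume and volume_mount stanzas, with an illustrative job name and image:
job "log-shipper" {
  datacenters = ["dc1"]

  group "shipper" {
    # Claim the host volume declared in the client configuration
    volume "logs" {
      type      = "host"
      source    = "logs"      # must match the host_volume name on the client
      read_only = true
    }

    task "ship" {
      driver = "docker"

      config {
        image   = "busybox:latest"
        command = "tail"
        args    = ["-F", "/host-logs/syslog"]   # illustrative file
      }

      # Mount the claimed volume into the task filesystem
      volume_mount {
        volume      = "logs"
        destination = "/host-logs"
        read_only   = true
      }
    }
  }
}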
Jobs & Workloads
Web Application Job
job "web-app" {
datacenters = ["dc1"]
type = "service"
# Update strategy
update {
max_parallel = 2
min_healthy_time = "10s"
healthy_deadline = "3m"
progress_deadline = "10m"
auto_revert = true
canary = 2
}
group "frontend" {
count = 3
# Networking
network {
port "http" {
to = 8080
}
}
# Service discovery
service {
name = "web-app"
port = "http"
tags = [
"frontend",
"traefik.enable=true",
"traefik.http.routers.webapp.rule=Host(`app.example.com`)"
]
check {
type = "http"
path = "/health"
interval = "10s"
timeout = "2s"
}
}
# Restart policy
restart {
attempts = 3
interval = "5m"
delay = "25s"
mode = "fail"
}
# Task definition
task "web" {
driver = "docker"
config {
image = "nginx:alpine"
ports = ["http"]
mount {
type = "bind"
source = "local/nginx.conf"
target = "/etc/nginx/nginx.conf"
}
}
# Template configuration
template {
data = <<EOH
events {
worker_connections 1024;
}
http {
upstream backend {
{{range service "api"}}
server {{.Address}}:{{.Port}};
{{end}}
}
server {
listen 8080;
location /api/ {
proxy_pass http://backend/;
}
location /health {
return 200 "OK";
}
}
}
EOH
destination = "local/nginx.conf"
change_mode = "restart"
}
# Resources
resources {
cpu = 100 # MHz
memory = 128 # MB
}
# Environment variables from Vault
vault {
policies = ["web-app"]
}
template {
data = <<EOH
{{with secret "secret/web-app"}}
API_KEY="{{.Data.api_key}}"
DB_PASSWORD="{{.Data.db_password}}"
{{end}}
EOH
destination = "secrets/app.env"
env = true
}
}
}
}
API Backend Job
job "api" {
datacenters = ["dc1"]
type = "service"
group "backend" {
count = 2
network {
port "api" {
to = 3000
}
}
service {
name = "api"
port = "api"
tags = ["backend", "api"]
check {
type = "http"
path = "/health"
interval = "30s"
timeout = "5s"
}
}
task "api-server" {
driver = "docker"
config {
image = "node:18-alpine"
ports = ["api"]
command = "node"
args = ["server.js"]
work_dir = "/app"
mount {
type = "bind"
source = "local/app"
target = "/app"
}
}
# Artifact download
artifact {
source = "https://github.com/company/api/archive/v1.2.3.tar.gz"
destination = "local/app"
options {
checksum = "sha256:abc123..."
}
}
# Database connection
template {
data = <<EOH
{{with service "postgres"}}
{{with index . 0}}
DATABASE_URL="postgres://user:pass@{{.Address}}:{{.Port}}/mydb"
{{end}}
{{end}}
NODE_ENV="production"
PORT="3000"
EOH
destination = "local/app/.env"
change_mode = "restart"
}
resources {
cpu = 500
memory = 512
}
}
}
}
Batch Processing Job
job "data-processing" {
datacenters = ["dc1"]
type = "batch"
# Parameterized job
parameterized {
payload = "required"
meta_required = ["input_file", "output_bucket"]
}
group "processor" {
count = 1
restart {
attempts = 2
delay = "30s"
mode = "fail"
}
task "process" {
driver = "docker"
config {
image = "python:3.11-slim"
command = "python"
args = ["process.py", "${NOMAD_META_input_file}"]
}
# Script artifact
artifact {
source = "s3://my-bucket/scripts/process.py"
destination = "local/"
}
# AWS credentials from Vault
vault {
policies = ["data-processor"]
}
template {
data = <<EOH
{{with secret "aws/creds/data-processor"}}
AWS_ACCESS_KEY_ID="{{.Data.access_key}}"
AWS_SECRET_ACCESS_KEY="{{.Data.secret_key}}"
{{end}}
OUTPUT_BUCKET="${NOMAD_META_output_bucket}"
EOH
destination = "secrets/aws.env"
env = true
}
resources {
cpu = 1000
memory = 2048
}
}
}
}
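Parameterized jobs like this one are dispatched on demand; for recurring batch work, Nomad's periodic stanza is the usual counterpart (a cron-style schedule). A minimal sketch, with an illustrative schedule and command:
job "nightly-cleanup" {
  datacenters = ["dc1"]
  type        = "batch"

  # Run every night at 02:00 and skip a run if the previous one is still going
  periodic {
    cron             = "0 2 * * *"
    prohibit_overlap = true
  }

  group "cleanup" {
    task "purge" {
      driver = "docker"

      config {
        image   = "alpine:3.19"
        command = "/bin/sh"
        args    = ["-c", "echo 'purging old artifacts'"]   # placeholder command
      }

      resources {
        cpu    = 100
        memory = 128
      }
    }
  }
}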
HashiCorp Stack Integration
Consul Service Discovery
job "microservice" {
group "app" {
network {
mode = "bridge"
port "http" {}
port "grpc" {}
}
# Main service, with its Consul Connect sidecar (service mesh)
service {
name = "user-service"
port = "http"
tags = [
"api",
"version-v1.2.0",
"traefik.enable=true"
]
meta {
version = "1.2.0"
team = "platform"
}
# Multiple health checks
check {
name = "HTTP Health"
type = "http"
path = "/health"
interval = "10s"
timeout = "3s"
}
check {
name = "gRPC Health"
type = "grpc"
port = "grpc"
interval = "15s"
timeout = "3s"
}
# Connect sidecar proxy and its upstreams (Consul registers the proxy service automatically)
connect {
sidecar_service {
proxy {
upstreams {
destination_name = "database"
local_bind_port = 5432
}
upstreams {
destination_name = "auth-service"
local_bind_port = 8080
}
}
}
}
}
task "app" {
driver = "docker"
config {
image = "user-service:v1.2.0"
ports = ["http", "grpc"]
}
# Service discovery via DNS
template {
data = <<EOH
# Services available via Consul DNS
DATABASE_HOST="database.service.consul"
AUTH_SERVICE_URL="http://auth-service.service.consul:8080"
CACHE_HOSTS="{{range service "redis"}}{{.Address}}:{{.Port}},{{end}}"
EOH
destination = "local/services.env"
env = true
}
}
}
}
Vault Secrets Management
job "secure-app" {
group "app" {
task "web" {
driver = "docker"
# Vault policy required
vault {
policies = ["app-policy"]
# Change mode when secrets rotate
change_mode = "restart"
change_signal = "SIGUSR1"
}
# Database credentials (dynamic)
template {
data = <<EOH
{{with secret "database/creds/app-role"}}
DB_USERNAME="{{.Data.username}}"
DB_PASSWORD="{{.Data.password}}"
{{end}}
EOH
destination = "secrets/db.env"
env = true
}
# Static secrets
template {
data = <<EOH
{{with secret "secret/app/config"}}
API_KEY="{{.Data.api_key}}"
ENCRYPTION_KEY="{{.Data.encryption_key}}"
{{end}}
EOH
destination = "secrets/app.env"
env = true
}
# TLS certificates
template {
data = <<EOH
{{with secret "pki/issue/app-role" "common_name=app.service.consul" "ttl=24h"}}
{{.Data.certificate}}
{{end}}
EOH
destination = "secrets/tls.crt"
perms = "400"
}
template {
data = <<EOH
{{with secret "pki/issue/app-role" "common_name=app.service.consul" "ttl=24h"}}
{{.Data.private_key}}
{{end}}
EOH
destination = "secrets/tls.key"
perms = "400"
}
config {
image = "secure-app:latest"
mount {
type = "bind"
source = "secrets/tls.crt"
target = "/etc/ssl/certs/app.crt"
}
mount {
type = "bind"
source = "secrets/tls.key"
target = "/etc/ssl/private/app.key"
}
}
}
}
}
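The job above only works if the app-policy Vault policy grants access to the paths referenced by the templates. A sketch of what that policy could look like, assuming the KV v1, database and PKI mounts used above:
# app-policy.hcl - minimal Vault policy matching the templates above
path "database/creds/app-role" {
  capabilities = ["read"]
}

path "secret/app/config" {
  capabilities = ["read"]
}

# pki/issue is a write-style endpoint: issuing a certificate requires update
path "pki/issue/app-role" {
  capabilities = ["update"]
}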
Practical Use Cases
1. Kubernetes → Nomad Migration
Context: a startup running 50 microservices on K8s, with excessive complexity.
Migration benefits:
- Ops team: 3 → 1 person
- Infrastructure cost: -40% (less overhead)
- Time to market: deployments 3x faster
- Incidents: -60% (simpler architecture)
Migration strategy:
graph LR
K8s["Kubernetes<br/>50 services"] --> Hybrid["Migration Hybride<br/>6 mois"]
Hybrid --> Nomad["Nomad<br/>50 services"]
Hybrid --> Phase1["Phase 1<br/>Services stateless"]
Hybrid --> Phase2["Phase 2<br/>Bases de données"]
Hybrid --> Phase3["Phase 3<br/>Services critiques"]
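For Phase 1 (stateless services), the translation is usually mechanical: a Kubernetes Deployment plus Service roughly becomes a single Nomad service job. A hedged sketch of that mapping for a typical HTTP microservice (names and image are illustrative):
# Roughly equivalent to a Deployment (replicas: 3) + Service + readiness probe
job "orders" {
  datacenters = ["dc1"]
  type        = "service"

  group "orders" {
    count = 3                      # Deployment.spec.replicas

    network {
      port "http" { to = 8080 }    # containerPort
    }

    service {                      # Service + readiness probe, registered in Consul
      name = "orders"
      port = "http"
      check {
        type     = "http"
        path     = "/healthz"
        interval = "10s"
        timeout  = "2s"
      }
    }

    task "app" {
      driver = "docker"
      config {
        image = "registry.example.com/orders:1.0.0"
        ports = ["http"]
      }
      resources {                  # resources.requests/limits
        cpu    = 250
        memory = 256
      }
    }
  }
}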
2. Multi-Cloud Architecture
Goal: uniform deployments across AWS + Azure + on-premise
job "global-app" {
# Multi-datacenter deployment
datacenters = ["aws-us-east-1", "azure-west-europe", "on-prem-dc1"]
# Per-provider constraint
constraint {
attribute = "${meta.cloud_provider}"
operator = "set_contains_any"
value = "aws,azure,on-prem"
}
group "app" {
# 3 instances in total, spread across the datacenters
count = 3
# Spread across datacenters
spread {
attribute = "${node.datacenter}"
weight = 100
}
task "service" {
driver = "docker"
config {
image = "app:v1.0.0"
}
# Per-provider configuration
template {
data = <<EOH
{{if eq (env "node.datacenter") "aws-us-east-1"}}
STORAGE_BACKEND="s3"
REGION="us-east-1"
{{else if eq (env "NOMAD_DC") "azure-west-europe"}}
STORAGE_BACKEND="blob"
REGION="west-europe"
{{else}}
STORAGE_BACKEND="nfs"
REGION="on-premise"
{{end}}
EOH
destination = "local/config.env"
env = true
}
}
}
}
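The ${meta.cloud_provider} constraint above assumes each client node advertises that metadata. A sketch of the corresponding client-side configuration fragment (values are per-node examples):
# Client configuration fragment on an AWS node
client {
  enabled = true

  meta {
    "cloud_provider" = "aws"
    "region"         = "us-east-1"
  }
}

# The same block on an on-premise node would set:
#   "cloud_provider" = "on-prem"
#   "region"         = "dc1"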
3. Edge Computing & IoT
Scenario: deploying edge applications to remote sites
job "edge-processor" {
datacenters = ["edge-*"] # Tous les datacenters edge
# Contraintes hardware edge
constraint {
attribute = "${meta.node_type}"
value = "edge"
}
constraint {
attribute = "${node.class}"
value = "edge-compute"
}
group "processor" {
# A single instance (use a system job to run one per edge node)
count = 1
# Local persistence
volume "edge-data" {
type = "host"
source = "edge-storage"
attachment_mode = "file-system"
access_mode = "single-node-writer"
}
task "data-processor" {
driver = "docker"
config {
image = "edge-processor:arm64-v1.0"
# Optimized for ARM64
platform = "linux/arm64"
}
volume_mount {
volume = "edge-data"
destination = "/data"
}
# Site-specific configuration
template {
data = <<EOH
SITE_ID="${meta.site_id}"
LAT="${meta.latitude}"
LON="${meta.longitude}"
SYNC_INTERVAL="300s"
CLOUD_ENDPOINT="https://central.company.com/api"
EOH
destination = "local/site.env"
env = true
}
# Limited edge resources
resources {
cpu = 200
memory = 256
}
}
}
}
Monitoring & Observability
Prometheus Metrics
job "monitoring-stack" {
group "prometheus" {
network {
port "prometheus" {
static = 9090
}
}
service {
name = "prometheus"
port = "prometheus"
tags = [
"monitoring",
"traefik.enable=true",
"traefik.http.routers.prometheus.rule=Host(`prometheus.company.com`)"
]
}
task "prometheus" {
driver = "docker"
config {
image = "prom/prometheus:latest"
ports = ["prometheus"]
args = [
"--config.file=/etc/prometheus/prometheus.yml",
"--storage.tsdb.path=/prometheus",
"--storage.tsdb.retention.time=30d",
"--web.console.libraries=/etc/prometheus/console_libraries",
"--web.console.templates=/etc/prometheus/consoles",
"--web.enable-lifecycle"
]
mount {
type = "bind"
source = "local/prometheus.yml"
target = "/etc/prometheus/prometheus.yml"
}
}
# Prometheus configuration with Nomad auto-discovery via Consul
template {
data = <<EOH
global:
scrape_interval: 15s
evaluation_interval: 15s
rule_files:
- "nomad_rules.yml"
scrape_configs:
# Nomad servers
- job_name: 'nomad-servers'
consul_sd_configs:
- server: 'localhost:8500'
services: ['nomad']
tags: ['server']
relabel_configs:
- source_labels: [__meta_consul_service_port]
target_label: __address__
replacement: '${1}:4646'
metrics_path: /v1/metrics
params:
format: ['prometheus']
# Nomad clients
- job_name: 'nomad-clients'
consul_sd_configs:
- server: 'localhost:8500'
services: ['nomad-client']
relabel_configs:
- source_labels: [__meta_consul_service_port]
target_label: __address__
replacement: '${1}:4646'
metrics_path: /v1/metrics
params:
format: ['prometheus']
# Application services
- job_name: 'services'
consul_sd_configs:
- server: 'localhost:8500'
relabel_configs:
- source_labels: [__meta_consul_service_metadata_metrics]
action: keep
regex: 'true'
- source_labels: [__meta_consul_service_metadata_metrics_path]
target_label: __metrics_path__
regex: '(.+)'
replacement: '${1}'
EOH
destination = "local/prometheus.yml"
change_mode = "restart"
}
resources {
cpu = 500
memory = 1024
}
}
}
group "grafana" {
network {
port "grafana" {
static = 3000
}
}
service {
name = "grafana"
port = "grafana"
tags = [
"monitoring",
"dashboard",
"traefik.enable=true",
"traefik.http.routers.grafana.rule=Host(`grafana.company.com`)"
]
}
task "grafana" {
driver = "docker"
config {
image = "grafana/grafana:latest"
ports = ["grafana"]
mount {
type = "bind"
source = "local/grafana.ini"
target = "/etc/grafana/grafana.ini"
}
}
# Grafana configuration
template {
data = <<EOH
[server]
http_port = 3000
[database]
type = sqlite3
path = /var/lib/grafana/grafana.db
[security]
admin_user = admin
# admin_password is supplied via the GF_SECURITY_ADMIN_PASSWORD environment variable (see the Vault template below)
[auth.anonymous]
enabled = true
org_role = Viewer
[dashboards.json]
enabled = true
path = /var/lib/grafana/dashboards
EOH
destination = "local/grafana.ini"
}
env {
GF_INSTALL_PLUGINS = "grafana-clock-panel,grafana-simple-json-datasource"
}
vault {
policies = ["grafana"]
}
template {
data = <<EOH
{{with secret "secret/grafana"}}
GF_SECURITY_ADMIN_PASSWORD="{{.Data.admin_password}}"
{{end}}
EOH
destination = "secrets/grafana.env"
env = true
}
resources {
cpu = 200
memory = 512
}
}
}
}
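The "services" scrape job above only keeps Consul services whose metadata contains metrics = "true", so an application must expose matching metadata in its Nomad service stanza to be scraped. A minimal sketch, reusing the user-service example:
job "user-service" {
  datacenters = ["dc1"]

  group "app" {
    network {
      port "http" { to = 8080 }
    }

    service {
      name = "user-service"
      port = "http"

      # Matched by the Prometheus relabel rules defined above
      meta {
        metrics      = "true"
        metrics_path = "/metrics"
      }
    }

    task "app" {
      driver = "docker"
      config {
        image = "user-service:v1.2.0"
        ports = ["http"]
      }
    }
  }
}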
Security & Production
TLS/mTLS Configuration
# Full TLS configuration
datacenter = "dc1"
data_dir = "/opt/nomad/data"
# TLS Configuration
tls {
http = true
rpc = true
ca_file = "/etc/nomad.d/certs/ca.pem"
cert_file = "/etc/nomad.d/certs/nomad.pem"
key_file = "/etc/nomad.d/certs/nomad-key.pem"
# Mutual TLS
verify_server_hostname = true
verify_https_client = true
# Strong ciphers only
tls_min_version = "tls12"
tls_cipher_suites = [
"TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305",
"TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
"TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"
]
}
# ACLs Configuration
acl {
enabled = true
token_ttl = "30s"
policy_ttl = "60s"
# Bootstrap token (rotate after initial setup)
# nomad acl bootstrap
}
# Audit logging
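# Note: audit logging requires Nomad Enterprise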
audit {
enabled = true
sink "file" {
type = "file"
format = "json"
path = "/var/log/nomad/audit.log"
delivery_guarantee = "enforced"
rotate_bytes = 100000000 # 100MB
rotate_max_files = 10
}
}
ACL Policies
# Developer policy
namespace "dev" {
policy = "write"
# Limitations
capabilities = ["submit-job", "dispatch-job", "read-logs"]
}
namespace "prod" {
policy = "deny"
}
node {
policy = "read"
}
agent {
policy = "read"
}
# Admin policy
namespace "*" {
policy = "write"
capabilities = ["*"]
}
node {
policy = "write"
}
agent {
policy = "write"
}
operator {
policy = "write"
}
quota {
policy = "write"
}
plugin {
policy = "write"
}
Production Experience
Real-World Metrics
Production cluster:
- Size: 50 nodes (3 servers + 47 clients)
- Workloads: 200+ services, 50+ batch jobs
- Uptime: 99.95% over 18 months
- MTTR: 2.3 minutes on average
- Deployments: 150+ per week
Before/after comparison of the K8s → Nomad migration:
| Metric | Kubernetes | Nomad | Improvement |
|---|---|---|---|
| Cluster setup time | 2-3 days | 2-3 hours | 90% |
| Deployment time | 5-8 minutes | 30-60 seconds | 80% |
| RAM overhead | 4-6 GB | 500 MB | 85% |
| Config complexity | 200+ lines of YAML | 50 lines of HCL | 75% |
| Learning curve | 6 months | 2 weeks | 90% |
Patterns & Anti-Patterns
✅ Best Practices:
- Declarative jobs: everything in HCL, versioned in Git
- Externalized secrets: systematic Vault integration
- Multiple health checks: HTTP + TCP + custom
- Resource constraints: always define CPU/memory
- Gradual rollouts: canary + auto-revert
❌ Anti-Patterns to Avoid:
- raw_exec driver: high security risk
- Hardcoded secrets: in templates or configs
- Jobs without resource limits: resource monopolization
- Overly aggressive update strategies: avoidable downtime
- Insufficient monitoring: alerting on basic metrics only
Optimal Use Cases
Nomad excels at:
- Mixed workloads: containers + VMs + binaries
- Mid-sized teams: 5-50 developers
- Multi-cloud/hybrid: uniform deployments
- Edge computing: constrained resources
- Operational simplicity: small ops teams
Kubernetes remains the better fit for:
- A rich ecosystem: operators, Helm charts
- Cloud-native applications: 12-factor apps
- Large teams: 100+ developers
- Specialized needs: complex service meshes, CRDs
Conclusion
HashiCorp Nomad is a solid, pragmatic alternative to Kubernetes for many use cases. Its simple installation, configuration and maintenance make it a sensible choice for teams that value operational efficiency over feature completeness.
Recommendations:
- Assess your actual needs: Nomad vs K8s depending on team size and complexity
- Start simple: a dev cluster in 30 minutes with `-dev` mode
- Integrate the HashiCorp stack: Consul + Vault + Terraform
- Invest in observability: metrics, logs, alerting
- Automate security: TLS, ACLs, secret rotation
Nomad lets you focus on your applications rather than on the orchestrator - and that is precisely its greatest strength.