Compare commits

58 Commits

Author SHA1 Message Date
Aleksandr Tcitlionok
f16a7cd995 fix(ci): latest build-push-action
All checks were successful
CI Pipeline / Build and Push Docker Image (push) Successful in 4m44s
2024-12-11 11:01:37 +00:00
Aleksandr Tcitlionok
8e176dc670 fix(ci): use vars
Some checks failed
CI Pipeline / Build and Push Docker Image (push) Failing after 2m40s
2024-12-11 10:41:58 +00:00
Aleksandr Tcitlionok
ca2826785d fix(ci): use correct image name
Some checks failed
CI Pipeline / Build and Push Docker Image (push) Failing after 27s
2024-12-11 10:31:31 +00:00
Aleksandr Tcitlionok
5c5627ef5e fix(ci): use token
Some checks failed
CI Pipeline / Build and Push Docker Image (push) Failing after 18s
2024-12-11 10:28:16 +00:00
Aleksandr Tcitlionok
98c7e13914 fix(ci): use gitea instead
Some checks failed
CI Pipeline / Build and Push Docker Image (push) Failing after 20s
2024-12-11 10:23:17 +00:00
Aleksandr Tcitlionok
87ab5bbc2a fix(ci): update
Some checks failed
CI Pipeline / Build and Push Docker Image (push) Failing after 28s
2024-12-11 10:16:36 +00:00
Aleksandr Tcitlionok
14d3f51029 fix(ci): use different var
Some checks failed
CI Pipeline / Build and Push Docker Image (push) Failing after 18s
2024-12-11 10:10:07 +00:00
Aleksandr Tcitlionok
7c8ebc4bcd debug(ci): registry check
Some checks failed
CI Pipeline / Build and Push Docker Image (push) Failing after 19s
2024-12-11 09:49:44 +00:00
Aleksandr Tcitlionok
5b5dae998c fix(ci): use vars instead of secrets
Some checks failed
CI Pipeline / Build and Push Docker Image (push) Failing after 21s
2024-12-11 09:46:23 +00:00
Aleksandr Tcitlionok
f3f6c8fd37 fix(ci): typo
Some checks failed
CI Pipeline / Build and Push Docker Image (push) Has been cancelled
2024-12-11 09:41:59 +00:00
Aleksandr Tcitlionok
d28a87a02c fix(ci): compatibility with gitea
Some checks failed
CI Pipeline / Build and Push Docker Image (push) Failing after 17s
2024-12-11 09:39:05 +00:00
Aleksandr Tcitlionok
acef6cb9a1 update(app): add helm chart and cleanup repo
Some checks failed
CI Pipeline / Build and Push Docker Image (push) Failing after 2m48s
2024-12-11 09:05:15 +00:00
Aleksandr Tcitlionok
c12ce697c0 update(doc): README.md 2024-12-10 04:21:41 +00:00
Aleksandr Tcitlionok
a663d02c3c update(app): freeze modules 2024-12-09 03:45:17 +00:00
Aleksandr Tcitlionok
f7a72612d9 update(doc): README.md and examples 2024-12-09 03:38:25 +00:00
Aleksandr Tcitlionok
dc66d5a33e fix(vm): change method from post to get 2024-12-09 03:26:29 +00:00
Aleksandr Tcitlionok
11ad665173 update(vm): use token ENV and skip dup 2024-12-09 03:21:25 +00:00
Aleksandr Tcitlionok
4242a35cf3 fix(vm): correctly calculate tz 2024-12-09 03:15:30 +00:00
Aleksandr Tcitlionok
6e80ea7773 fix(vm): use JSON instead 2024-12-09 03:11:45 +00:00
Aleksandr Tcitlionok
24ceca9e4f update(vm): remove pseudographic and add hetzner api calls 2024-12-09 03:06:37 +00:00
Aleksandr Tcitlionok
5d7ec5c363 fix vm 2024-12-06 08:14:17 +00:00
Aleksandr Tcitlionok
0e82320710 fix 2024-12-06 08:11:20 +00:00
Aleksandr Tcitlionok
418e9794be fix(think): use staging 2024-12-06 08:07:39 +00:00
Aleksandr Tcitlionok
5a1d611f57 fix(data): add missing vm_type 2024-12-06 06:42:19 +00:00
Aleksandr Tcitlionok
9ac8eb9830 fix(data): use vm_type for vms 2024-12-06 06:39:52 +00:00
Aleksandr Tcitlionok
f22781e04b update(metal): preload data 2024-12-06 06:36:57 +00:00
Aleksandr Tcitlionok
3ef8b2a4a8 update(ui): add time on duty for metal 2024-12-06 04:10:14 +00:00
Aleksandr Tcitlionok
956ad10e1c fix(ui): format 2024-12-06 04:06:05 +00:00
Aleksandr Tcitlionok
cf2a223a35 fix(ui): return namespaces 2024-12-06 04:03:52 +00:00
Aleksandr Tcitlionok
ebecb19602 fix(ui): add machines back 2024-12-06 04:00:02 +00:00
Aleksandr Tcitlionok
dd37967bf0 update(k8s): use time_on_duty in export and ui 2024-12-06 03:57:37 +00:00
Aleksandr Tcitlionok
955d7d3e55 update(k8s): add granularity 2024-12-06 03:51:48 +00:00
Aleksandr Tcitlionok
22ea154c19 update(k8s): add time on duty 2024-12-06 03:48:35 +00:00
Aleksandr Tcitlionok
1c130a9271 add(pseudo): ai summary 2024-12-05 08:45:53 +00:00
Aleksandr Tcitlionok
eaeb088204 fix(think): revert 2024-12-05 08:35:00 +00:00
Aleksandr Tcitlionok
f0c1458cde fix(think): correctly get list 2024-12-05 08:33:48 +00:00
Aleksandr Tcitlionok
f0617da839 fix(main): import think 2024-12-05 08:31:37 +00:00
Aleksandr Tcitlionok
f57e6028f0 add(think): enable think router 2024-12-05 08:29:43 +00:00
Aleksandr Tcitlionok
3ea939b7d6 update(think): update route with storage and type 2024-12-05 08:19:31 +00:00
Aleksandr Tcitlionok
289b5bb5d3 fix(pseudo): add missing conversion 2024-12-05 08:12:58 +00:00
Aleksandr Tcitlionok
dfc26d1beb fix(k8s): use ephemeral 2024-12-05 08:10:14 +00:00
Aleksandr Tcitlionok
0ec23d8122 update(k8s): add storage and instance type 2024-12-05 08:05:15 +00:00
Aleksandr Tcitlionok
55dd06584c fix(pseudo): color 2024-12-05 07:47:30 +00:00
Aleksandr Tcitlionok
633c4285d8 add(pseudo): colors 2024-12-05 07:42:41 +00:00
Aleksandr Tcitlionok
182b9c8a2b update(pseudo): add namespace and conversions 2024-12-05 07:34:47 +00:00
Aleksandr Tcitlionok
235d23d5a7 fix(export): show kib in mib 2024-12-05 07:31:52 +00:00
Aleksandr Tcitlionok
03a0c10c64 fix(k8s): convert kib to mib 2024-12-05 07:28:53 +00:00
Aleksandr Tcitlionok
38dc985545 fix(k8s): round core count 2024-12-05 07:22:32 +00:00
Aleksandr Tcitlionok
62ad85fa46 fix(k8s): conversion fix 2024-12-05 07:16:18 +00:00
Aleksandr Tcitlionok
28f6198428 fix(k8s): adapt export functions 2024-12-05 07:12:42 +00:00
Aleksandr Tcitlionok
092ac307ed fix(k8s): serialization 2024-12-05 07:00:03 +00:00
Aleksandr Tcitlionok
1db862e770 update(k8s): add namespace metrics 2024-12-05 06:56:23 +00:00
Aleksandr Tcitlionok
fb4b63270b update(k8s): calculate namespace usage 2024-12-05 06:51:57 +00:00
Aleksandr Tcitlionok
b9e4fdbf7a fix(export): use content type for YAML export 2024-12-05 06:46:54 +00:00
Aleksandr Tcitlionok
fd984b302a update(k8s): pseudo - convert kib to gib 2024-12-05 06:41:17 +00:00
Aleksandr Tcitlionok
e986230d73 fix(k8s): add missing config import 2024-12-05 06:37:04 +00:00
Aleksandr Tcitlionok
7c7cf79ca8 fix(app): use CBR and correct sa 2024-12-05 06:28:31 +00:00
Aleksandr Tcitlionok
aa17c7a2dc update(app): add examples and README file 2024-12-05 03:06:01 +00:00
26 changed files with 952 additions and 113 deletions


@@ -1 +1,2 @@
app/resources.db
.env

.gitea/workflows/ci.yaml Normal file

@@ -0,0 +1,46 @@
name: CI Pipeline
on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main
env:
  IMAGE_NAME: terghalin/metalcheck
jobs:
  build-and-push:
    name: Build and Push Docker Image
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Code
        uses: actions/checkout@v3
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Log in to the Container registry
        uses: docker/login-action@v3
        with:
          registry: ${{ vars.IMAGE_REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.CI_TOKEN }}
      - name: Cache Python dependencies
        uses: actions/cache@v3
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-
      - name: Build and Push Docker Image
        uses: docker/build-push-action@v6
        with:
          push: true
          tags: |
            ${{ vars.IMAGE_REGISTRY }}/${{ env.IMAGE_NAME }}:latest

README.md

@@ -1,3 +1,149 @@
# Metal Check
# MetalCheck Backend
Metal Check - Get tips and information about your Kubernetes and on-premise servers.
MetalCheck is a backend service that provides insights into virtual machines, physical nodes, and Kubernetes clusters.
It supports deployment in an EKS environment and offers features like data aggregation, import from APIs, and data export.
---
## Features
- **Metal Nodes**: Track hardware details like CPU, memory, and storage for physical nodes. Import node data from JSON files.
- **Virtual Machines**: Track virtual machine details like CPU, memory, and storage. Import VM data from Hetzner Cloud API or JSON files.
- **Kubernetes Clusters**: Query Kubernetes clusters to gather node and namespace data.
- **Data Export**: Export collected data in JSON or YAML format for further processing or visualization.
---
## Project Structure
```plaintext
metalcheck-backend/
├── app/
│   ├── __init__.py          # Initialization
│   ├── main.py              # FastAPI entry point
│   ├── database.py          # SQLite DB setup and operations
│   ├── routes/
│   │   ├── __init__.py      # Initialization for routes
│   │   ├── metal.py         # Routes for metal nodes
│   │   ├── vm.py            # Routes for virtual machines
│   │   ├── k8s.py           # Routes for Kubernetes clusters
│   │   ├── think.py         # Routes for AI summary
│   │   ├── export.py        # Data export routes
│   ├── examples/k8s/
│   │   ├── rbac.yaml        # RBAC configuration for Kubernetes
│   │   ├── deployment.yaml  # Deployment configuration for EKS
│   │   ├── configmap.yaml   # ConfigMap for EKS
├── Dockerfile               # Docker container configuration
├── requirements.txt         # Python dependencies
├── .dockerignore            # Files to ignore during image build
└── README.md                # Project documentation
```
# Setup and Deployment
## Prerequisites
- Python 3.10+
- Docker and kubectl installed
- Access to an EKS cluster
- AWS CLI configured with appropriate permissions
## Build and Deploy
### Build and Push Docker Image
```bash
docker build -t metalcheck-backend .
docker tag metalcheck-backend:latest <your-ecr-repo>:latest
docker push <your-ecr-repo>:latest
```
### Deploy to EKS
Apply RBAC and deployment configurations:
```bash
kubectl apply -f examples/k8s/rbac.yaml
kubectl apply -f examples/k8s/configmap.yaml
kubectl apply -f examples/k8s/deployment.yaml
```
### Access the Service
Retrieve the LoadBalancer IP:
```bash
kubectl get svc -n metalcheck
```
Test the API:
```bash
curl http://<EXTERNAL-IP>/k8s/data
```
## Kubernetes Integration
The `/k8s/data` endpoint retrieves information about:
- Nodes: CPU, memory, and allocatable pods.
- Namespaces: List of all namespaces in the cluster.
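
The response shape follows the route's own dict keys (`node_name`, `cpu`, `memory`, `pods_allocatable`); a small consumer sketch, where the values are made-up placeholders rather than real cluster output:

```python
# Illustrative consumer of GET /k8s/data; the payload is a hand-written stand-in.
sample_response = {
    "nodes": [
        {"node_name": "ip-10-0-1-23", "cpu": "4",
         "memory": "16208312Ki", "pods_allocatable": "58"},
    ],
    "namespaces": ["default", "kube-system", "metalcheck"],
}

# Pull the node names out, as a dashboard or script would.
node_names = [n["node_name"] for n in sample_response["nodes"]]
print(node_names)  # ['ip-10-0-1-23']
```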
## Endpoints
| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /metal/data | Get all physical node data |
| POST | /metal/data | Add a new physical node |
| GET | /vm/data | Get all virtual machine data |
| GET | /vm/import-hetzner | Import VM data from Hetzner API |
| POST | /vm/data | Add a new virtual machine |
| GET | /k8s/data | Get Kubernetes cluster information |
| GET | /think/k8s | Get AI summary for Kubernetes cluster |
| GET | /export | Export data in JSON or YAML format |
## Hetzner Integration
To fetch virtual machines from Hetzner Cloud:
1. Ensure your Hetzner API token is available as an environment variable:
```shell
export HETZNER_TOKEN=<your-token>
```
2. Use the /vm/import-hetzner endpoint:
```shell
curl http://<EXTERNAL-IP>/vm/import-hetzner
```
## Data Export
1. Export in YAML format:
```shell
curl -X GET "http://<EXTERNAL-IP>/export?format=yaml"
```
2. Export in JSON format:
```shell
curl -X GET "http://<EXTERNAL-IP>/export?format=json"
```
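A minimal client-side sketch for the JSON variant; the top-level keys match the export payload, while the row contents below are invented placeholders (`fetch_all` returns rows as plain lists, with the name in column 1):

```python
import json

# Stand-in for the body returned by GET /export?format=json (placeholder rows).
body = json.dumps({
    "metal_nodes": [[1, "chuo", "Tokyo", "The Company", 24, "64GB", "1TB"]],
    "virtual_machines": [],
    "kubernetes_nodes": [],
})

data = json.loads(body)
# Column 1 holds the node name, matching the table layout used in the app.
metal_names = [row[1] for row in data["metal_nodes"]]
print(metal_names)  # ['chuo']
```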
## MetalCheck Helm Chart
This Helm chart deploys the MetalCheck backend on a Kubernetes cluster.
### Installation
#### Prerequisites
- Kubernetes cluster (1.18+ recommended)
- Helm 3.x installed
- Docker image for MetalCheck backend
#### Install Chart
1. Update `values.yaml` with your specific configurations.
2. Install the chart:
```bash
helm install metalcheck ./metalcheck-helm
```

app/data/metal.json Normal file

@@ -0,0 +1,32 @@
[
  {
    "name": "chuo",
    "location": "Tokyo",
    "vendor": "The Company",
    "cpu": 24,
    "memory": "64GB",
    "storage": "1TB",
    "time_on_duty": 522,
    "initial_cost": 1900
  },
  {
    "name": "kyoto",
    "location": "Kyoto",
    "vendor": "The Company",
    "cpu": 36,
    "memory": "96GB",
    "storage": "1TB",
    "time_on_duty": 807,
    "initial_cost": 1889
  },
  {
    "name": "sa-100",
    "location": "Sapporo",
    "vendor": "The Company",
    "cpu": 8,
    "memory": "16GB",
    "storage": "1TB",
    "time_on_duty": 4401,
    "initial_cost": 849
  }
]

app/data/vm.json Normal file

@@ -0,0 +1,26 @@
[
  {
    "name": "cax-41-1",
    "location": "Helsinki",
    "cpu": 16,
    "memory": "32GB",
    "storage": "320GB SSD",
    "vm_type": "cax-2xlarge"
  },
  {
    "name": "cax-41-2",
    "location": "Helsinki",
    "cpu": 16,
    "memory": "32GB",
    "storage": "320GB SSD",
    "vm_type": "cax-2xlarge"
  },
  {
    "name": "ccx-63",
    "location": "Germany",
    "cpu": 48,
    "memory": "192GB",
    "storage": "960GB SSD",
    "vm_type": "cx-6xlarge"
  }
]


@@ -1,70 +0,0 @@
# Show pseudo-graphic tables using rich library, if there is no frontend
from rich.console import Console
from rich.table import Table
from database import fetch_all

def display_metal_nodes():
    console = Console()
    table = Table(title="Metal Nodes")
    table.add_column("ID", justify="right")
    table.add_column("Name")
    table.add_column("Location")
    table.add_column("Vendor")
    table.add_column("CPU", justify="right")
    table.add_column("Memory")
    table.add_column("Storage")
    nodes = fetch_all("metal_nodes")
    for node in nodes:
        table.add_row(
            str(node[0]), node[1], node[2], node[3],
            str(node[4]), node[5], node[6]
        )
    console.print(table)

def display_virtual_machines():
    console = Console()
    table = Table(title="Virtual Machines")
    table.add_column("ID", justify="right")
    table.add_column("Name")
    table.add_column("Location")
    table.add_column("CPU", justify="right")
    table.add_column("Memory")
    table.add_column("Storage")
    table.add_column("Type")
    vms = fetch_all("virtual_machines")
    for vm in vms:
        table.add_row(
            str(vm[0]), vm[1], vm[2],
            str(vm[3]), vm[4], vm[5], vm[6]
        )
    console.print(table)

def display_kubernetes_nodes():
    console = Console()
    table = Table(title="Kubernetes Nodes")
    table.add_column("ID", justify="right")
    table.add_column("Cluster Name")
    table.add_column("Node Name")
    table.add_column("CPU", justify="right")
    table.add_column("Memory")
    table.add_column("Storage")
    table.add_column("Type")
    table.add_column("Namespaces")
    nodes = fetch_all("kubernetes_nodes")
    for node in nodes:
        table.add_row(
            str(node[0]), node[1], node[2],
            str(node[3]), node[4], node[5],
            node[6], node[7]
        )
    console.print(table)

if __name__ == "__main__":
    display_metal_nodes()
    display_virtual_machines()


@@ -1,16 +1,83 @@
import json
import os
import logging
from fastapi import FastAPI
from database import init_db
from routes import metal, vm, k8s, export
from database import init_db, insert_metal_node, insert_virtual_machine
from fastapi.middleware.cors import CORSMiddleware
from routes import metal, vm, k8s, export, think
from pydantic import BaseModel, ValidationError

class MetalNode(BaseModel):
    name: str
    location: str
    vendor: str
    cpu: int
    memory: str
    storage: str
    time_on_duty: int
    initial_cost: float

class VirtualMachine(BaseModel):
    name: str
    location: str
    cpu: int
    memory: str
    storage: str
    vm_type: str

app = FastAPI()
init_db()
METAL_JSON = "data/metal.json"
VM_JSON = "data/vm.json"
logger = logging.getLogger("uvicorn")

@app.on_event("startup")
async def startup_event():
    init_db()
    if os.path.exists(METAL_JSON):
        with open(METAL_JSON, "r") as file:
            metal_nodes = json.load(file)
        for node in metal_nodes:
            try:
                validated_node = MetalNode(**node)
                insert_metal_node(
                    name=validated_node.name,
                    location=validated_node.location,
                    vendor=validated_node.vendor,
                    cpu=validated_node.cpu,
                    memory=validated_node.memory,
                    storage=validated_node.storage,
                    time_on_duty=validated_node.time_on_duty,
                    initial_cost=validated_node.initial_cost
                )
            except ValidationError as e:
                logger.error(f"Invalid metal node data: {e}")
    if os.path.exists(VM_JSON):
        with open(VM_JSON, "r") as file:
            vms = json.load(file)
        for vm in vms:
            try:
                validated_vm = VirtualMachine(**vm)
                insert_virtual_machine(
                    name=validated_vm.name,
                    location=validated_vm.location,
                    cpu=validated_vm.cpu,
                    memory=validated_vm.memory,
                    storage=validated_vm.storage,
                    vm_type=validated_vm.vm_type
                )
            except ValidationError as e:
                logger.error(f"Invalid VM data: {e}")

# Include routes
app.include_router(metal.router)
app.include_router(vm.router)
app.include_router(k8s.router)
app.include_router(export.router)
app.include_router(think.router)

@app.get("/")
def root():
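
The startup loader above drops malformed records via Pydantic validation. A stdlib-only sketch of the same guard (a simplified stand-in for the `MetalNode` check, not the code the app runs):

```python
import json

# Required fields, matching the MetalNode model.
REQUIRED = {"name", "location", "vendor", "cpu", "memory", "storage",
            "time_on_duty", "initial_cost"}

def load_valid_nodes(raw: str):
    """Keep only records carrying every MetalNode field; count the rejects."""
    valid, rejected = [], 0
    for node in json.loads(raw):
        if REQUIRED.issubset(node):
            valid.append(node)
        else:
            rejected += 1
    return valid, rejected

good = {"name": "chuo", "location": "Tokyo", "vendor": "The Company",
        "cpu": 24, "memory": "64GB", "storage": "1TB",
        "time_on_duty": 522, "initial_cost": 1900}
bad = {"name": "incomplete"}  # missing most fields, should be rejected
valid, rejected = load_valid_nodes(json.dumps([good, bad]))
print(len(valid), rejected)  # 1 1
```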


@@ -1,6 +1,5 @@
# Here be dragons
fastapi
uvicorn
pyyaml
rich
kubernetes
fastapi==0.99.1
uvicorn==0.23.2
PyYAML==6.0.2
rich==13.9.4
kubernetes==31.0.0


@@ -1,19 +1,131 @@
# Export data in YAML or JSON format
from fastapi import APIRouter
from fastapi import APIRouter, Response
from database import fetch_all
from kubernetes import client, config
from datetime import datetime, timezone
import yaml
import json

router = APIRouter()

def calculate_time_on_duty(creation_timestamp):
    """
    Calculate the time on duty in hours, days, or minutes from the creation timestamp.
    """
    now = datetime.now(timezone.utc)
    delta = now - creation_timestamp
    # If less than an hour, return minutes
    if delta.days < 1 and delta.seconds < 3600:
        minutes = delta.seconds // 60
        return f"{minutes} minutes" if minutes > 1 else "less than a minute"
    # If less than a day, return hours
    if delta.days < 1:
        hours = delta.seconds // 3600
        return f"{hours} hours" if hours > 1 else "1 hour"
    return f"{delta.days} days" if delta.days > 1 else "1 day"

def convert_memory_to_gb(memory):
    """
    Convert memory to GB (gigabytes) for ephemeral-storage.
    """
    if "Ki" in memory:
        return int(memory.replace("Ki", "")) / (1024 ** 2)
    elif "Mi" in memory:
        return int(memory.replace("Mi", "")) / 1024
    elif "Gi" in memory:
        return int(memory.replace("Gi", ""))
    return float(memory)

def convert_memory_to_mib(memory):
    """
    Convert memory to MiB (mebibytes).
    """
    if "Ki" in memory:
        return int(memory.replace("Ki", "")) / 1024
    elif "Mi" in memory:
        return int(memory.replace("Mi", ""))
    elif "Gi" in memory:
        return int(memory.replace("Gi", "")) * 1024
    return float(memory)

def convert_cpu_to_cores(cpu):
    """
    Convert CPU usage to cores for human-readable format.
    Handles units: n (nano), u (micro), m (milli), or none (cores).
    Returns float values for cores, rounded appropriately.
    """
    if "n" in cpu:
        return round(int(cpu.replace("n", "")) / 1e9, 4)
    elif "u" in cpu:
        return round(int(cpu.replace("u", "")) / 1e6, 4)
    elif "m" in cpu:
        return round(int(cpu.replace("m", "")) / 1000, 4)
    return float(cpu)

def fetch_k8s_data_with_usage():
    config.load_incluster_config()
    v1 = client.CoreV1Api()
    metrics_client = client.CustomObjectsApi()
    nodes = []
    for node in v1.list_node().items:
        # Extract storage (ephemeral-storage) and instance type
        ephemeral_storage = node.status.capacity.get("ephemeral-storage", "0")
        instance_type = node.metadata.labels.get("beta.kubernetes.io/instance-type", "N/A")
        creation_timestamp = node.metadata.creation_timestamp
        time_on_duty = calculate_time_on_duty(creation_timestamp) if creation_timestamp else "N/A"
        nodes.append({
            "node_name": node.metadata.name,
            "cpu": node.status.capacity.get("cpu"),
            "memory": round(convert_memory_to_mib(node.status.capacity.get("memory")), 2),  # Convert to MiB
            "storage": f"{round(convert_memory_to_gb(ephemeral_storage), 2)} GB",
            "instance_type": instance_type,
            "pods_allocatable": node.status.allocatable.get("pods"),
            "time_on_duty": time_on_duty,  # Add time on duty
        })
    namespaces = [ns.metadata.name for ns in v1.list_namespace().items]
    namespace_usage = {}
    pod_metrics = metrics_client.list_cluster_custom_object(
        group="metrics.k8s.io", version="v1beta1", plural="pods"
    )
    for pod in pod_metrics["items"]:
        pod_namespace = pod["metadata"]["namespace"]
        if pod_namespace not in namespace_usage:
            namespace_usage[pod_namespace] = {"cpu": 0, "memory": 0}
        for container in pod["containers"]:
            cpu_usage = container["usage"]["cpu"]
            memory_usage = container["usage"]["memory"]
            namespace_usage[pod_namespace]["cpu"] += convert_cpu_to_cores(cpu_usage)
            namespace_usage[pod_namespace]["memory"] += convert_memory_to_mib(memory_usage)
    namespace_usage = {
        ns: {
            "cpu": round(usage["cpu"], 4),
            "memory": round(usage["memory"], 2),
        }
        for ns, usage in namespace_usage.items()
    }
    return {"nodes": nodes, "namespaces": namespaces, "namespace_usage": namespace_usage}

@router.get("/export")
def export_data(format: str = "yaml"):
    data = {
        "metal_nodes": fetch_all("metal_nodes"),
        "virtual_machines": fetch_all("virtual_machines"),
        "kubernetes_nodes": fetch_all("kubernetes_nodes"),
        "kubernetes": fetch_k8s_data_with_usage(),
    }
    # Return data in the requested format
    if format.lower() == "yaml":
        return yaml.safe_dump(data)
    return json.dumps(data, indent=2)
        yaml_data = yaml.safe_dump(data, sort_keys=False)
        return Response(content=yaml_data, media_type="text/yaml")
    return Response(content=json.dumps(data, indent=2), media_type="application/json")
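
The unit helpers above can be exercised on their own; two of them restated verbatim for a quick sanity check:

```python
def convert_cpu_to_cores(cpu):
    """Normalize Kubernetes CPU strings (n/u/m suffixes) to cores."""
    if "n" in cpu:
        return round(int(cpu.replace("n", "")) / 1e9, 4)
    elif "u" in cpu:
        return round(int(cpu.replace("u", "")) / 1e6, 4)
    elif "m" in cpu:
        return round(int(cpu.replace("m", "")) / 1000, 4)
    return float(cpu)

def convert_memory_to_mib(memory):
    """Normalize Kubernetes memory strings (Ki/Mi/Gi) to MiB."""
    if "Ki" in memory:
        return int(memory.replace("Ki", "")) / 1024
    elif "Mi" in memory:
        return int(memory.replace("Mi", ""))
    elif "Gi" in memory:
        return int(memory.replace("Gi", "")) * 1024
    return float(memory)

print(convert_cpu_to_cores("250m"))     # 0.25
print(convert_memory_to_mib("2048Ki"))  # 2.0
print(convert_memory_to_mib("2Gi"))     # 2048
```

Note the suffix detection is substring-based (`"n" in cpu`), which is fine for the quantity strings the metrics API emits but would misfire on unusual inputs.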


@@ -1,33 +1,113 @@
from fastapi import APIRouter
from kubernetes import client, config
from datetime import datetime, timezone

router = APIRouter()

# Initialize Kubernetes client
def init_k8s_client():
    try:
        config.load_incluster_config()
    except:
        config.load_kube_config()  # local testing

def fetch_k8s_data_with_usage():
    config.load_incluster_config()
    v1 = client.CoreV1Api()
    metrics_client = client.CustomObjectsApi()
    nodes = []
    for node in v1.list_node().items:
        # Extract storage (ephemeral-storage) and instance type
        ephemeral_storage = node.status.capacity.get("ephemeral-storage", "0")
        instance_type = node.metadata.labels.get("beta.kubernetes.io/instance-type", "N/A")
        creation_timestamp = node.metadata.creation_timestamp
        if creation_timestamp:
            time_on_duty = calculate_time_on_duty(creation_timestamp)
        else:
            time_on_duty = "N/A"
        nodes.append({
            "node_name": node.metadata.name,
            "cpu": node.status.capacity.get("cpu"),
            "memory": round(convert_memory_to_mib(node.status.capacity.get("memory")), 2),
            "storage": f"{round(convert_memory_to_gb(ephemeral_storage), 2)} GB",
            "instance_type": instance_type,
            "pods_allocatable": node.status.allocatable.get("pods"),
            "time_on_duty": time_on_duty,
        })
    namespaces = [ns.metadata.name for ns in v1.list_namespace().items]
    namespace_usage = {}
    pod_metrics = metrics_client.list_cluster_custom_object(
        group="metrics.k8s.io", version="v1beta1", plural="pods"
    )
    for pod in pod_metrics["items"]:
        pod_namespace = pod["metadata"]["namespace"]
        if pod_namespace not in namespace_usage:
            namespace_usage[pod_namespace] = {"cpu": 0, "memory": 0}
        for container in pod["containers"]:
            cpu_usage = container["usage"]["cpu"]
            memory_usage = container["usage"]["memory"]
            namespace_usage[pod_namespace]["cpu"] += convert_cpu_to_cores(cpu_usage)
            namespace_usage[pod_namespace]["memory"] += convert_memory_to_mib(memory_usage)
    namespace_usage = {
        ns: {
            "cpu": round(usage["cpu"], 4),
            "memory": round(usage["memory"], 2),
        }
        for ns, usage in namespace_usage.items()
    }
    return {"nodes": nodes, "namespaces": namespaces, "namespace_usage": namespace_usage}

def calculate_time_on_duty(creation_timestamp):
    """
    Calculate the time on duty in hours, days, or minutes from the creation timestamp.
    """
    now = datetime.now(timezone.utc)
    delta = now - creation_timestamp
    if delta.days < 1 and delta.seconds < 3600:
        minutes = delta.seconds // 60
        return f"{minutes} minutes" if minutes > 1 else "less than a minute"
    if delta.days < 1:
        hours = delta.seconds // 3600
        return f"{hours} hours" if hours > 1 else "1 hour"
    return f"{delta.days} days" if delta.days > 1 else "1 day"

def convert_memory_to_gb(memory):
    """
    Convert memory to GB (gigabytes) for ephemeral-storage.
    """
    if "Ki" in memory:
        return int(memory.replace("Ki", "")) / (1024 ** 2)
    elif "Mi" in memory:
        return int(memory.replace("Mi", "")) / 1024
    elif "Gi" in memory:
        return int(memory.replace("Gi", ""))
    return float(memory)

def convert_cpu_to_cores(cpu):
    if "n" in cpu:
        return round(int(cpu.replace("n", "")) / 1e9, 4)
    elif "u" in cpu:
        return round(int(cpu.replace("u", "")) / 1e6, 4)
    elif "m" in cpu:
        return round(int(cpu.replace("m", "")) / 1000, 4)
    return float(cpu)

def convert_memory_to_mib(memory):
    if "Ki" in memory:
        return int(memory.replace("Ki", "")) / 1024
    elif "Mi" in memory:
        return int(memory.replace("Mi", ""))
    elif "Gi" in memory:
        return int(memory.replace("Gi", "")) * 1024
    return float(memory)

@router.get("/k8s/data")
def get_k8s_data():
    init_k8s_client()
    v1 = client.CoreV1Api()
    # Fetch nodes
    nodes = v1.list_node()
    node_data = []
    for node in nodes.items:
        node_data.append({
            "node_name": node.metadata.name,
            "cpu": node.status.capacity.get("cpu"),
            "memory": node.status.capacity.get("memory"),
            "pods_allocatable": node.status.allocatable.get("pods"),
        })
    # Fetch namespaces
    namespaces = v1.list_namespace()
    namespace_data = [ns.metadata.name for ns in namespaces.items]
    return {"nodes": node_data, "namespaces": namespace_data}
    return fetch_k8s_data_with_usage()
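
`calculate_time_on_duty` can be checked in isolation; the function below is copied from the route module and driven with synthetic timestamps:

```python
from datetime import datetime, timezone, timedelta

def calculate_time_on_duty(creation_timestamp):
    """Render elapsed time as minutes, hours, or days (copied from k8s.py)."""
    now = datetime.now(timezone.utc)
    delta = now - creation_timestamp
    if delta.days < 1 and delta.seconds < 3600:
        minutes = delta.seconds // 60
        return f"{minutes} minutes" if minutes > 1 else "less than a minute"
    if delta.days < 1:
        hours = delta.seconds // 3600
        return f"{hours} hours" if hours > 1 else "1 hour"
    return f"{delta.days} days" if delta.days > 1 else "1 day"

now = datetime.now(timezone.utc)
print(calculate_time_on_duty(now - timedelta(days=3)))     # 3 days
print(calculate_time_on_duty(now - timedelta(minutes=5)))  # 5 minutes
```

Note one quirk of the `minutes > 1` comparison: an age of exactly one minute is reported as "less than a minute".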

app/routes/think.py Normal file

@@ -0,0 +1,73 @@
from fastapi import APIRouter, HTTPException
from kubernetes import client, config
from typing import List, Dict
import requests
import os

router = APIRouter()

def fetch_ai_summary(cluster_name: str, nodes: List[Dict], api_url: str, auth_token: str) -> str:
    """
    Sends node data to an external AI API endpoint and retrieves a summary response.
    """
    payload = {
        "data": {
            "cluster_name": cluster_name,
            "nodes": nodes
        }
    }
    headers = {
        "Authorization": f"Bearer {auth_token}",
        "Content-Type": "application/json"
    }
    try:
        response = requests.post(api_url, json=payload, headers=headers)
        if response.status_code == 200:
            result = response.json()
            if result.get("success"):
                return result.get("result", "No result provided by the API.")
            else:
                return "API responded with success=false."
        else:
            return f"API request failed with status code {response.status_code}: {response.text}"
    except requests.RequestException as e:
        return f"An error occurred while contacting the API: {str(e)}"

@router.get("/think/k8s")
def think_k8s():
    """
    Fetch Kubernetes data, send it to the AI API, and return the AI-generated summary.
    """
    config.load_incluster_config()
    v1 = client.CoreV1Api()
    cluster_name = "eks-staging"
    nodes = []
    for node in v1.list_node().items:
        # Fetch ephemeral-storage and instance type
        ephemeral_storage = node.status.capacity.get("ephemeral-storage", "0")
        instance_type = node.metadata.labels.get("beta.kubernetes.io/instance-type", "N/A")
        nodes.append({
            "name": node.metadata.name,
            "cpu": f"{node.status.capacity.get('cpu')} cores",
            "memory": f"{round(int(node.status.capacity.get('memory').replace('Ki', '')) / 1024 / 1024, 2)} GB",
            "storage": f"{round(int(ephemeral_storage.replace('Ki', '')) / (1024 ** 2), 2)} GB",
            "type": instance_type,
            "namespaces": [ns.metadata.name for ns in v1.list_namespace().items]
        })
    api_url = os.getenv("AI_API_URL")
    auth_token = os.getenv("AI_API_TOKEN")
    if not api_url or not auth_token:
        raise HTTPException(status_code=500, detail="AI API URL or token is not set. Please set AI_API_URL and AI_API_TOKEN environment variables.")
    # Call AI API
    summary = fetch_ai_summary(cluster_name, nodes, api_url, auth_token)
    return {"summary": summary}
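
The request the route builds can be previewed offline; a sketch assembling the same payload and header shapes with placeholder values (no API is called):

```python
def build_ai_request(cluster_name, nodes, auth_token):
    """Mirror the payload/header shapes used by fetch_ai_summary."""
    payload = {"data": {"cluster_name": cluster_name, "nodes": nodes}}
    headers = {
        "Authorization": f"Bearer {auth_token}",
        "Content-Type": "application/json",
    }
    return payload, headers

payload, headers = build_ai_request(
    "eks-staging",
    [{"name": "node-1", "cpu": "4 cores", "memory": "16.0 GB"}],
    "dummy-token",  # placeholder, not a real credential
)
print(payload["data"]["cluster_name"])  # eks-staging
print(headers["Authorization"])         # Bearer dummy-token
```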


@@ -1,8 +1,13 @@
from fastapi import APIRouter
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel
from database import insert_virtual_machine, fetch_all
import requests
import os
from datetime import datetime, timezone
import logging

router = APIRouter()
logger = logging.getLogger(__name__)

class VirtualMachine(BaseModel):
    name: str
@@ -18,6 +23,10 @@ def get_vm_data():
@router.post("/vm/data")
def add_vm_data(vm: VirtualMachine):
    existing_vms = fetch_all("virtual_machines")
    if any(existing_vm[1] == vm.name for existing_vm in existing_vms):
        return {"message": f"Virtual machine '{vm.name}' already exists."}
    insert_virtual_machine(
        name=vm.name,
        location=vm.location,
@@ -27,3 +36,58 @@ def add_vm_data(vm: VirtualMachine):
        vm_type=vm.type
    )
    return {"message": f"Virtual machine '{vm.name}' added successfully."}

@router.get("/vm/import-hetzner")
def import_hetzner_vms(api_token: str = None):
    """
    Import virtual machines from Hetzner Cloud.
    """
    # Check for Hetzner token
    api_token = api_token or os.getenv("HETZNER_TOKEN")
    if not api_token:
        logger.warning("HETZNER_TOKEN is not set. Unable to fetch Hetzner servers.")
        return {"message": "HETZNER_TOKEN is not set. Please set it to fetch servers from Hetzner."}
    hetzner_url = "https://api.hetzner.cloud/v1/servers"
    headers = {"Authorization": f"Bearer {api_token}"}
    try:
        response = requests.get(hetzner_url, headers=headers)
        response.raise_for_status()
    except requests.RequestException as e:
        raise HTTPException(status_code=500, detail=f"Failed to fetch data from Hetzner API: {e}")
    servers = response.json().get("servers", [])
    if not servers:
        return {"message": "No servers found on Hetzner Cloud."}
    # Fetch existing VMs from the database
    existing_vms = {vm[1] for vm in fetch_all("virtual_machines")}
    imported_count = 0
    for server in servers:
        try:
            name = server["name"]
            if name in existing_vms:
                logger.info(f"Skipping existing virtual machine: {name}")
                continue
            location = server["datacenter"]["location"]["name"]
            cpu = server["server_type"]["cores"]
            memory = f"{server['server_type']['memory']} GB"
            storage = f"{server['server_type']['disk']} GB"
            vm_type = server["server_type"]["name"]
            insert_virtual_machine(
                name=name,
                location=location,
                cpu=cpu,
                memory=memory,
                storage=storage,
                vm_type=vm_type
            )
            imported_count += 1
        except Exception as e:
            logger.error(f"Failed to import server {server['name']}: {e}")
    return {"message": f"Imported {imported_count} new virtual machines from Hetzner."}
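
The field extraction from a Hetzner `/v1/servers` item can be tried against a hand-made record; the dictionary below is an invented stub following the paths the route reads, not real API output:

```python
def map_hetzner_server(server):
    """Pull the same fields the import route reads from a server record."""
    return {
        "name": server["name"],
        "location": server["datacenter"]["location"]["name"],
        "cpu": server["server_type"]["cores"],
        "memory": f"{server['server_type']['memory']} GB",
        "storage": f"{server['server_type']['disk']} GB",
        "vm_type": server["server_type"]["name"],
    }

# Invented stub shaped like one item of the Hetzner servers response.
stub = {
    "name": "cax-41-1",
    "datacenter": {"location": {"name": "hel1"}},
    "server_type": {"name": "cax41", "cores": 16, "memory": 32, "disk": 320},
}
vm = map_hetzner_server(stub)
print(vm["location"], vm["memory"])  # hel1 32 GB
```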


@@ -0,0 +1,9 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
  namespace: metalcheck
data:
  AI_API_URL: ""
  AI_API_TOKEN: ""
  HETZNER_TOKEN: ""


@@ -0,0 +1,30 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metalcheck-backend
  labels:
    app: metalcheck
spec:
  replicas: 1
  selector:
    matchLabels:
      app: metalcheck
  template:
    metadata:
      labels:
        app: metalcheck
    spec:
      serviceAccountName: metalcheck-sa
      containers:
        - name: backend
          image: <your-ecr-repo>:latest
          envFrom:
            - configMapRef:
                name: api-config
          ports:
            - containerPort: 8000
          env:
            - name: KUBERNETES_SERVICE_HOST
              value: "kubernetes.default"
            - name: KUBERNETES_SERVICE_PORT
              value: "443"

examples/k8s/rbac.yaml Normal file

@@ -0,0 +1,49 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metalcheck-sa
  namespace: metalcheck
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metalcheck-clusterrole
rules:
  - apiGroups: [""]
    resources: ["pods", "nodes", "namespaces"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metalcheck-clusterrolebinding
subjects:
  - kind: ServiceAccount
    name: metalcheck-sa
    namespace: metalcheck
roleRef:
  kind: ClusterRole
  name: metalcheck-clusterrole
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metrics-reader
rules:
  - apiGroups: ["metrics.k8s.io"]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-reader-binding
subjects:
  - kind: ServiceAccount
    name: metalcheck-sa
    namespace: metalcheck
roleRef:
  kind: ClusterRole
  name: metrics-reader
  apiGroup: rbac.authorization.k8s.io

helm/Chart.yaml Normal file

@@ -0,0 +1,6 @@
apiVersion: v2
name: metalcheck-backend
description: A Helm chart for deploying the MetalCheck backend application.
type: application
version: 0.1.0
appVersion: "latest"


@@ -0,0 +1,7 @@
{{/*
Expand common labels.
*/}}
{{- define "metalcheck-backend.labels" -}}
app: {{ .Chart.Name }}
release: {{ .Release.Name }}
{{- end }}


@@ -0,0 +1,8 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Release.Name }}-config
  labels:
    app: {{ .Chart.Name }}
data:
  HETZNER_TOKEN: {{ .Values.env.HETZNER_TOKEN | quote }}


@@ -0,0 +1,30 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-backend
  labels:
    app: {{ .Chart.Name }}
    release: {{ .Release.Name }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Chart.Name }}
      release: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app: {{ .Chart.Name }}
        release: {{ .Release.Name }}
    spec:
      containers:
        - name: backend
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - containerPort: 80
          env:
            - name: HETZNER_TOKEN
              value: "{{ .Values.env.HETZNER_TOKEN }}"
          resources:
            {{- toYaml .Values.resources | nindent 12 }}


@@ -0,0 +1,26 @@
{{- if .Values.ingress.enabled }}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ .Release.Name }}-ingress
  annotations:
{{ toYaml .Values.ingress.annotations | indent 4 }}
spec:
  rules:
  {{- range .Values.ingress.hosts }}
    - host: {{ .host }}
      http:
        paths:
        {{- range .paths }}
          - path: {{ .path }}
            pathType: {{ .pathType }}
            backend:
              service:
                name: {{ $.Release.Name }}-service
                port:
                  number: {{ $.Values.service.port }}
        {{- end }}
  {{- end }}
  tls:
{{ toYaml .Values.ingress.tls | indent 4 }}
{{- end }}

helm/templates/rbac.yaml Normal file

@@ -0,0 +1,49 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ .Release.Name }}-sa
  namespace: {{ .Release.Namespace }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: {{ .Release.Name }}-clusterrole
rules:
  - apiGroups: [""]
    resources: ["pods", "nodes", "namespaces"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: {{ .Release.Name }}-clusterrolebinding
subjects:
  - kind: ServiceAccount
    name: {{ .Release.Name }}-sa
    namespace: {{ .Release.Namespace }}
roleRef:
  kind: ClusterRole
  name: {{ .Release.Name }}-clusterrole
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: {{ .Release.Name }}-metrics-reader
rules:
  - apiGroups: ["metrics.k8s.io"]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: {{ .Release.Name }}-metrics-reader-binding
subjects:
  - kind: ServiceAccount
    name: {{ .Release.Name }}-sa
    namespace: {{ .Release.Namespace }}
roleRef:
  kind: ClusterRole
  name: {{ .Release.Name }}-metrics-reader
  apiGroup: rbac.authorization.k8s.io


@@ -0,0 +1,14 @@
apiVersion: v1
kind: Service
metadata:
  name: {{ .Release.Name }}-service
  labels:
    app: {{ .Chart.Name }}
spec:
  type: {{ .Values.service.type }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: 80
  selector:
    app: {{ .Chart.Name }}
    release: {{ .Release.Name }}

helm/values.yaml Normal file

@@ -0,0 +1,35 @@
replicaCount: 2
image:
  repository: ghcr.io/terghalin/metalcheck-backend
  tag: latest
  pullPolicy: IfNotPresent
service:
  type: LoadBalancer
  port: 80
ingress:
  enabled: false
  annotations: {}
  hosts:
    - host: metalcheck.local
      paths:
        - path: /
          pathType: ImplementationSpecific
  tls: []
resources:
  limits:
    cpu: "500m"
    memory: "512Mi"
  requests:
    cpu: "250m"
    memory: "256Mi"
env:
  HETZNER_TOKEN: ""
nodeSelector: {}
tolerations: []
affinity: {}