The Backbone: Building the FastAPI Brain

Before SentinelAI could discover failing pods, score cluster health, or hand back an AI-generated root cause, it needed something far less glamorous: a backend that could simply respond when asked a question. Today's post is about that backbone — the FastAPI service that everything else in this project sits on top of.

I picked FastAPI for a few practical reasons, not just because it's trendy. It's built for Python, which means the same language could carry me from the API layer all the way to the AI layer later on without switching ecosystems. It validates incoming and outgoing data automatically through Pydantic, which catches a whole category of bugs before they ever reach a user. And it generates interactive API documentation for free, just from the code you write — which matters more than it sounds, because anyone reviewing this project, including a non-technical recruiter, can open the docs page and actually see what the system does without reading a line of code.

But the real decision I want to talk about isn't the framework. It's the five endpoints I started with, and the order I thought about them in.

/health answers one question only: is the process alive? It returns the app name, version, and environment, and nothing else. This sounds almost too simple to mention, but Kubernetes liveness probes call exactly this kind of endpoint to decide whether to restart a container, so getting it right early mattered more than it looked.
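As a rough, framework-free sketch of that payload (not the project's actual code): the class name `HealthResponse`, the function name `health`, and the placeholder values are all assumptions for illustration. In a FastAPI app, a function like this would typically be registered with `@app.get("/health")`.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class HealthResponse:
    # The three fields the endpoint is described as returning, and nothing else.
    app_name: str
    version: str
    environment: str


def health() -> dict:
    """Liveness check: no database calls, no downstream checks, just 'I am running'."""
    # Placeholder values (assumed); real ones would come from the app's settings.
    response = HealthResponse(
        app_name="SentinelAI",
        version="0.1.0",
        environment="development",
    )
    return asdict(response)


print(health())
```

Keeping the handler free of any I/O is the point: a liveness check that depends on a database can fail for reasons that a container restart will not fix.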

/status goes one layer deeper. It tells you the service is not just alive, but actually running properly, including how long it's been up. The difference between health and status trips up a lot of beginners, so here's the distinction I used: health is "are you breathing," status is "are you doing your job." A service can be alive and still be stuck, and these are two different failure modes that deserve two different answers.
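One way to picture the "alive but stuck" distinction is a status function that reports uptime and also checks when the service last made progress. This is a sketch under assumptions: the `last_progress_at` heartbeat, the 30-second threshold, and the `"running"`/`"stuck"` labels are hypothetical and not taken from SentinelAI itself.

```python
import time
from typing import Optional

# Recorded once when the module loads; monotonic time ignores wall-clock changes.
STARTED_AT = time.monotonic()


def status(
    last_progress_at: float,
    now: Optional[float] = None,
    stale_after: float = 30.0,
) -> dict:
    """Report uptime plus whether the service has done useful work recently."""
    now = time.monotonic() if now is None else now
    # Hypothetical rule: no progress for `stale_after` seconds counts as "stuck".
    stuck = (now - last_progress_at) > stale_after
    return {
        "status": "stuck" if stuck else "running",
        "uptime_seconds": round(now - STARTED_AT, 3),
    }


# A service that made progress just now reports "running".
print(status(last_progress_at=time.monotonic()))
```

Passing `now` explicitly makes the function easy to test without sleeping, while production code can leave it out and use the real clock.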

/metrics is the one I find most interesting in hindsight. From the very first version of this backend, this endpoint already spoke Prometheus's native text exposition format: counters and gauges as plain text lines, not JSON. At the time, I didn't have Prometheus or Grafana running anywhere yet; that whole observability stack was still a phase away. But I built this endpoint to already be fluent in the language that future tooling would speak. That small decision paid for itself later, because by the time I actually wired up the monitoring stack (which I'll cover properly in a later post), there was nothing to retrofit. It just plugged in.
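For readers who have not seen the Prometheus text format, here is a minimal hand-rolled renderer for counters and gauges. The metric names and values are made up for illustration, and a real service would more likely use the `prometheus_client` library than build the strings by hand.

```python
from typing import Iterable, Tuple

# (metric name, metric type, help text, current value)
Metric = Tuple[str, str, str, float]


def render_metrics(metrics: Iterable[Metric]) -> str:
    """Render metrics as Prometheus text exposition lines."""
    lines = []
    for name, kind, help_text, value in metrics:
        if kind not in ("counter", "gauge"):
            raise ValueError(f"unsupported metric type: {kind}")
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {kind}")
        lines.append(f"{name} {value}")
    # The format is line-based, so the body ends with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical metric names and values.
body = render_metrics([
    ("sentinelai_http_requests_total", "counter", "Total HTTP requests served.", 42),
    ("sentinelai_uptime_seconds", "gauge", "Seconds since the process started.", 12.5),
])
print(body)
```

Each metric becomes a `# HELP` line, a `# TYPE` line, and a sample line. Because a Prometheus scraper parses exactly this layout, an endpoint that emits it can be scraped later without changes.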

/alerts and /recommendation are where things get honest. In the very first version of this backend, these two endpoints didn't have any real intelligence behind them yet. They returned a defined shape — an alert with an ID, a severity, a message, a source, a timestamp; a recommendation with a confidence level and a generated-at timestamp — but the logic that filled those shapes with real, meaningful answers came much later, after the anomaly detector and the AI RCA engine existed. I built the question before I built the answer.
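To make "a defined shape with no intelligence behind it" concrete, here is a sketch using standard-library dataclasses as a stand-in for the Pydantic models a FastAPI service would normally declare. The allowed severity values, the `text` field, and the 0.0 to 1.0 confidence scale are assumptions; only the field lists come from the description above.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

SEVERITIES = ("info", "warning", "critical")  # assumed set of levels


@dataclass(frozen=True)
class Alert:
    id: str
    severity: str
    message: str
    source: str
    timestamp: datetime

    def __post_init__(self) -> None:
        # Reject values outside the agreed contract as early as possible.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")


@dataclass(frozen=True)
class Recommendation:
    text: str            # assumed field: the advice itself
    confidence: float    # assumed to be a score between 0.0 and 1.0
    generated_at: datetime

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


def placeholder_recommendation() -> Recommendation:
    """Stub for the period before any AI engine exists: right shape, no logic."""
    return Recommendation(
        text="No recommendation available yet.",
        confidence=0.0,
        generated_at=datetime.now(timezone.utc),
    )


print(asdict(placeholder_recommendation()))
```

A stub like `placeholder_recommendation` lets other components be written against the final shape, and swapping in a real engine later changes only the function body, not the contract.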

That might sound backwards, but it's actually a discipline I'd recommend to anyone building a system with eventual "smart" behaviour. Decide what shape the answer will take before you decide how you're going to generate it. If I'd waited until the AI engine existed to design the recommendation response, I would've been reshaping the entire API contract midway through the project, which tends to break every component depending on it. Designing the contract first meant the frontend, the alert service, and eventually the AI engine could all be built against a stable target instead of a moving one.

Underneath all five endpoints sits a layer most tutorials skip past: a typed settings system and structured logging. Both are small, but worth explaining for anyone reading this from outside engineering. The settings live in one place, with sensible defaults, and get overridden through environment variables rather than hardcoded values scattered through the code. This is the exact mechanism that lets the same codebase behave differently in development, staging, and production without a single line being rewritten; you'll see this idea come back when I cover the Kubernetes environments in a couple of days. And the logging isn't just print() statements. Every log line carries a timestamp, a severity level, the component it came from, and the message itself, written to standard output in a consistent format. It sounds minor until you're debugging a production issue at midnight and the only thing standing between you and the answer is whether your logs are structured or just noise.
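Both ideas fit in a few lines of standard-library Python. This is a sketch, not the project's code: the environment variable names (`APP_NAME`, `APP_ENV`, `LOG_LEVEL`), the defaults, and the `sentinelai` logger name are assumptions. In a FastAPI project the settings part is commonly written with Pydantic's `BaseSettings` instead of a dataclass.

```python
import logging
import os
import sys
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    # Defaults live in one place; environment variables override them.
    app_name: str = "SentinelAI"
    environment: str = "development"
    log_level: str = "INFO"

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "Settings":
        # Hypothetical variable names; unset variables fall back to the defaults.
        return cls(
            app_name=env.get("APP_NAME", cls.app_name),
            environment=env.get("APP_ENV", cls.environment),
            log_level=env.get("LOG_LEVEL", cls.log_level).upper(),
        )


def configure_logging(settings: Settings) -> logging.Logger:
    """Send every log line to stdout as: timestamp | level | component | message."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(logging.Formatter(
        fmt="%(asctime)s | %(levelname)s | %(name)s | %(message)s",
        datefmt="%Y-%m-%dT%H:%M:%S%z",
    ))
    logger = logging.getLogger("sentinelai")
    logger.handlers = [handler]   # replace handlers so repeated calls don't duplicate lines
    logger.setLevel(settings.log_level)
    logger.propagate = False      # avoid double output through the root logger
    return logger


# Same code, different behaviour: only the environment mapping changes.
settings = Settings.from_env({"APP_ENV": "staging"})
configure_logging(settings)
# Child loggers name the component and inherit the handler and level.
logging.getLogger("sentinelai.api").info("service started in %s", settings.environment)
```

The component name comes from the logger name itself, so each module can call `logging.getLogger("sentinelai.<component>")` and get the shared format without extra configuration.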

If there's a theme to Day 2, it's this: before a system can be intelligent, it has to be honest about its own shape. I didn't build SentinelAI's brain on day one. I built the skeleton it would eventually hang off of — five clear contracts, validated data, consistent logs, and a configuration system that wouldn't need to be torn up later. Everything I built afterwards, including the AI engine I'm most proud of, only worked because this layer didn't have to change underneath it.

Tomorrow, I'll cover the part that turns this Python service into something that can actually run anywhere: Docker, and the slightly embarrassing story of how my first Dockerfile wasn't nearly as secure as I thought it was.