Building and Securing an API for LLM Access with FastAPI and Ollama



This lab demonstrates how to build a secure API with Python, FastAPI, and Ollama that controls access to a Large Language Model (LLM) running locally.

Objective

In this lab, you will:

  • Build a secure API using FastAPI that interfaces with a locally run LLM via Ollama.
  • Implement API key-based authentication to control access.
  • Understand the importance of securing access to AI models.
  • Test the API using tools like Postman or curl.
  • Optionally, deploy the API to a cloud platform using Docker and set up a CI/CD pipeline.

Prerequisites

Before starting, ensure you have:

  • Basic Python programming knowledge: Familiarity with Python syntax and libraries.
  • Command-line proficiency: Ability to navigate and execute commands in a terminal.
  • Understanding of HTTP methods and APIs: Knowledge of GET, POST requests, and API concepts.
  • System requirements: A computer with sufficient hardware to run an LLM locally (e.g., 8GB+ RAM, depending on the model).

Lab Setup

Step 1: Install Python and pip

  • Ensure Python 3.x and pip are installed on your system.
  • Verify installation:
    python3 --version
    pip3 --version
  • If not installed, download from python.org.

Step 2: Install Required Libraries

  • Create a file named requirements.txt with the following content:
    fastapi
    uvicorn
    ollama
    python-dotenv
    requests
  • Install the dependencies:
    pip3 install -r requirements.txt
    • fastapi: Framework for building the API.
    • uvicorn: ASGI server to run the FastAPI app.
    • ollama: Library to interface with the Ollama LLM.
    • python-dotenv: For loading environment variables.
    • requests: For testing the API from Python.

Step 3: Set Up Ollama

  • Download Ollama: Visit ollama.com/download and install it for your OS.
  • Pull an LLM model: Open a terminal and run:
    ollama pull mistral
    This downloads the Mistral model (a lightweight, open-source LLM).
  • Verify Ollama: Test it by running:
    ollama run mistral
    Type a prompt (e.g., “Hello World”) and confirm you get a response. Exit with /bye.
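
As an optional extra check, you can call the same model from Python using the ollama library installed in Step 2; a minimal sketch, assuming the Ollama server is running locally with the mistral model pulled:

    import ollama

    # Send a single-turn chat request to the locally pulled Mistral model
    response = ollama.chat(
        model="mistral",
        messages=[{"role": "user", "content": "Hello World"}],
    )
    print(response["message"]["content"])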

Lab Steps

Step 1: Create a Simple FastAPI Application

  1. Create a Project Directory:

    mkdir llm-api-lab
    cd llm-api-lab
  2. Initialize Version Control (DevOps Practice):

    git init
  3. Write the API Code:

    • Create a file named main.py:

      from fastapi import FastAPI
      import ollama
      
      app = FastAPI()
      
      @app.post("/generate")
      def generate(prompt: str):
          # Forward the user's prompt to the local Mistral model via Ollama
          response = ollama.chat(model="mistral", messages=[{"role": "user", "content": prompt}])
          return {"response": response["message"]["content"]}
    • This defines a POST endpoint /generate that accepts a prompt query parameter and returns the LLM's response (a JSON-body variant is sketched after this list).

  4. Run the Application:

    uvicorn main:app --reload
    • main:app refers to the app object in main.py.
    • --reload enables auto-reloading for development.
  5. Commit Your Work:

    git add main.py requirements.txt
    git commit -m "Initial FastAPI app with /generate endpoint"

Step 2: Test the Unsecured API

  1. Using curl:

    curl -X POST "http://localhost:8000/generate?prompt=Hello%20World"

    You should see a JSON response of the form {"response": "..."} containing the model's reply (a scripted variant using the requests library is shown after this list).

  2. Using Postman (Alternative):

    • Download Postman from postman.com.
    • Create a new request:
      • Method: POST
      • URL: http://localhost:8000/generate?prompt=Hello World
      • Click “Send” and verify the response.
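
You can also script the same request with the requests library from requirements.txt; a minimal sketch, assuming the server is still running on localhost:8000:

    import requests

    # The prompt travels as a query parameter, matching the curl example above
    resp = requests.post(
        "http://localhost:8000/generate",
        params={"prompt": "Hello World"},
    )
    print(resp.status_code)  # expect 200
    print(resp.json())       # expect {"response": "..."}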

Step 3: Secure the API with an API Key

  1. Update main.py for Security:

    from fastapi import FastAPI, Depends, HTTPException, Header
    import ollama
    import os
    from dotenv import load_dotenv
    
    app = FastAPI()
    
    # Load environment variables from .env
    load_dotenv()
    # Build the set of valid keys, dropping None so an unset API_KEY
    # env var can never match a request that omits the header
    API_KEYS = {key for key in (os.getenv("API_KEY"),) if key}
    
    # Dependency that checks the x-api-key request header
    def verify_api_key(x_api_key: str = Header(None)):
        if not x_api_key or x_api_key not in API_KEYS:
            raise HTTPException(status_code=401, detail="Invalid API Key")
        return x_api_key
    
    @app.post("/generate")
    def generate(prompt: str, api_key: str = Depends(verify_api_key)):
        response = ollama.chat(model="mistral", messages=[{"role": "user", "content": prompt}])
        return {"response": response["message"]["content"]}
  2. Create a .env File:

    • In the project directory, create .env:
      API_KEY=your_secret_key
    • Replace your_secret_key with a key of your own. Avoid guessable values like mysecret123; a quick way to generate a strong key is shown after this list.
  3. Add .env to .gitignore (DevOps Practice):

    • Create .gitignore:
      .env
    • Commit changes:
      git add main.py .gitignore
      git commit -m "Added API key authentication"
  4. Restart the Server:

    uvicorn main:app --reload
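
As promised in step 2 above, a quick way to generate a strong random key using only Python's standard library:

    python3 -c "import secrets; print(secrets.token_urlsafe(32))"

Paste the printed value into .env as the API_KEY.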

Step 4: Test the Secured API

  1. Test with Correct API Key:

    • Using curl:
      curl -X POST "http://localhost:8000/generate?prompt=Hello%20World" -H "x-api-key: your_secret_key"
      Replace your_secret_key with the value from .env. You should get a valid response.
  2. Test with Incorrect/No API Key:

    • Without header:
      curl -X POST "http://localhost:8000/generate?prompt=Hello%20World"
      Expect a 401 Unauthorized error: {"detail": "Invalid API Key"}.
    • With wrong key:
      curl -X POST "http://localhost:8000/generate?prompt=Hello%20World" -H "x-api-key: wrongkey"
      Same error should appear.
  3. Using Postman:

    • Add a header: Key = x-api-key, Value = your_secret_key.
    • Send the request and verify success.
    • Remove the header or use an incorrect key and confirm the 401 error.
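
To automate the checks above, a minimal sketch using requests; replace your_secret_key with the value from your .env:

    import requests

    URL = "http://localhost:8000/generate"
    PARAMS = {"prompt": "Hello World"}

    # With the correct key (replace with the value from .env): expect 200
    ok = requests.post(URL, params=PARAMS, headers={"x-api-key": "your_secret_key"})
    assert ok.status_code == 200, ok.text

    # Without the header: the dependency should reject the request with 401
    denied = requests.post(URL, params=PARAMS)
    assert denied.status_code == 401, denied.text
    print("Both checks passed.")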

Step 5: Implement a Credit System (Optional)

  1. Modify main.py for Credits:

    from fastapi import FastAPI, Depends, HTTPException, Header
    import ollama
    import os
    from dotenv import load_dotenv
    
    app = FastAPI()
    load_dotenv()
    
    # Dictionary to track credits per API key
    API_KEY_CREDITS = {os.getenv("API_KEY"): 5}  # 5 credits initially
    
    def verify_api_key(x_api_key: str = Header(None)):
        # Treat a missing header as zero credits so an unset API_KEY
        # env var (a None dictionary key) can never be matched
        credits = API_KEY_CREDITS.get(x_api_key, 0) if x_api_key else 0
        if credits <= 0:
            raise HTTPException(status_code=401, detail="Invalid API Key or No Credits")
        return x_api_key
    
    @app.post("/generate")
    def generate(prompt: str, api_key: str = Depends(verify_api_key)):
        # Deduct a credit
        API_KEY_CREDITS[api_key] -= 1
        response = ollama.chat(model="mistral", messages=[{"role": "user", "content": prompt}])
        return {"response": response["message"]["content"]}
  2. Test the Credit System:

    • Send the POST request with the correct API key 5 times (e.g., using curl or Postman).
    • On the 6th attempt, you should receive a 401 error due to no remaining credits.
    • Restart the server to reset credits; the credit dictionary lives in memory only (a persistent alternative is sketched after this list).
  3. Commit Changes:

    git add main.py
    git commit -m "Added credit system to API"

Step 6: Deploy the API to the Cloud (Advanced Optional)

  1. Containerize with Docker:

    • Create a Dockerfile:
      FROM python:3.9-slim
      WORKDIR /app
      COPY requirements.txt .
      RUN pip install --no-cache-dir -r requirements.txt
      COPY . .
      CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
    • Build and test locally (see the networking note after this list):
      docker build -t llm-api .
      docker run -p 8000:8000 --env-file .env llm-api
  2. Deploy to AWS ECS (Example):

    • Push the Docker image to a registry (e.g., AWS ECR):
      aws ecr create-repository --repository-name llm-api
      docker tag llm-api:latest <your-ecr-uri>:latest
      docker push <your-ecr-uri>:latest
    • Deploy using AWS ECS (Fargate) via the AWS console or CLI.
  3. Set Up CI/CD with GitHub Actions:

    • Create .github/workflows/deploy.yml:
      name: Deploy to AWS ECS
      on:
        push:
          branches: [main]
      jobs:
        build-and-deploy:
          runs-on: ubuntu-latest
          steps:
            - uses: actions/checkout@v3
            - name: Configure AWS credentials
              uses: aws-actions/configure-aws-credentials@v4
              with:
                aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
                aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
                aws-region: us-east-1  # example region; use your own
            - name: Login to Amazon ECR
              uses: aws-actions/amazon-ecr-login@v2
            - name: Build and Push Docker Image
              run: |
                docker build -t llm-api .
                docker tag llm-api:latest <your-ecr-uri>:latest
                docker push <your-ecr-uri>:latest
    • Store AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in GitHub Secrets; without the credential and ECR login steps above, the push to ECR would fail unauthenticated.
  4. Commit Docker and CI/CD Files:

    git add Dockerfile .github/workflows/deploy.yml
    git commit -m "Added Docker and CI/CD configuration"

Deliverables

Submit the following:

  1. Working FastAPI Application: The main.py file with all implemented features.
  2. API Documentation: A brief document (e.g., Markdown) describing:
    • Endpoint: POST /generate
    • Parameters: prompt (string), x-api-key (header)
    • Example request/response
  3. Security Report: A 1-page explanation of:
    • Why securing API access to AI models is critical.
    • How API key authentication and credits achieve this in your implementation.

Assessment Criteria

  • Functionality: API works correctly with unsecured and secured endpoints.
  • Security: API key authentication is properly implemented and enforced.
  • Understanding: Security report demonstrates clear comprehension of concepts.
  • Code Quality: Code is well-structured, commented, and version-controlled.
  • Optional: Successful cloud deployment and CI/CD setup (if attempted).

Conclusion

This lab provides practical experience in building and securing an API for LLM access, integrating DevOps practices like version control and optional cloud deployment. You’ve learned to use FastAPI, Ollama, and security mechanisms, preparing you for real-world DevOps and Cloud challenges in AI-driven applications. For further exploration, consider enhancing the credit system with a persistent database or integrating more advanced authentication methods like OAuth.


By Wahid Hamdi