Building and Securing an API for LLM Access with FastAPI and Ollama
This lab demonstrates how to create a secure API using Python, FastAPI, and Ollama to control access to a Large Language Model (LLM) running locally.
Objective
In this lab, you will:
- Build a secure API using FastAPI that interfaces with a locally run LLM via Ollama.
- Implement API key-based authentication to control access.
- Understand the importance of securing access to AI models.
- Test the API using tools like Postman or curl.
- Optionally, deploy the API to a cloud platform using Docker and set up a CI/CD pipeline.
Prerequisites
Before starting, ensure you have:
- Basic Python programming knowledge: Familiarity with Python syntax and libraries.
- Command-line proficiency: Ability to navigate and execute commands in a terminal.
- Understanding of HTTP methods and APIs: Knowledge of GET, POST requests, and API concepts.
- System requirements: A computer with sufficient hardware to run an LLM locally (e.g., 8GB+ RAM, depending on the model).
Lab Setup
Step 1: Install Python and pip
- Ensure Python 3.x and pip are installed on your system.
- Verify installation:
```
python3 --version
pip3 --version
```
- If not installed, download from python.org.
Step 2: Install Required Libraries
- Create a file named `requirements.txt` with the following content:
```
fastapi
uvicorn
ollama
python-dotenv
requests
```
- Install the dependencies:
```
pip3 install -r requirements.txt
```
- `fastapi`: Framework for building the API.
- `uvicorn`: ASGI server to run the FastAPI app.
- `ollama`: Python client for the Ollama LLM server.
- `python-dotenv`: For loading environment variables.
- `requests`: For testing the API from Python.
Step 3: Set Up Ollama
- Download Ollama: Visit ollama.com/download and install it for your OS.
- Pull an LLM model: Open a terminal and run:
```
ollama pull mistral
```
This downloads the Mistral model (a lightweight, open-source LLM).
- Verify Ollama: Test it by running:
```
ollama run mistral
```
Type a prompt (e.g., "Hello World") and confirm you get a response. Exit with `/bye`.
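Before wiring the model into an API, you can optionally confirm that the `ollama` Python library installed in Step 2 can reach the model. A minimal sketch, assuming Ollama is running locally with the `mistral` model pulled:
```python
# quick_check.py - verify the ollama Python client can reach the local model
import ollama

# Send a single chat message to the locally served Mistral model
response = ollama.chat(model="mistral", messages=[{"role": "user", "content": "Hello"}])

# The reply text lives under response["message"]["content"]
print(response["message"]["content"])
```
If this prints a greeting, both the Ollama server and the Python client are working.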
Lab Steps
Step 1: Create a Simple FastAPI Application
- Create a Project Directory:
```
mkdir llm-api-lab
cd llm-api-lab
```
- Initialize Version Control (DevOps Practice):
```
git init
```
- Write the API Code:
  - Create a file named `main.py`:
```python
from fastapi import FastAPI
import ollama

app = FastAPI()

@app.post("/generate")
def generate(prompt: str):
    response = ollama.chat(model="mistral", messages=[{"role": "user", "content": prompt}])
    return {"response": response["message"]["content"]}
```
  - This defines a POST endpoint `/generate` that accepts a `prompt` parameter and returns an LLM response.
- Run the Application:
```
uvicorn main:app --reload
```
`main:app` refers to the `app` object in `main.py`. `--reload` enables auto-reloading for development.
- Commit Your Work:
```
git add main.py requirements.txt
git commit -m "Initial FastAPI app with /generate endpoint"
```
Step 2: Test the Unsecured API
- Using curl:
```
curl -X POST "http://localhost:8000/generate?prompt=Hello%20World"
```
You should see a JSON response like `{"response": "Hello World! ..."}`.
- Using Postman (Alternative):
  - Download Postman from postman.com.
  - Create a new request:
    - Method: POST
    - URL: `http://localhost:8000/generate?prompt=Hello World`
  - Click "Send" and verify the response.
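You can also script this check with the `requests` library from `requirements.txt`, which is included for testing the API from Python. A minimal sketch, assuming the server is running on port 8000:
```python
# test_generate.py - call the unsecured /generate endpoint from Python
import requests

# The prompt is passed as a query parameter, matching the curl example above
resp = requests.post(
    "http://localhost:8000/generate",
    params={"prompt": "Hello World"},
)

print(resp.status_code)         # expect 200
print(resp.json()["response"])  # the LLM's reply text
```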
Step 3: Secure the API with an API Key
- Update `main.py` for Security:
```python
from fastapi import FastAPI, Depends, HTTPException, Header
import ollama
import os
from dotenv import load_dotenv

app = FastAPI()

# Load environment variables
load_dotenv()
API_KEYS = {os.getenv("API_KEY")}

# Dependency to verify API key
def verify_api_key(x_api_key: str = Header(None)):
    if x_api_key not in API_KEYS:
        raise HTTPException(status_code=401, detail="Invalid API Key")
    return x_api_key

@app.post("/generate")
def generate(prompt: str, api_key: str = Depends(verify_api_key)):
    response = ollama.chat(model="mistral", messages=[{"role": "user", "content": prompt}])
    return {"response": response["message"]["content"]}
```
(An optional hardening sketch for this key check follows at the end of this step.)
- Create a `.env` File:
  - In the project directory, create `.env`:
```
API_KEY=your_secret_key
```
  - Replace `your_secret_key` with a secure key (e.g., `mysecret123`).
- Add `.env` to `.gitignore` (DevOps Practice):
  - Create `.gitignore`:
```
.env
```
  - Commit changes:
```
git add main.py .gitignore
git commit -m "Added API key authentication"
```
- Restart the Server:
```
uvicorn main:app --reload
```
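As noted above, one optional hardening step: the membership check `x_api_key not in API_KEYS` can, in principle, leak timing information about how much of the key matched. A minimal sketch of a constant-time variant using Python's standard `secrets` module; this is an enhancement beyond the lab's required code:
```python
# Optional hardening: constant-time comparison of the API key
import os
import secrets

from fastapi import Header, HTTPException

def verify_api_key(x_api_key: str = Header(None)):
    expected = os.getenv("API_KEY") or ""
    # secrets.compare_digest takes time independent of where the inputs
    # first differ, so a client cannot probe the key byte by byte
    if x_api_key is None or not secrets.compare_digest(
        x_api_key.encode(), expected.encode()
    ):
        raise HTTPException(status_code=401, detail="Invalid API Key")
    return x_api_key
```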
Step 4: Test the Secured API
- Test with Correct API Key:
  - Using curl:
```
curl -X POST "http://localhost:8000/generate?prompt=Hello%20World" -H "x-api-key: your_secret_key"
```
Replace `your_secret_key` with the value from `.env`. You should get a valid response.
- Test with Incorrect/No API Key:
  - Without header:
```
curl -X POST "http://localhost:8000/generate?prompt=Hello%20World"
```
Expect a `401 Unauthorized` error: `{"detail": "Invalid API Key"}`.
  - With wrong key:
```
curl -X POST "http://localhost:8000/generate?prompt=Hello%20World" -H "x-api-key: wrongkey"
```
The same error should appear.
- Using Postman:
  - Add a header: Key = `x-api-key`, Value = `your_secret_key`.
  - Send the request and verify success.
  - Remove the header or use an incorrect key and confirm the 401 error.
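The same checks can be scripted with `requests`. A minimal sketch, assuming the key in `.env` is `mysecret123`:
```python
# test_secured.py - exercise the secured endpoint with and without a key
import requests

URL = "http://localhost:8000/generate"
PARAMS = {"prompt": "Hello World"}

# With the correct key (replace with the value from your .env)
ok = requests.post(URL, params=PARAMS, headers={"x-api-key": "mysecret123"})
print(ok.status_code, ok.json())  # expect 200 and a response body

# Without a key - the verify_api_key dependency should reject this
denied = requests.post(URL, params=PARAMS)
print(denied.status_code, denied.json())  # expect 401 {"detail": "Invalid API Key"}
```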
Step 5: Implement a Credit System (Optional)
- Modify `main.py` for Credits:
```python
from fastapi import FastAPI, Depends, HTTPException, Header
import ollama
import os
from dotenv import load_dotenv

app = FastAPI()
load_dotenv()

# Dictionary to track credits per API key
API_KEY_CREDITS = {os.getenv("API_KEY"): 5}  # 5 credits initially

def verify_api_key(x_api_key: str = Header(None)):
    credits = API_KEY_CREDITS.get(x_api_key, 0)
    if credits <= 0:
        raise HTTPException(status_code=401, detail="Invalid API Key or No Credits")
    return x_api_key

@app.post("/generate")
def generate(prompt: str, api_key: str = Depends(verify_api_key)):
    # Deduct a credit
    API_KEY_CREDITS[api_key] -= 1
    response = ollama.chat(model="mistral", messages=[{"role": "user", "content": prompt}])
    return {"response": response["message"]["content"]}
```
- Test the Credit System:
  - Send the POST request with the correct API key 5 times (e.g., using curl or Postman).
  - On the 6th attempt, you should receive a `401` error due to no remaining credits.
  - Restart the server to reset credits (since this is an in-memory implementation).
- Commit Changes:
```
git add main.py
git commit -m "Added credit system to API"
```
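Because the credit dictionary lives in memory, balances reset on every restart. The conclusion suggests a persistent database as an extension; a minimal sketch using Python's built-in `sqlite3`, with a hypothetical `credits.db` file and table layout of my own choosing:
```python
# credit_store.py - persistent credits sketch (hypothetical schema)
import sqlite3

# check_same_thread=False because FastAPI may handle requests on worker threads
conn = sqlite3.connect("credits.db", check_same_thread=False)
conn.execute(
    "CREATE TABLE IF NOT EXISTS credits (api_key TEXT PRIMARY KEY, remaining INTEGER)"
)

def get_credits(api_key: str) -> int:
    row = conn.execute(
        "SELECT remaining FROM credits WHERE api_key = ?", (api_key,)
    ).fetchone()
    return row[0] if row else 0

def deduct_credit(api_key: str) -> None:
    with conn:  # commits the update on success
        conn.execute(
            "UPDATE credits SET remaining = remaining - 1 WHERE api_key = ?",
            (api_key,),
        )
```
`verify_api_key` and `generate` would then call `get_credits` and `deduct_credit` instead of touching the in-memory dictionary, so balances survive restarts.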
Step 6: Deploy the API to the Cloud (Advanced Optional)
- Containerize with Docker:
  - Create a `Dockerfile`:
```
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```
  - Build and test locally:
```
docker build -t llm-api .
docker run -p 8000:8000 --env-file .env llm-api
```
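Note that inside the container, `localhost` no longer refers to the host machine, so the API cannot reach an Ollama server running on the host by default. One approach, assuming Docker's `host.docker.internal` alias is available, is to construct an explicit `ollama.Client` whose host comes from the environment; the `OLLAMA_HOST` variable name below is an illustrative choice, not part of the original lab:
```python
# In main.py, route chat calls through an explicit client so the Ollama
# host can be overridden at `docker run` time, e.g.:
#   docker run -e OLLAMA_HOST=http://host.docker.internal:11434 ...
import os
import ollama

client = ollama.Client(host=os.getenv("OLLAMA_HOST", "http://localhost:11434"))

def ask(prompt: str) -> str:
    response = client.chat(model="mistral", messages=[{"role": "user", "content": prompt}])
    return response["message"]["content"]
```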
- Deploy to AWS ECS (Example):
  - Push the Docker image to a registry (e.g., AWS ECR):
```
aws ecr create-repository --repository-name llm-api
aws ecr get-login-password | docker login --username AWS --password-stdin <your-ecr-uri>
docker tag llm-api:latest <your-ecr-uri>:latest
docker push <your-ecr-uri>:latest
```
  - Deploy using AWS ECS (Fargate) via the AWS console or CLI.
- Set Up CI/CD with GitHub Actions:
  - Create `.github/workflows/deploy.yml`:
```yaml
name: Deploy to AWS ECS
on:
  push:
    branches: [main]
jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build and Push Docker Image
        run: |
          docker build -t llm-api .
          docker tag llm-api:latest <your-ecr-uri>:latest
          docker push <your-ecr-uri>:latest
```
  - Configure AWS credentials in GitHub Secrets.
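As written, the workflow runner has no AWS credentials, so the push to ECR would fail. A sketch of the steps commonly added before the build step, assuming the official `aws-actions` actions and repository secrets named `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` (the secret names and region are illustrative):
```yaml
      # Authenticate to AWS using repository secrets (hypothetical secret names)
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1  # example region

      # Log Docker in to ECR so the push step can succeed
      - name: Login to Amazon ECR
        uses: aws-actions/amazon-ecr-login@v1
```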
- Commit Docker and CI/CD Files:
```
git add Dockerfile .github/workflows/deploy.yml
git commit -m "Added Docker and CI/CD configuration"
```
Deliverables
Submit the following:
- Working FastAPI Application: The `main.py` file with all implemented features.
- API Documentation: A brief document (e.g., Markdown) describing:
  - Endpoint: `POST /generate`
  - Parameters: `prompt` (string), `x-api-key` (header)
  - Example request/response
- Security Report: A 1-page explanation of:
- Why securing API access to AI models is critical.
- How API key authentication and credits achieve this in your implementation.
Assessment Criteria
- Functionality: API works correctly with unsecured and secured endpoints.
- Security: API key authentication is properly implemented and enforced.
- Understanding: Security report demonstrates clear comprehension of concepts.
- Code Quality: Code is well-structured, commented, and version-controlled.
- Optional: Successful cloud deployment and CI/CD setup (if attempted).
Conclusion
This lab provides practical experience in building and securing an API for LLM access, integrating DevOps practices like version control and optional cloud deployment. You've learned to use FastAPI, Ollama, and security mechanisms, preparing you for real-world DevOps and Cloud challenges in AI-driven applications. For further exploration, consider enhancing the credit system with a persistent database or integrating more advanced authentication methods like OAuth.
By Wahid Hamdi