Building Serverless APIs with Python and AWS Lambda
How I built a production event-driven API using Python, AWS Lambda, API Gateway, and SQS — and the lessons that only come from running it under real load.
Introduction
When I first heard “serverless,” I thought it was just marketing fluff. No servers? Sure. But after running several production workloads on AWS Lambda, I’ve come to genuinely appreciate what the model offers — especially for event-driven backends where traffic is spiky and unpredictable.
In this post I’ll walk through how I built a serverless API pipeline in Python, the architectural decisions I made, and the rough edges I ran into along the way.
The Problem
We had a data ingestion service that received webhook events from third-party integrations. Traffic was completely unpredictable — quiet for hours, then hundreds of requests per minute during business hours in different time zones. Maintaining an always-on EC2 instance (or even ECS task) for this felt wasteful and operationally heavy.
Lambda was the obvious fit.
Architecture Overview
The final design looked like this:
Client → API Gateway → Lambda (validator) → SQS → Lambda (processor) → RDS (PostgreSQL)
Breaking it into two Lambda functions with SQS in the middle was a deliberate choice. The first function validates and acknowledges the webhook quickly (under 200ms), while the processor handles the heavier database writes asynchronously. This prevents timeouts from the third-party caller and gives us a natural retry buffer via SQS.
Setting Up the Lambda Handler
I kept the handler thin — just routing and response shaping. The real logic lives in separate modules.
import json

import boto3

from validator import validate_event
from publisher import publish_to_queue

sqs = boto3.client("sqs", region_name="ap-southeast-1")
QUEUE_URL = "https://sqs.ap-southeast-1.amazonaws.com/123456789/events-queue"


def handler(event, context):
    try:
        # API Gateway proxy events can carry "body": null, so fall back
        # to an empty object instead of passing None to json.loads
        body = json.loads(event.get("body") or "{}")
        validated = validate_event(body)
    except ValueError as exc:
        return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}

    publish_to_queue(sqs, QUEUE_URL, validated)
    return {
        "statusCode": 202,
        "body": json.dumps({"status": "accepted"}),
    }
The validate_event function uses Python’s dataclasses and raises ValueError on bad input — keeping error handling predictable and easy to test.
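For reference, here is a minimal sketch of what those two modules could look like. The WebhookEvent shape, field names, and message format are illustrative assumptions, not the production code:

# validator.py: a minimal sketch, with assumed field names
from dataclasses import asdict, dataclass


@dataclass
class WebhookEvent:
    source: str
    event_type: str
    payload: dict


def validate_event(body: dict) -> dict:
    try:
        event = WebhookEvent(
            source=body["source"],
            event_type=body["event_type"],
            payload=body["payload"],
        )
    except KeyError as exc:
        # Normalise every validation failure to ValueError so the
        # handler only needs a single except clause
        raise ValueError(f"missing required field: {exc}") from exc
    return asdict(event)


# publisher.py
import json


def publish_to_queue(sqs, queue_url: str, message: dict) -> None:
    # send_message is the standard boto3 SQS call
    sqs.send_message(QueueUrl=queue_url, MessageBody=json.dumps(message))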
Managing Cold Starts
Cold starts were my first real pain point. For the validator Lambda, a cold start of ~800ms was unacceptable for a synchronous API call.
A few things that helped:
1. Keep the deployment package small. I used a Lambda layer for heavy dependencies (boto3, psycopg2) so the function zip itself stayed under 1MB. Smaller packages initialise faster.
2. Provisioned concurrency for the critical path. The validator Lambda serves synchronous HTTP requests, so I configured 2 provisioned concurrency instances during business hours using an EventBridge schedule:
# Terraform snippet: baseline provisioned concurrency for the validator
# (the 8am UTC scale-up is driven separately by the EventBridge schedule)
resource "aws_lambda_provisioned_concurrency_config" "validator" {
  function_name                     = aws_lambda_function.validator.function_name
  qualifier                         = aws_lambda_alias.live.name
  provisioned_concurrent_executions = 2
}
3. Lazy-load heavy resources. Database connections and boto3 clients are initialised outside the handler (at module level) so they’re reused across warm invocations, but I used a lazy initialisation pattern so a cold start doesn’t pay for resources that invocation never touches. A sketch of the pattern follows this list.
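Here’s a minimal sketch of that lazy-initialisation pattern. The connection details and secret name are illustrative assumptions, and the get_secret import points at the caching helper shown later in this post:

import psycopg2  # shipped via the Lambda layer

from secrets_helper import get_secret  # hypothetical module; see the caching helper below

# Module-level cache, shared across warm invocations in this
# execution environment
_db_conn = None


def get_db_connection():
    """Create the connection on first use, then reuse it while warm."""
    global _db_conn
    if _db_conn is None or _db_conn.closed:
        creds = get_secret("events-api/db")  # illustrative secret name
        _db_conn = psycopg2.connect(
            host=creds["host"],
            dbname=creds["dbname"],
            user=creds["username"],
            password=creds["password"],
        )
    return _db_conn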
SQS as the Glue
SQS is what makes the two-Lambda design resilient. If the processor fails (database timeout, schema mismatch), the message goes back to the queue and is retried automatically. After three failures it lands in a dead-letter queue (DLQ), where a CloudWatch alarm pages me.
# Processor handler: triggered by the SQS event source mapping
def processor_handler(event, context):
    for record in event["Records"]:
        body = json.loads(record["body"])
        try:
            write_to_db(body)
        except Exception:
            # Re-raise so SQS retries this message
            raise
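For completeness, the three-retries-then-DLQ behaviour is configured on the source queue’s redrive policy, not in the function itself. A boto3 sketch, with placeholder queue URL and DLQ ARN:

import json

import boto3

sqs = boto3.client("sqs")

# Placeholder URL and ARN: substitute your own queue and DLQ
sqs.set_queue_attributes(
    QueueUrl="https://sqs.ap-southeast-1.amazonaws.com/123456789/events-queue",
    Attributes={
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": "arn:aws:sqs:ap-southeast-1:123456789:events-dlq",
            "maxReceiveCount": "3",
        })
    },
)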
One gotcha: if a batch partially succeeds (say, 8 of 10 messages process fine and 2 fail), SQS retries the entire batch by default. Enable partial batch responses so that only the failed messages are retried:
def processor_handler(event, context):
    failures = []
    for record in event["Records"]:
        try:
            write_to_db(json.loads(record["body"]))
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
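Returning batchItemFailures only takes effect if the event source mapping has been told to expect it. One way to switch that on, sketched with boto3 (the UUID placeholder is your mapping’s identifier):

import boto3

lambda_client = boto3.client("lambda")

# Opt the existing SQS -> Lambda mapping into partial batch responses
lambda_client.update_event_source_mapping(
    UUID="your-mapping-uuid",
    FunctionResponseTypes=["ReportBatchItemFailures"],
)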
This alone saved us from a lot of duplicate processing.
Environment Config and Secrets
I keep non-secret config in Lambda environment variables and secrets (database credentials, API keys) in AWS Secrets Manager. I fetch secrets once at cold start and cache them for the lifetime of the execution environment:
import json

import boto3

# Module-level cache: survives for the lifetime of the execution environment
_secrets_cache = {}


def get_secret(secret_name: str) -> dict:
    if secret_name in _secrets_cache:
        return _secrets_cache[secret_name]
    client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId=secret_name)
    secret = json.loads(response["SecretString"])
    _secrets_cache[secret_name] = secret
    return secret
Don’t call Secrets Manager on every invocation — you’ll hit rate limits and add 50–100ms of latency unnecessarily.
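In practice that means calling the helper at module scope, so the lookup happens once per cold start (the secret name here is illustrative):

# Module scope: fetched once per execution environment, then reused
# by every warm invocation after that
DB_CREDS = get_secret("events-api/db-credentials")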
Observability
Lambda’s default CloudWatch logging is fine for getting started, but in production I added structured logging with python-json-logger so logs are queryable:
import logging

from pythonjsonlogger import jsonlogger

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# The Lambda runtime pre-attaches its own handler to the root logger,
# so clear it first or every line gets logged twice
logger.handlers.clear()

json_handler = logging.StreamHandler()
json_handler.setFormatter(jsonlogger.JsonFormatter())
logger.addHandler(json_handler)


def handler(event, context):
    logger.info("Processing event", extra={"request_id": context.aws_request_id})
Combined with CloudWatch Insights, this made debugging specific failures much faster.
What I’d Do Differently
Use Lambda Powertools from the start. The AWS Lambda Powertools for Python library gives you structured logging, tracing, and idempotency utilities with minimal boilerplate. I retrofitted it halfway through and wished I’d started with it.
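As a taste, a minimal sketch of the Powertools structured logger (the service name is illustrative):

from aws_lambda_powertools import Logger

logger = Logger(service="webhook-ingest")


@logger.inject_lambda_context
def handler(event, context):
    # The decorator appends the request ID, function name, memory
    # size, and a cold-start flag to every log line
    logger.info("Processing event")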
Test locally with SAM or LocalStack. I spent too long deploying to AWS to test small changes. SAM CLI’s local invoke (sam local invoke) and LocalStack for SQS/DynamoDB would have shortened my feedback loop significantly.
Conclusion
Python and AWS Lambda are a genuinely productive combination for event-driven backends. The operational burden is low once you get the architecture right, and the cost model is hard to beat for spiky workloads.
The key lessons:
- Split synchronous and asynchronous work across Lambda boundaries with SQS in between
- Enable partial batch response on SQS event source mappings
- Cache secrets and boto3 clients at the module level, not inside handlers
- Add structured logging before you need to debug production issues
If you’re evaluating serverless for a similar use case, I’d encourage you to try it. The cold start concerns are real but manageable — and the operational simplicity more than makes up for it.