Gunicorn: FastAPI in Production

1. Gunicorn Configuration

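Gunicorn is a widely used WSGI server for running Python web applications. FastAPI is an ASGI framework, so Gunicorn cannot serve it directly. Instead, Gunicorn acts as the process manager and runs Uvicorn worker processes that speak ASGI. Together they provide a production-ready server.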

Install Gunicorn and Uvicorn (which provides the worker class) using:

pip install gunicorn uvicorn

Run Gunicorn with Uvicorn workers:

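gunicorn -k uvicorn.workers.UvicornWorker your_app:app -w 4 -b 127.0.0.1:8000

  • -k uvicorn.workers.UvicornWorker specifies the worker class.
  • your_app:app points to your FastAPI application instance, in module:variable form.
  • -w 4 sets the number of worker processes. Adjust this based on the available resources and expected load.
  • -b sets the address and port to bind to. 127.0.0.1:8000 is Gunicorn's default and is used here as a placeholder; substitute the address your deployment needs.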
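The same options can live in a configuration file instead of on the command line. The following is a minimal sketch of a gunicorn.conf.py that mirrors the flags above; the bind address is an assumed placeholder (Gunicorn's default), not a value from this article.

```python
# gunicorn.conf.py: a sketch mirroring the command-line flags above.

# Equivalent of -b. Assumed placeholder; replace with your own host and port.
bind = "127.0.0.1:8000"

# Equivalent of -w: number of worker processes.
workers = 4

# Equivalent of -k: run each worker as a Uvicorn (ASGI) worker.
worker_class = "uvicorn.workers.UvicornWorker"
```

With this file in place, the command shortens to: gunicorn -c gunicorn.conf.py your_app:app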

2. Worker Processes

The -w flag in the Gunicorn command determines the number of worker processes. The optimal number depends on factors like CPU cores, available memory, and the nature of your application.

For example, on a machine with four CPU cores:

gunicorn -k uvicorn.workers.UvicornWorker your_app:app -w 4 -b 127.0.0.1:8000

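Each Uvicorn worker runs its own event loop and can serve many concurrent requests while they wait on I/O, so an I/O-heavy application does not necessarily need more workers. Extra workers mainly help when requests are CPU-bound or when you want to use more cores. Gunicorn's documentation suggests (2 x cores) + 1 as a starting point. Keep in mind that too many workers lead to contention for CPU and memory, so measure under realistic load before settling on a number.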
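As a rough illustration, the starting-point heuristic from Gunicorn's documentation, (2 x cores) + 1, can be computed like this. The function name suggested_workers is hypothetical, and the result is only a first guess to tune from, not a recommendation for any particular application.

```python
import os


def suggested_workers(cores=None):
    """Return a starting worker count using the (2 x cores) + 1 heuristic.

    Hypothetical helper for illustration. If cores is not given, fall back
    to os.cpu_count(), which may be None on some platforms.
    """
    if cores is None:
        cores = os.cpu_count() or 1
    return 2 * cores + 1


# On a four-core machine the heuristic gives 2 * 4 + 1 workers.
four_core_estimate = suggested_workers(4)
```

Note that os.cpu_count() reports the host's CPUs, so inside a container with a CPU limit it can overestimate; passing the core count explicitly avoids that.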

3. Load Balancing and Scaling

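In a production setting, running multiple instances of your FastAPI application and distributing incoming requests across them improves scalability and fault tolerance. The number of worker processes per instance affects how you scale: with many workers per host you scale up on fewer machines, while with few workers per instance (for example, one process per container) you scale out by adding instances.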

Consider using tools like nginx for load balancing or deploying your application in a container orchestration system like Kubernetes.
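To make the idea of distributing requests concrete, here is a small sketch of round-robin selection, the simplest strategy a load balancer can apply. It only illustrates the concept; it is not how nginx or Kubernetes is configured, and the backend addresses are hypothetical.

```python
from itertools import cycle

# Hypothetical addresses, one per running application instance.
BACKENDS = ["10.0.0.1:8000", "10.0.0.2:8000", "10.0.0.3:8000"]


def round_robin(backends):
    """Return an iterator that hands out backends in rotation, forever."""
    return cycle(backends)


picker = round_robin(BACKENDS)

# Five incoming requests are spread across the three instances in order,
# wrapping back to the first instance after the last one.
assignments = [next(picker) for _ in range(5)]
```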

4. Graceful Shutdown

Gunicorn shuts down gracefully when it receives SIGTERM: workers stop accepting new requests and finish those already in flight. FastAPI applications may also have asynchronous tasks or background jobs that need to complete before shutting down. Gunicorn's --graceful-timeout option sets how long workers are given to finish.

gunicorn -k uvicorn.workers.UvicornWorker your_app:app -w 4 -b 127.0.0.1:8000 --graceful-timeout 60

With this setting, workers get up to 60 seconds to finish in-flight requests after a shutdown or restart signal. Gunicorn force-kills any worker still running after that. The default is 30 seconds.
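On the application side, background work has to fit inside that window. The sketch below shows one way an application might drain its own asyncio tasks against a deadline; it uses only the standard library, the helper name drain is hypothetical, and it does not reproduce Gunicorn's internal shutdown logic.

```python
import asyncio


async def drain(tasks, grace_seconds):
    """Wait up to grace_seconds for tasks to finish, then cancel the rest.

    Hypothetical helper, not a FastAPI or Gunicorn API.
    Returns a (finished, cancelled) pair of counts.
    """
    if not tasks:
        return 0, 0
    done, pending = await asyncio.wait(tasks, timeout=grace_seconds)
    for task in pending:
        task.cancel()
    # Let cancelled tasks unwind; their CancelledError is collected, not raised.
    await asyncio.gather(*pending, return_exceptions=True)
    return len(done), len(pending)


async def demo():
    # One task that finishes quickly and one that would overrun the deadline.
    quick = asyncio.create_task(asyncio.sleep(0.01))
    slow = asyncio.create_task(asyncio.sleep(10))
    return await drain([quick, slow], grace_seconds=0.2)
```

In a real application, a call like this would run from a shutdown hook, with a grace period shorter than the value passed to --graceful-timeout so the worker exits before Gunicorn force-kills it.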

In conclusion, the choice of Gunicorn settings and worker count is an important part of deploying FastAPI applications in production. Tuning the number of workers and other Gunicorn parameters to your application's characteristics and deployment environment, and verifying the result under load, helps you get good performance and scalability.
