FastAPI: The Modern-day Framework for AI Applications in Python
In the dynamic and ever-evolving landscape of technology, the ability to develop and deploy AI models with speed, accuracy, and scalability is paramount. The fusion of AI with modern web technologies allows developers to build powerful applications that are not just intelligent, but also highly interactive and accessible. Among the multitude of frameworks and tools available to Python developers today, FastAPI stands out as a front-runner. Here's a brief introduction to FastAPI and why it is becoming an indispensable skill for the contemporary AI engineer.
What is FastAPI?
FastAPI is a modern, fast (high-performance) web framework for building APIs with Python 3.7+, based on standard Python type hints. Its key features are:
- Fast: The name 'FastAPI' isn't just a catchy moniker. One of its biggest draws is its performance. FastAPI is often reported to be on par with Node.js and Go in terms of speed, and significantly faster than traditional Python frameworks.
- Pythonic: FastAPI is designed with the Pythonista in mind. Its syntax and use of Python's type hints make the code clean, intuitive, and easy to read.
- Automatic Interactive API Documentation: Once your API is built, FastAPI automatically serves interactive documentation (Swagger UI at /docs and ReDoc at /redoc by default), allowing users and developers to understand and test your API's endpoints.
FastAPI and AI: A Power Pairing
AI engineers work diligently to design, train, and fine-tune models that can then be served to end-users. But creating a model is just half the battle; presenting it in a usable format through an API is equally important. FastAPI simplifies this process. Here’s why it's particularly apt for AI:
- Seamless Integration with ML Libraries: FastAPI is plain Python, so it works alongside popular ML libraries like TensorFlow, PyTorch, and scikit-learn, which smooths the transition from model development to deployment.
- Asynchronous Capabilities: FastAPI supports asynchronous request handling, so the API can keep serving other requests while one is waiting on I/O such as a database or a remote model server. Model inference itself is usually compute-bound, and running it directly inside an async def endpoint blocks the event loop. FastAPI runs endpoints declared with plain def in a thread pool, which helps keep the API responsive while a model is processing a request.
- Data Validation: FastAPI's use of Python type hints doesn't just make for cleaner code; it provides automatic request validation. This is particularly useful for AI apps, where the format and type of input data are critical for model inference.
Why Every AI Engineer Should Consider Learning FastAPI
In the age of AI-driven applications, the bridge between robust machine learning models and end-user accessibility is a well-designed API. FastAPI facilitates this bridge. It allows AI engineers to focus on what they do best, designing and refining models, while keeping deployment and scaling streamlined and efficient.
The modern AI engineer isn't just a model builder; they are a solution provider. And in the world of solutions, FastAPI is becoming an increasingly essential tool in the AI engineer’s arsenal.
As you delve deeper into the chapters of this book, you'll gain insights into how FastAPI can be harnessed to bring your AI applications to life, making them accessible to the world, one endpoint at a time.
To create a FastAPI service that returns the text chunks most similar to a query, we'll follow these steps:
- Set up FastAPI.
- Define the route (or endpoint) that accepts the text input via GET request.
- Integrate the provided code into the defined route.
- Return the result as JSON.
Below is the FastAPI service:
import sqlite3
from typing import Dict, List

from annoy import AnnoyIndex
from fastapi import FastAPI
from sentence_transformers import SentenceTransformer

app = FastAPI()

# Load the Sentence Transformer model (all-MiniLM-L6-v2 produces 384-dimensional embeddings)
model = SentenceTransformer('all-MiniLM-L6-v2')
VEC_INDEX_DIM = 384

# Load the Annoy index
u = AnnoyIndex(VEC_INDEX_DIM, 'angular')
u.load("/kaggle/working/vecindex.ann")

# SQLite connection
con = sqlite3.connect("/kaggle/working/chunks.db")
cur = con.cursor()


@app.get("/find_similar_text/", response_model=List[Dict[str, str]])
async def read_similar_text(query_text: str):
    """
    Given a query_text, find the top 10 text chunks from the database that are semantically similar.
    """
    # Convert the query text into an embedding.
    # Note: encode() is a blocking, CPU-bound call and occupies the event loop while it runs.
    embedding = model.encode([query_text])
    input_vec = embedding[0]

    # Retrieve the IDs of the top 10 most similar text chunks (nearest first)
    chunk_ids = u.get_nns_by_vector(input_vec, 10, search_k=-1, include_distances=False)

    # Fetch the actual text chunks with a parameterized query instead of string concatenation
    placeholders = ','.join('?' for _ in chunk_ids)
    cur.execute(
        "select chunk_id, chunk_text from pdf_chunks where chunk_id in (" + placeholders + ")",
        chunk_ids,
    )
    text_by_id = {str(row[0]): row[1] for row in cur.fetchall()}

    # Construct the result list in similarity order (an IN query returns rows in arbitrary order)
    result = [
        {"chunk_id": str(k), "chunk_text": text_by_id[str(k)]}
        for k in chunk_ids
        if str(k) in text_by_id
    ]
    return result

# You would then run this API using a tool like Uvicorn and send GET requests to the defined endpoint.
To test this FastAPI service:
- Run the FastAPI app using Uvicorn.
- Use the endpoint /find_similar_text/?query_text=YOUR_TEXT_HERE to query for similar texts.
This FastAPI service provides a convenient and efficient way to query for similar texts, making it highly useful in various NLP applications.
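To call the endpoint from Python rather than a browser, a small client can build the query URL and parse the JSON response. This is a sketch that uses only the standard library. The helper names build_query_url and find_similar_text are hypothetical, and the base URL is an assumption that depends on where the service is running.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url: str, query_text: str) -> str:
    """Build the GET URL for the /find_similar_text/ endpoint."""
    # urlencode escapes spaces and special characters in the query text
    query = urlencode({"query_text": query_text})
    return f"{base_url.rstrip('/')}/find_similar_text/?{query}"


def find_similar_text(base_url: str, query_text: str, timeout: float = 30.0):
    """Send the GET request and return the parsed list of chunks."""
    with urlopen(build_query_url(base_url, query_text), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running locally on port 8000, find_similar_text("http://127.0.0.1:8000", "YOUR_TEXT_HERE") would return a list of dictionaries with the keys chunk_id and chunk_text.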
If you are developing in Kaggle, you can use ngrok to get a public URL for your FastAPI service.
ngrok opens a tunnel that exposes a port on localhost to the internet, so requests sent to the public URL are forwarded to the app running in the notebook.
!pip install fastapi nest-asyncio pyngrok uvicorn
import nest_asyncio
from pyngrok import ngrok
import uvicorn
# specify a port
port = 8000
ngrok_tunnel = ngrok.connect(port)
# where we can visit our fastAPI app
print('Public URL:', ngrok_tunnel.public_url)
nest_asyncio.apply()
# finally run the app
uvicorn.run(app, port=port)
ngrok asks you to authenticate before the tunnel can be used. Sign in on the ngrok site, copy your auth token, configure it in a notebook cell with the command below, and then run the code snippet shown above again.
!ngrok config add-authtoken <<auth_token>>
Replace <<auth_token>> with the token you receive on the ngrok website.
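As an alternative to the command above, pyngrok can register the token from Python, which avoids pasting it into a notebook cell. The sketch below assumes the token has been stored in an environment variable named NGROK_AUTHTOKEN. The helper names read_ngrok_token and open_tunnel are hypothetical.

```python
import os


def read_ngrok_token(env_var: str = "NGROK_AUTHTOKEN") -> str:
    """Return the ngrok auth token stored in an environment variable."""
    # Assumption: the token was exported under this variable name beforehand
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to your ngrok auth token first.")
    return token


def open_tunnel(port: int = 8000) -> str:
    """Register the token with pyngrok, open a tunnel, and return its public URL."""
    # Imported here so read_ngrok_token has no third-party dependency
    from pyngrok import ngrok

    ngrok.set_auth_token(read_ngrok_token())
    return ngrok.connect(port).public_url
```

Calling open_tunnel(8000) before uvicorn.run would then take the place of the ngrok.connect(port) line in the snippet above.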
To test the API, send a GET request to the public URL from a client such as Postman.