AI Product Development from 0 to 1

In an age where Artificial Intelligence is more than just a buzzword, mastery of its many facets is not only beneficial but crucial for the engineers of tomorrow.

This material offers a hands-on introduction to the components of AI product development, from the basics of extracting data from PDFs to Retrieval Augmented Generation (RAG) with large language models like OpenAI's GPT-3.5 (ChatGPT 3.5).

Written for both novices and professionals, this book covers Vector Embeddings, Nearest Neighbor search, scaling that search with libraries like Annoy, and serving results through web services built with FastAPI.

You will also learn how to deploy with Docker, so that your AI solutions are portable and available to anyone online.

Whether you're an AI enthusiast, a budding engineer, or a seasoned developer, "AI Product Development from 0 to 1" promises to be an enlightening companion on your quest for AI excellence.

Welcome aboard this transformative journey!

Harsh Singhal

Content

Learn how to process and extract content from PDF documents.

Processing PDF Documents

Use SQLite to store the extracted content and learn how to run full-text search over it.

Search with SQLite
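
As a rough illustration of the idea, here is a minimal sketch of full-text search using Python's built-in `sqlite3` module. It assumes your SQLite build includes the FTS5 extension (most recent Python distributions do). The table name `pages`, the columns `source` and `body`, and the sample rows are made up for this example and are not taken from the chapter.

```python
import sqlite3

# In-memory database for the sketch; a real project would pass a file path.
conn = sqlite3.connect(":memory:")

# FTS5 virtual table. `pages`, `source`, and `body` are hypothetical names.
# `source` is UNINDEXED, so only `body` is searched.
conn.execute("CREATE VIRTUAL TABLE pages USING fts5(source UNINDEXED, body)")

conn.executemany(
    "INSERT INTO pages (source, body) VALUES (?, ?)",
    [
        ("report.pdf#1", "Quarterly revenue grew in the cloud segment."),
        ("report.pdf#2", "Hiring slowed while research spending increased."),
        ("manual.pdf#1", "Install the package and restart the service."),
    ],
)


def search(query: str, limit: int = 5) -> list[str]:
    """Return the sources of matching rows, best match first."""
    rows = conn.execute(
        "SELECT source FROM pages WHERE pages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
    return [source for (source,) in rows]


print(search("revenue"))  # ['report.pdf#1']
```

Full-text search matches words, not meaning: a query for "income" would not find the "revenue" row. That gap is what vector search addresses.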

Text can be converted to a numeric representation called Vector Embeddings. Once you have vectors, you can find other similar vectors using Nearest Neighbor algorithms. Learn how to extract Vector Embeddings from text and find similar vectors using Nearest Neighbors in scikit-learn.

Search with Nearest Neighbors
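
To make the idea concrete, here is a small sketch of nearest-neighbor search with scikit-learn's `NearestNeighbors`. To stay self-contained, it uses TF-IDF vectors as a stand-in for embeddings from a neural model. That substitution is an assumption of this example, and the sample documents are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

docs = [
    "Quarterly revenue grew in the cloud segment.",
    "Hiring slowed while research spending increased.",
    "Install the package and restart the service.",
]

# Stand-in for an embedding model: one TF-IDF vector per document.
vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(docs)

# Brute-force cosine search is exact and fine for small collections.
index = NearestNeighbors(n_neighbors=2, metric="cosine", algorithm="brute")
index.fit(doc_vectors)


def most_similar(query: str) -> list[str]:
    """Return the documents closest to the query, nearest first."""
    query_vector = vectorizer.transform([query])
    _, indices = index.kneighbors(query_vector)
    return [docs[i] for i in indices[0]]


print(most_similar("cloud revenue"))
```

Swapping in real embeddings only changes how `doc_vectors` and `query_vector` are produced. The fit and query steps stay the same.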

Nearest Neighbor algorithms have variants that scale to large numbers of vectors (millions and more). Annoy, a Python library created at Spotify, lets you create a vector index and run approximate Nearest Neighbor search over it.

Approximate Nearest Neighbors

FastAPI is a popular Python framework for building web services. These web services can be deployed on the cloud, and users can call them to run a Semantic Search query.

FastAPI Service

Don't worry about setting up an environment on the cloud for your code to run. Create a Docker image with all the dependencies and run it as a container anywhere. This has been a game-changer for developers: Docker images are available for almost every technology and run anywhere, even on your Windows laptop, which makes it easy to try things out and even easier to deploy.

Deploy with Docker

You may have heard of RAG solutions (if not, Google it now) and popular libraries like langchain or llama_index. Before you start using these libraries, learn how they work by building a RAG solution from scratch. In the previous chapters you have already built about 80% of the necessary components. For the final step, you will call ChatGPT 3.5, a Large Language Model from OpenAI.

Retrieval Augmented Generation
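
At its core, the final step is "retrieve passages, put them in a prompt, ask the model". Here is a minimal sketch of that flow. The prompt wording and the function names `build_prompt` and `answer` are assumptions of this example, and the retriever and model are replaced by stand-in functions so the sketch runs without an API key.

```python
from typing import Callable


def build_prompt(question: str, passages: list[str]) -> str:
    """Combine retrieved passages and the user question into one prompt."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer(
    question: str,
    retrieve: Callable[[str], list[str]],
    generate: Callable[[str], str],
) -> str:
    """Retrieve passages, build the prompt, and pass it to the model."""
    passages = retrieve(question)
    return generate(build_prompt(question, passages))


# Stand-ins: a real pipeline would call a vector index and an LLM API here.
def fake_retrieve(question: str) -> list[str]:
    return ["Annoy is a library created at Spotify."]


def fake_generate(prompt: str) -> str:
    return f"(model reply to a {len(prompt)}-character prompt)"


print(answer("Who created Annoy?", fake_retrieve, fake_generate))
```

Passing the retriever and the model in as functions keeps the flow testable. Replacing `fake_generate` with a call to a hosted model is the only change needed to get real answers.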

Supplements

Kaggle Notebook with all the code