
Generative AI

Concepts and applications of Generative Artificial Intelligence

⏱️ Estimated reading time: 25 minutes

Introduction to Generative AI

Generative AI creates new content (text, images, audio, code) based on patterns learned from training data.

Foundation Models

Models pre-trained on vast amounts of data that can be adapted to multiple tasks:
- LLMs: Large Language Models
- Diffusion Models: For image generation
- Multimodal: Process multiple data types
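The "adapted to multiple tasks" point can be illustrated with prompt templates: a single foundation model is steered toward different tasks by the prompt alone, with no retraining. The templates below are purely illustrative, not tied to any specific provider.

```python
# One foundation model, many tasks: adaptation via prompt templates alone.
# (Illustrative templates; a real system would send the prompt to an LLM API.)
TEMPLATES = {
    "summarize": "Summarize the following text in one sentence:\n{text}",
    "translate": "Translate the following text into French:\n{text}",
    "classify": "Label the sentiment of this text as positive or negative:\n{text}",
}

def task_prompt(task: str, text: str) -> str:
    """Build a task-specific prompt for the same underlying model."""
    return TEMPLATES[task].format(text=text)
```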

🎯 Key Points

  • ✓ Foundation models enable fast adaptation (fine-tuning / prompt engineering)
  • ✓ Understanding limitations (hallucinations, bias) is critical for production use
  • ✓ Choose a model by modality (text, image, multimodal) and latency requirements
  • ✓ RAG helps reduce hallucinations when up-to-date information is required
  • ✓ Assess inference and embedding storage costs for scaled solutions
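The embedding storage cost in the last point can be estimated with simple arithmetic: float32 vectors occupy num_vectors × dimensions × 4 bytes, before any index overhead. The corpus size and dimension below are illustrative assumptions.

```python
def embedding_storage_gb(num_vectors: int, dims: int, bytes_per_dim: int = 4) -> float:
    """Raw storage for float32 embeddings, in gigabytes (excludes index overhead)."""
    return num_vectors * dims * bytes_per_dim / 1e9

# Example: 10 million document chunks embedded at 1,536 dimensions each
size = embedding_storage_gb(10_000_000, 1536)  # 61.44 GB of raw vectors
```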

Amazon Bedrock

Fully managed AWS service that provides access to foundation models from multiple providers through a unified API.

Available Models

- Anthropic Claude: Conversation and reasoning
- Meta Llama: Open-weight models
- Amazon Titan: AWS proprietary models
- Stability AI: Image generation
- Cohere: NLP and embeddings

Features

- Customization via fine-tuning
- RAG (Retrieval Augmented Generation)
- AI Agents
- Model evaluation
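As a sketch of the unified API, the snippet below builds a request body for a Claude model and shows how it would be sent via the boto3 `bedrock-runtime` client. The model ID and body fields follow Bedrock's format for Anthropic models, but treat them as assumptions to verify against the current Bedrock documentation.

```python
import json

def build_claude_request(prompt: str, max_tokens: int = 256) -> dict:
    # Request body in the Anthropic Messages format used on Bedrock
    # (field names per Bedrock docs; verify against the current version)
    return {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

def invoke_claude(prompt: str, model_id: str = "anthropic.claude-3-haiku-20240307-v1:0") -> str:
    # Requires AWS credentials and Bedrock model access; model_id is an example
    import boto3  # imported here so the builder above works without the AWS SDK
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId=model_id,
        body=json.dumps(build_claude_request(prompt)),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(response["body"].read())["content"][0]["text"]
```

Because every provider's model sits behind the same `invoke_model` call, switching models is mostly a matter of changing the model ID and request body shape.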

🎯 Key Points

  • ✓ Bedrock provides managed access to models without managing infrastructure
  • ✓ Compare models by cost, customization and privacy requirements
  • ✓ Use RAG or fine-tuning depending on context needs and data control
  • ✓ Test models with production-like data before deployment
  • ✓ Consider latency and SLAs when selecting provider/model

RAG (Retrieval Augmented Generation)

Architecture that improves LLM responses by retrieving relevant information from external sources.

Components

1. Vectorization: Convert documents to embeddings
2. Vector database: Store embeddings (e.g., OpenSearch, Pinecone)
3. Retrieval: Search for relevant information
4. Generation: LLM generates response with retrieved context
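The four steps can be sketched end to end with a toy bag-of-words "embedding" (purely illustrative: a real pipeline would use a trained embedding model and a vector database such as those named above):

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Step 1 (toy version): bag-of-words counts stand in for a learned embedding
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Similarity between two sparse count vectors
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

documents = [
    "Bedrock offers managed access to foundation models",
    "Vector databases store embeddings for fast similarity search",
    "Fine-tuning adapts a model with domain-specific data",
]
index = [(doc, embed(doc)) for doc in documents]  # step 2: the "vector database"

def retrieve(query: str, k: int = 1) -> list:
    # Step 3: embed the query and rank stored documents by similarity
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_rag_prompt(query: str) -> str:
    # Step 4: prepend retrieved context so the LLM grounds its answer in it
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."
```

The final prompt, context plus question, is what gets sent to the LLM; the model never needs the whole corpus, only the retrieved passages.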

Benefits

- Reduces hallucinations
- Up-to-date information
- More accurate responses
- No fine-tuning required

🎯 Key Points

  • ✓ Vectorization and vector DBs are at the core of RAG
  • ✓ Choose a vector DB by latency, cost and scalability
  • ✓ Curate knowledge sources and ensure document quality
  • ✓ Combine retrieval and generation to balance accuracy and fluency
  • ✓ Measure hallucination reduction and contextual accuracy after RAG deployment