✨
Generative AI
Concepts and applications of Generative Artificial Intelligence
⏱️ Estimated reading time: 25 minutes
Introduction to Generative AI
Generative AI creates new content (text, images, audio, code) based on patterns learned from training data.
Foundation Models
Models pre-trained on large amounts of data that can be adapted to multiple tasks:
- LLMs: Large Language Models
- Diffusion Models: For image generation
- Multimodal: Process multiple data types
🎯 Key Points
- ✅ Foundation models enable fast adaptation (fine-tuning / prompt engineering)
- ✅ Understanding limitations (hallucinations, bias) is critical for production use
- ✅ Choose model by modality (text, image, multimodal) and latency requirements
- ✅ RAG helps reduce hallucinations when up-to-date information is required
- ✅ Assess inference and embedding storage costs for scaled solutions
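Prompt engineering is the lightest of these adaptation paths: instead of retraining, you steer a pre-trained model by putting a few labeled examples directly in the prompt. A minimal sketch (the reviews, labels, and task are made up for illustration):

```python
# Few-shot prompting: adapt a foundation model to a task by showing
# it labeled examples in the prompt, with no fine-tuning.
def build_few_shot_prompt(examples, query):
    """Assemble a few-shot sentiment-classification prompt from (text, label) pairs."""
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The unanswered query goes last; the model completes the final line.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

examples = [
    ("Great battery life, totally worth it.", "positive"),
    ("Broke after two days.", "negative"),
]
prompt = build_few_shot_prompt(examples, "Fast shipping and works perfectly.")
```

The same pattern extends to any task where a handful of demonstrations fits in the context window; when it does not, fine-tuning becomes the better trade-off.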
Amazon Bedrock
Fully managed service providing access to AI foundation models through a unified API.
Available Models
- Anthropic Claude: Conversation and reasoning
- Meta Llama: Open-source model
- Amazon Titan: AWS proprietary models
- Stability AI: Image generation
- Cohere: NLP and embeddings
Features
- Customization via fine-tuning
- RAG (Retrieval Augmented Generation)
- AI Agents
- Model evaluation
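The unified API means every model is reached the same way: build a provider-specific JSON body and call `invoke_model` on the `bedrock-runtime` client. A minimal sketch using boto3; the model id and region are assumptions, so check which models are enabled in your account:

```python
import json

# Hypothetical model id for illustration; see the Bedrock console
# for the ids actually available in your region.
MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"

def build_claude_request(prompt, max_tokens=256):
    """Build the JSON body expected by Anthropic Claude models on Bedrock."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })

if __name__ == "__main__":
    import boto3  # requires AWS credentials with Bedrock model access granted
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = client.invoke_model(modelId=MODEL_ID, body=build_claude_request("Hello"))
    print(json.loads(response["body"].read())["content"][0]["text"])
```

Swapping providers (Titan, Llama, Cohere) means changing only the model id and the request/response body format, not the service integration.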
🎯 Key Points
- ✅ Bedrock provides managed access to models without managing infrastructure
- ✅ Compare models by cost, customization and privacy requirements
- ✅ Use RAG or fine-tuning depending on context needs and data control
- ✅ Test models with production-like data before deployment
- ✅ Consider latency and SLAs when selecting provider/model
RAG (Retrieval Augmented Generation)
Architecture that improves LLM responses by retrieving relevant information from external sources.
Components
1. Vectorization: Convert documents to embeddings
2. Vector database: Store embeddings (e.g., OpenSearch, Pinecone)
3. Retrieval: Search for relevant information
4. Generation: LLM generates response with retrieved context
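The four steps above can be sketched end to end. In practice the embeddings come from a model (e.g. Amazon Titan Embeddings) and live in a vector database such as OpenSearch or Pinecone; here toy hand-made vectors and a linear cosine-similarity scan stand in for both, just to show the flow:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, store, top_k=1):
    """Step 3: return the top_k documents most similar to the query embedding."""
    ranked = sorted(store, key=lambda d: cosine(query_vec, d["embedding"]), reverse=True)
    return ranked[:top_k]

def build_prompt(question, docs):
    """Step 4 (input side): ground the LLM on the retrieved context."""
    context = "\n".join(d["text"] for d in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Steps 1-2 simulated: documents already vectorized and stored.
store = [
    {"text": "Refunds are processed within 5 business days.", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Shipping is free over $50.", "embedding": [0.1, 0.9, 0.0]},
]
hits = retrieve([0.8, 0.2, 0.1], store)          # query embedding (made up)
prompt = build_prompt("How long do refunds take?", hits)
```

The resulting prompt would then be sent to the LLM, which answers from the retrieved passage instead of from its (possibly stale) training data.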
Benefits
- Reduces hallucinations
- Up-to-date information
- More accurate responses
- No fine-tuning required
🎯 Key Points
- ✅ Vectorization and vector DBs are at the core of RAG
- ✅ Choose vector DB by latency, cost and scalability
- ✅ Curate knowledge sources and ensure document quality
- ✅ Combine retrieval and generation to balance accuracy and fluency
- ✅ Measure hallucination reduction and contextual accuracy after RAG deployment