Melvin Carvalho edited this page Dec 28, 2024 · 1 revision

LLMs (with questions)

  • Core Concepts
    • Understanding LLMs
      • Architecture
        • Transformers
          • What is the core mechanism of a Transformer?
          • Explain the role of multi-head attention.
        • Attention Mechanisms
          • How does self-attention work?
          • What are the limitations of the self-attention mechanism?
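The self-attention questions above can be grounded in a minimal numpy sketch of scaled dot-product attention (a single head, no masking; the random matrices below are illustrative stand-ins for learned projections, not a trained model):

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence x of shape (seq, d)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                 # pairwise token similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v                              # each output mixes all values

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))                         # 3 tokens, model dim 4
w_q, w_k, w_v = (rng.normal(size=(4, 4)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)       # (3, 4)
```

Multi-head attention runs several such maps in parallel on lower-dimensional projections and concatenates the results; the quadratic `scores` matrix is also the main limitation asked about, since cost grows as O(seq²).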
        • Positional Encoding
          • Why is positional encoding necessary in Transformers?
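A short sketch of the sinusoidal positional encoding from the original Transformer paper helps answer the question above: because self-attention is permutation-invariant, each position gets a deterministic vector that is added to the token embedding.

```python
import math

def positional_encoding(position, d_model):
    """Sinusoidal positional encoding for one position (d_model even)."""
    pe = []
    for i in range(d_model // 2):
        angle = position / (10000 ** (2 * i / d_model))
        pe.extend([math.sin(angle), math.cos(angle)])
    return pe

print(positional_encoding(0, 4))                          # [0.0, 1.0, 0.0, 1.0]
print([round(v, 3) for v in positional_encoding(1, 4)])   # [0.841, 0.54, 0.01, 1.0]
```

Different frequencies per dimension give every position a unique signature while keeping nearby positions similar.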
        • Model Variants (Encoder, Decoder, etc.)
          • What are the key differences between encoder and decoder models?
      • Training
        • Pretraining
          • What is pre-training, and why is it crucial for LLMs?
        • Fine-Tuning
          • Explain the concept of fine-tuning LLMs.
        • Supervised Methods
          • When would you choose supervised fine-tuning?
        • Unsupervised Methods
          • What are unsupervised methods for LLM training?
      • Tokenization
        • Vocabulary
          • What is a token in the context of LLMs?
          • How does vocabulary size affect LLM performance?
        • Token Types
          • What are common types of tokens in language models?
        • Encoding Schemes
          • What is the difference between BPE and WordPiece tokenization?
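One BPE training step can be illustrated as counting adjacent symbol pairs across a corpus and merging the most frequent pair; the tiny corpus below is a hypothetical example. WordPiece differs mainly in scoring candidate merges by likelihood gain rather than raw frequency.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across a corpus of word -> frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Replace every occurrence of the pair with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: words pre-split into characters, mapped to frequency.
corpus = {("l", "o", "w"): 5, ("l", "o", "g"): 3, ("n", "e", "w"): 2}
pair = most_frequent_pair(corpus)
print(pair)                      # ('l', 'o') — occurs 8 times, the most frequent
print(merge_pair(corpus, pair))  # 'l' and 'o' fused into the new symbol 'lo'
```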
      • Inference
        • Decoding Strategies (e.g., Greedy, Beam)
          • Compare and contrast greedy and beam search.
          • When would you use each decoding method?
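A toy contrast between the two decoding methods; `next_probs` is a hypothetical conditional distribution constructed so that the greedy choice at step one is suboptimal:

```python
import math

def next_probs(prefix):
    """Hypothetical next-token distributions where greedy is suboptimal."""
    if not prefix:
        return {"a": 0.6, "b": 0.4}
    if prefix[-1] == "a":
        return {"x": 0.5, "y": 0.5}
    return {"z": 0.95, "w": 0.05}

def decode(beam_width, steps=2):
    """Beam search; beam_width=1 degenerates to greedy decoding."""
    beams = [((), 0.0)]  # (prefix, cumulative log-probability)
    for _ in range(steps):
        candidates = [
            (prefix + (tok,), score + math.log(p))
            for prefix, score in beams
            for tok, p in next_probs(prefix).items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0][0]

print(decode(beam_width=1))  # greedy commits to 'a' (p=0.6) at step one
print(decode(beam_width=2))  # ('b', 'z') — joint probability 0.38 > 0.30
```

Greedy can do no better than a joint probability of 0.6 × 0.5 = 0.30, while a beam of width 2 keeps 'b' alive and finds the 0.4 × 0.95 = 0.38 path.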
        • Temperature and Sampling
          • What does the 'temperature' parameter control in LLMs?
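Temperature divides the logits before the softmax; the following sketch makes concrete how low temperature sharpens the distribution and high temperature flattens it:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits scaled by 1/temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
print(round(sharp[0], 3), round(flat[0], 3))  # 0.864 0.502
```

The top token's share of probability mass rises as temperature falls, which is why low temperatures give more deterministic output and high temperatures more diverse output.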
        • Stopping Criteria
          • How do you define stopping criteria in LLMs?
      • Limitations
        • Hallucination
          • What is hallucination in LLMs, and how can you detect it?
        • Bias
          • How can LLMs exhibit bias, and what can we do about it?
        • Context Window
          • What is the context window of an LLM, and why is it important?
    • Information Retrieval & Knowledge Representation
      • Vector Embeddings
        • Semantic Representation
          • What are vector embeddings, and why are they important?
          • How do they represent meaning?
        • Embedding Models
          • What is an embedding model?
          • How do you choose the right embedding model?
        • Distance Metrics
          • What distance metrics are used to compare embeddings?
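A small sketch of two common metrics, showing how cosine similarity ignores vector magnitude while Euclidean distance does not:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; magnitude is ignored."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line (L2) distance; magnitude matters."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a, b = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]  # same direction, different length
print(cosine_similarity(a, b))            # ≈ 1.0 — identical direction
print(euclidean_distance(a, b))           # ≈ 3.74 — still far apart in L2
```

This is why cosine similarity is the usual default for text embeddings, where direction encodes meaning and magnitude is largely an artifact.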
      • Vector Databases
        • Indexing Techniques
          • Explain different indexing techniques in vector databases.
          • When would you choose each one?
        • Similarity Search
          • How does a vector database perform similarity searches?
        • Filtering
          • What are challenges associated with filtering in a vector DB?
      • Chunking
        • Data Segmentation
          • Why is chunking data necessary in RAG?
        • Chunking Strategies
          • What are different chunking strategies?
        • Size Optimization
          • How do you determine the ideal chunk size?
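A minimal fixed-size chunker with overlap (the sizes here are illustrative, not recommendations); the overlap preserves context that a hard boundary would otherwise cut in half:

```python
def chunk_text(text, chunk_size=100, overlap=20):
    """Fixed-size character chunking with overlap between adjacent chunks."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "abcdefghij" * 25                     # 250-character toy document
chunks = chunk_text(doc, chunk_size=100, overlap=20)
print([len(c) for c in chunks])             # [100, 100, 90, 10]
```

Adjacent chunks share their last/first 20 characters; real systems usually prefer semantic boundaries (sentences, paragraphs) over raw character counts.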
      • RAG
        • Retrieval Techniques
          • How does the retrieval component work in RAG?
        • Augmentation Techniques
          • What is the augmentation part of RAG?
        • Hybrid Approaches
          • What are hybrid approaches in RAG, and how do they improve results?
    • Prompt Engineering & Control
      • Prompt Design
        • Basic Structure
          • What are the basic components of a prompt?
        • Prompting Techniques (Few-Shot, CoT)
          • Explain few-shot prompting with examples.
          • What is Chain of Thought prompting?
        • Role-Playing
          • How can you use role-playing in prompting?
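Few-shot prompting can be sketched as assembling demonstrations ahead of the new input; the sentiment-labeling format below is a hypothetical example:

```python
def few_shot_prompt(examples, query):
    """Build a few-shot prompt: labeled demonstrations, then the new input."""
    lines = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    lines.append(f"Review: {query}\nSentiment:")   # model completes the label
    return "\n\n".join(lines)

examples = [("Great food!", "positive"), ("Cold and stale.", "negative")]
print(few_shot_prompt(examples, "The service was wonderful."))
```

The same assembly pattern extends to Chain of Thought prompting by including worked reasoning steps in each demonstration.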
      • Prompt Optimization
        • Improving Reasoning
          • How do you improve reasoning ability through prompt engineering?
        • Handling Ambiguity
          • What prompt engineering strategies help with ambiguity?
        • Controlling Bias
          • How do you control bias with prompt engineering?
      • Hallucination Control
        • Prompt-Based Strategies
          • How can you use prompts to control LLM hallucination?
        • External Knowledge
          • How can external knowledge reduce hallucinations?
        • Verification
          • How can verification techniques help reduce hallucinations?
  • Advanced Techniques
    • Fine-Tuning & Adaptation
      • Supervised Fine-Tuning (SFT)
        • Dataset Creation
          • How do you create effective datasets for fine-tuning Q&A tasks?
        • Hyperparameter Tuning
          • How do you set hyperparameters for fine-tuning?
        • Catastrophic Forgetting
          • What is catastrophic forgetting?
      • Preference Alignment
        • RLHF
          • What is RLHF and how is it used?
        • DPO
          • Explain how DPO works.
        • Reward Hacking
          • What is the reward hacking issue in RLHF?
      • Parameter-Efficient Fine-Tuning (PEFT)
        • Adapter Layers
          • What are adapter layers?
        • LoRA
          • Explain LoRA and how it works.
        • Prefix Tuning
          • Explain the prefix tuning method.
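A numpy sketch of the LoRA idea asked about above: keep the pretrained weight frozen and learn only a low-rank update, so the trainable parameter count drops from d_out·d_in to r·(d_in + d_out). Shapes and initializations below are illustrative.

```python
import numpy as np

def lora_forward(x, w_frozen, a, b, alpha=1.0):
    """Forward pass with a LoRA adapter: x @ (W + alpha * B A).T."""
    return x @ w_frozen.T + alpha * (x @ a.T @ b.T)

d_in, d_out, r = 8, 8, 2
rng = np.random.default_rng(0)
w = rng.normal(size=(d_out, d_in))   # frozen pretrained weight
a = rng.normal(size=(r, d_in))       # trainable down-projection
b = np.zeros((d_out, r))             # trainable up-projection, zero-initialized
x = rng.normal(size=(1, d_in))

# With B initialized to zero, the adapter starts as an exact no-op.
assert np.allclose(lora_forward(x, w, a, b), x @ w.T)
print("full params:", d_out * d_in, "LoRA params:", r * (d_in + d_out))
```

Zero-initializing B means fine-tuning starts from the pretrained model's behavior, which is part of why LoRA trains stably.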
    • Advanced Search
      • Re-Ranking
        • Re-ranking Models
          • Why is re-ranking needed?
        • Fine-Tuning
          • How do you fine-tune re-ranking models?
      • Information Retrieval Metrics
        • Relevance Metrics
          • What metrics are used to evaluate relevance in information retrieval?
        • Accuracy Metrics
          • Which accuracy metrics should you use for information retrieval?
        • Ranking Metrics
          • How do you assess the quality of a ranking system?
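Two of the metrics these questions cover, recall@k and reciprocal rank, in a minimal sketch (the document IDs are hypothetical):

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant documents that appear in the top-k results."""
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)

def reciprocal_rank(retrieved, relevant):
    """1/rank of the first relevant result, or 0 if none is retrieved."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

retrieved = ["d3", "d1", "d7", "d2"]   # system ranking, best first
relevant = {"d1", "d2"}                # ground-truth relevant documents
print(recall_at_k(retrieved, relevant, k=2))  # 0.5 — only d1 is in the top 2
print(reciprocal_rank(retrieved, relevant))   # 0.5 — first hit at rank 2
```

Averaging reciprocal rank over a query set gives MRR, a standard ranking-quality metric.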
      • Hybrid Search
        • Combining Methods
          • How does hybrid search work?
        • Homogenization
          • How do you homogenize results from multiple search methods?
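One common way to homogenize results from multiple search methods, without comparing their incompatible raw scores, is Reciprocal Rank Fusion (RRF); k = 60 is the conventional default constant. The rankings below are hypothetical:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several rankings by summing 1/(k + rank) per document."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword = ["d1", "d2", "d3"]    # e.g. a BM25 keyword ranking
semantic = ["d3", "d1", "d4"]   # e.g. an embedding-similarity ranking
print(reciprocal_rank_fusion([keyword, semantic]))  # ['d1', 'd3', 'd2', 'd4']
```

Because RRF uses only rank positions, it needs no score normalization across the two retrieval methods.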
    • Agent Systems
      • Agent Frameworks
        • Planning
          • What are the basic concepts of agent planning?
        • Execution
          • What are the basic concepts of agent execution?
        • Memory Management
          • What is agent memory, and why is it important?
      • Tools & APIs
        • Function Calling
          • How do agents utilize function calling?
        • External Tools Integration
          • Why do we need to connect agents to external tools?
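A minimal sketch of function calling: the model emits a structured tool request and the runtime parses and dispatches it. The `get_weather` tool and the JSON shape below are hypothetical illustrations, not any specific vendor's API:

```python
import json

def get_weather(city):
    """Hypothetical tool the model can request; returns stubbed data."""
    return {"city": city, "forecast": "sunny"}

TOOLS = {"get_weather": get_weather}

def dispatch(model_output):
    """Parse a model-emitted tool call (JSON) and run the named function."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# A model trained for function calling emits structured JSON like this:
reply = dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}')
print(reply)  # {'city': 'Paris', 'forecast': 'sunny'}
```

In a full agent loop, `reply` would be fed back to the model so it can compose a final answer from the tool result.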
  • Deployment & Evaluation
    • Deployment Strategies
      • Inference Optimization
        • Quantization
          • Why does quantization often not significantly decrease accuracy?
        • Model Pruning
          • How can we reduce model size using pruning?
        • Parallel Processing
          • How does parallel processing improve inference time?
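A sketch of symmetric per-tensor int8 quantization, which shows why accuracy often survives the compression: each weight moves by at most half a quantization step (scale / 2).

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w ≈ scale * q."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# 4x smaller storage than float32; round-trip error is bounded by scale / 2.
assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
print(q.dtype)  # int8
```

Production schemes refine this with per-channel scales, outlier handling, and calibration data, but the error-bound intuition is the same.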
      • Infrastructure Considerations
        • Cost Management
          • How do you optimize the cost of an LLM system?
        • Scalability
          • How do you design for scalable deployment?
        • Latency
          • How can you reduce the latency of your LLM application?
    • Evaluation & Metrics
      • LLM Evaluation
        • Benchmarking
          • How do you benchmark your LLM for your use case?
        • Human Evaluation
          • Why is human evaluation needed when evaluating LLMs?
        • Automatic Metrics
          • What automatic metrics can we use to evaluate LLMs?
      • RAG Evaluation
        • Retrieval Accuracy
          • How do you evaluate the retrieval component of your RAG system?
        • Generation Quality
          • How do you evaluate the generated responses of your RAG system?
  • Security & Misc
    • Prompt Hacking
      • Types of Attacks
        • What are different types of prompt hacking?
      • Defense Mechanisms
        • What are some defense mechanisms against prompt hacking?
    • Miscellaneous Topics
      • Cost Optimization
        • How do you reduce the overall cost of your LLM system?
      • Mixture of Experts (MoE)
        • What is a Mixture of Experts model?
      • Hardware Considerations
        • How do hardware choices affect training and inference?
      • Low-Precision Training
        • How can we use low-precision training?
