🤖 Large Language Model (LLM) Primers
With the advent of ChatGPT, LLMs have been the talk of the town! We’ve recently seen a wave of extraordinary advances in NLP, including GPT-4, LLaMA, Toolformer, RLHF, and Visual ChatGPT.
📝 Here are some primers on recent LLMs and related concepts to get you up to speed:
🔹 ChatGPT (http://chatgpt.vinija.ai)
- Training Process
- Detecting ChatGPT-generated text
- Related: InstructGPT
🔹 Prompt Engineering (http://prompting.aman.ai)
- Zero-shot/Few-shot Prompting
- Chain-of-Thought Prompting
- Instruction Prompting and Tuning
- Large Prompt Context
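As a quick illustration of the zero-shot vs. few-shot distinction covered in the prompt engineering primer, here is a minimal sketch. The sentiment-classification task, the example reviews, and the function names are invented for illustration; they are not from the primer itself.

```python
# Minimal sketch of zero-shot vs. few-shot prompt construction.
# The sentiment task and labeled examples below are hypothetical.

def zero_shot_prompt(text: str) -> str:
    # Zero-shot: the model receives only an instruction and the input,
    # with no demonstrations of the task.
    return (
        "Classify the sentiment of this review as Positive or Negative.\n"
        f"Review: {text}\nSentiment:"
    )

def few_shot_prompt(text: str, examples: list[tuple[str, str]]) -> str:
    # Few-shot: a handful of labeled demonstrations precede the query,
    # letting the model infer the task format in-context.
    demos = "\n".join(f"Review: {r}\nSentiment: {s}" for r, s in examples)
    return f"{demos}\nReview: {text}\nSentiment:"

examples = [
    ("Loved every minute of it.", "Positive"),
    ("A complete waste of time.", "Negative"),
]
print(few_shot_prompt("The plot dragged, but the acting was great.", examples))
```

The resulting string would then be sent to an LLM; the few-shot variant typically yields more reliable, format-consistent completions on tasks the model has not been explicitly instruction-tuned for.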
🔹 Reinforcement Learning from Human Feedback a.k.a. RLHF (http://rlhf.vinija.ai)
- Refresher: Basics of RL
- Training Process (Pretraining Language Models, Training a Reward Model, Fine-tuning the LM with RL)
- Bias Concerns and Mitigation Strategies
🔹 LLaMA (http://llama.aman.ai)
- Training Process (Pre-normalization, SwiGLU Activation Function, Rotary Positional Embeddings, Flash Attention)
- Visual Summary
🔹 Toolformer (https://lnkd.in/gc5SYsEu)
- Approach
- Sampling and Executing API Calls
- Experimental Results
🔹 Visual ChatGPT (http://vchatgpt.vinija.ai)
- System Architecture
- Managing Multiple Visual Foundation Models
- Handling Queries
- Limitations
🔹 GPT-4 (http://gpt4.vinija.ai)
- Capabilities of GPT-4
- GPT-4 vs. GPT-3
Notes written in collaboration with Aman Chadha
#artificialintelligence #machinelearning #ai #ml #nlp
Large Language Models (LLMs): very useful information! What are your views on practical use cases? Where are we heading with this?

Question: why is it that even these state-of-the-art LLMs do not seem capable of solving cryptic crossword clues? What they can do, though, is the reverse task: explaining why a specific answer is the correct answer to a specific cryptic clue. (And not always with the correct answer; sometimes the explanation is wrong but very creative or humorous!) My view at this moment is that LLMs are at least very useful in a brainstorming setting, and they force a human to formulate very precisely the question being asked (prompt engineering). It thereby sharpens your brain.