Other Resources
Bonus Resources!
HyperTech Core v0.1.0 ☄️
- GitHub
- VSCode
- Jupyter
- HuggingFace
- LM Studio
- Cog
- llama.cpp
- koboldcpp
- exllama
- text-generation-webui
- stable-diffusion-webui
HYPERION 🪐
- (coming soon!)
Resources ✨
- FOSAI ▲ XYZ
- Machine Learning
- OpenAI Cookbook
- Pinecone Examples
- NVIDIA NeMo
- LangChain
- LlamaIndex
YouTube 📺
- Matthew Berman
- Nicholas Renotte
- Dave Ebbelaar
- James Briggs
- SentDex
- AI Jason
- IBM
Build 🏗️
- GitHub
- VSCode
- Jupyter
- Colab
- Hex
- Vercel
- Replicate
- Cerebrium
Compute ⚡
- RunPod
- VastAI
- Lambda
- watsonx
- SageMaker
- Azure
R&D 🧪
Bonus 🛸
- GitHub Projects
- Attention is All You Need
- Python Programming
Looking for all of the other cool technologies being developed in the space? Check out my GitHub Stars for tons of really interesting projects that are FOSS & FOSAI.
Awesome-LLM
The content below is from Awesome-LLM.
Base & Instruction-Finetuned LLMs
- LLaMA - A foundational, 65-billion-parameter large language model. See also: LLaMA.cpp, Lit-LLaMA.
- Alpaca - A model fine-tuned from the LLaMA 7B model on 52K instruction-following demonstrations. See also: Alpaca.cpp, Alpaca-LoRA.
- Flan-Alpaca - Instruction tuning from humans and machines.
- Baize - An open-source chat model trained with LoRA, using 100k dialogs generated by letting ChatGPT chat with itself.
- Cabrita - A Portuguese instruction-finetuned LLaMA.
- Vicuna - An open-source chatbot impressing GPT-4 with 90% ChatGPT quality.
- Llama-X - Open academic research on improving LLaMA to a SOTA LLM.
- Chinese-Vicuna - A Chinese instruction-following LLaMA-based model.
- GPTQ-for-LLaMA - 4-bit quantization of LLaMA using GPTQ.
- GPT4All - Demo, data, and code to train open-source, assistant-style large language models based on GPT-J and LLaMA.
- Koala - A dialogue model for academic research.
- BELLE - Be Everyone's Large Language model Engine.
- StackLLaMA - A hands-on guide to training LLaMA with RLHF.
- RedPajama - An open-source recipe to reproduce the LLaMA training dataset.
- Chimera - Latin Phoenix.
- WizardLM|WizardCoder - A family of instruction-following LLMs powered by Evol-Instruct: WizardLM, WizardCoder.
- CaMA - A Chinese-English bilingual LLaMA model.
- Orca - Microsoft's finetuned LLaMA model, reportedly matching GPT-3.5, trained on roughly 5M instruction examples generated with ChatGPT and GPT-4.
- BayLing - An English/Chinese LLM with advanced language alignment, showing superior capability in English/Chinese generation, instruction following, and multi-turn interaction.
- UltraLM - Large-scale, informative, and diverse multi-round chat models.
- Guanaco - A QLoRA-tuned LLaMA; a minimal LoRA finetuning sketch follows this list.
- BLOOM - The BigScience Large Open-science Open-access Multilingual Language Model. See also: BLOOM-LoRA.
- BLOOMZ&mT0 - A family of models capable of following human instructions in dozens of languages zero-shot.
- Phoenix
- T5 - Text-to-Text Transfer Transformer.
- T0 - Multitask Prompted Training Enables Zero-Shot Task Generalization.
- OPT - Open Pre-trained Transformer Language Models.
- UL2 - A unified framework for pretraining models that are universally effective across datasets and setups.
- GLM - A general language model pretrained with an autoregressive blank-filling objective; it can be finetuned on various natural language understanding and generation tasks.
- ChatGLM-6B - An open-source, Chinese-English bilingual dialogue language model based on the General Language Model (GLM) architecture, with 6.2 billion parameters.
- ChatGLM2-6B - An open bilingual chat LLM.
- RWKV - A parallelizable RNN with Transformer-level LLM performance.
- ChatRWKV - Like ChatGPT, but powered by the RWKV (100% RNN) language model.
- StableLM - Stability AI language models.
- YaLM - A GPT-like neural network for generating and processing text, freely usable by developers and researchers worldwide.
- GPT-Neo - An implementation of model- and data-parallel GPT-3-like models using the mesh-tensorflow library.
- GPT-J - A 6-billion-parameter, autoregressive text generation model trained on The Pile.
- Dolly - A cheap-to-build LLM that exhibits a surprising degree of the instruction-following capability seen in ChatGPT.
- Pythia - Interpreting autoregressive transformers across time and scale.
- Dolly 2.0 - The first open-source, instruction-following LLM fine-tuned on a human-generated instruction dataset licensed for research and commercial use.
- OpenFlamingo - An open-source reproduction of DeepMind's Flamingo model.
- Cerebras-GPT - A family of open, compute-efficient large language models.
- GALACTICA - Models trained on a large-scale scientific corpus.
- GALPACA - GALACTICA 30B fine-tuned on the Alpaca dataset.
- Palmyra - Palmyra Base was primarily pre-trained with English text.
- Camel - A state-of-the-art instruction-following large language model designed to deliver exceptional performance and versatility.
- h2oGPT
- PanGu-α - A 200B-parameter autoregressive pretrained Chinese language model developed by Huawei Noah's Ark Lab, the MindSpore team, and Peng Cheng Laboratory.
- MOSS - An open-source dialogue language model supporting Chinese-English bilingual conversation and a variety of plugins.
- Open-Assistant - A project meant to give everyone access to a great chat-based large language model.
- HuggingChat - Powered by Open Assistant's latest model and the Hugging Face Inference API.
- StarCoder - Hugging Face's LLM for code.
- MPT-7B - An open LLM for commercial use, by MosaicML.
- Falcon - TII's foundational large language model, with 40 billion parameters trained on one trillion tokens.
- XGen - Salesforce's open-source LLMs with 8k sequence length.
- baichuan-7B - An open-source, commercially usable large-scale pretrained language model developed by Baichuan Intelligence.
- Aquila - The Wudao Aquila language model: an open-source LLM with Chinese-English bilingual knowledge that supports a commercial license and meets Chinese data-compliance requirements.
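Several of the models above (Alpaca-LoRA, Baize, Guanaco) were produced with LoRA-style parameter-efficient finetuning, which trains small low-rank adapter matrices instead of the full weights. Below is a minimal sketch of that workflow using Hugging Face's PEFT library; the base model name, target modules, and hyperparameters are illustrative assumptions, not the settings any of these projects actually used.

```python
# Minimal LoRA finetuning setup with Hugging Face PEFT.
# The model name and hyperparameters below are illustrative placeholders.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")  # placeholder base model

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor applied to the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total weights

# ...train with a standard Trainer or training loop, then save only the
# small adapter weights rather than a full model checkpoint:
model.save_pretrained("./lora-adapter")
```

QLoRA (as used for Guanaco) follows the same pattern but loads the frozen base model in 4-bit precision first, so it fits in far less GPU memory.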
LLM Training Frameworks
- DeepSpeed - A deep learning optimization library that makes distributed training and inference easy, efficient, and effective (a minimal usage sketch follows this list).
- Megatron-DeepSpeed - The DeepSpeed version of NVIDIA's Megatron-LM, adding support for features such as MoE model training, curriculum learning, and 3D parallelism.
- FairScale - A PyTorch extension library for high-performance, large-scale training.
- Megatron-LM - Ongoing research training transformer models at scale.
- Colossal-AI - Making large AI models cheaper, faster, and more accessible.
- BMTrain - Efficient training for big models.
- Mesh TensorFlow - Model parallelism made easier.
- maxtext - A simple, performant, and scalable JAX LLM.
- Alpa - A system for training and serving large-scale neural networks.
- GPT-NeoX - An implementation of model-parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
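Despite different internals, most of these frameworks share the same usage shape: wrap an existing model and optimizer in the framework's engine, then drive the training loop through that engine. Here is a hedged sketch of the DeepSpeed version, assuming you already have a PyTorch `model` and `train_loader`; the ZeRO-2/fp16 config values are illustrative, not tuned.

```python
# Hedged DeepSpeed sketch: wrap an existing PyTorch model in a DeepSpeed
# engine. All config values here are illustrative, not tuned settings.
import deepspeed

ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},  # ZeRO-2: shard optimizer state and gradients
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-5}},
}

# `model` and `train_loader` are assumed to be defined elsewhere.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

for batch in train_loader:
    loss = engine(batch)   # forward pass (assumes the model returns a loss)
    engine.backward(loss)  # handles fp16 loss scaling and gradient sharding
    engine.step()          # optimizer step, LR schedule, and zeroing grads
```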
Tools for Deploying LLMs
- FastChat - A distributed multi-model LLM serving system with a web UI and OpenAI-compatible RESTful APIs.
- SkyPilot - Run LLMs and batch jobs on any cloud, with maximum cost savings, highest GPU availability, and managed execution, all through a simple interface.
- vLLM - A high-throughput and memory-efficient inference and serving engine for LLMs (a minimal usage sketch follows this list).
- Text Generation Inference - A Rust, Python, and gRPC server for text generation inference, used in production at HuggingFace to power the LLM api-inference widgets.
- Haystack - An open-source NLP framework that lets you use LLMs and transformer-based models from Hugging Face, OpenAI, and Cohere to interact with your own data.
- Sidekick - A data integration platform for LLMs.
- LangChain - Building applications with LLMs through composability.
- wechat-chatgpt - Use ChatGPT on WeChat via wechaty.
- promptfoo - Test your prompts: evaluate and compare LLM outputs, catch regressions, and improve prompt quality.
- Agenta - Easily build, version, evaluate, and deploy your LLM-powered apps.
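To make the "serving engine" idea concrete, here is a minimal offline-inference sketch with vLLM; the model choice and sampling parameters are placeholder assumptions. For production serving, vLLM also ships an OpenAI-compatible HTTP server.

```python
# Minimal vLLM offline-inference sketch. The model and sampling values
# are illustrative placeholders, not recommendations.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # small placeholder model
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["The key idea behind continuous batching is"], params)
for out in outputs:
    print(out.outputs[0].text)
```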
Tutorials About LLMs
- [Andrej Karpathy] State of GPT (video)
- [Hyung Won Chung] Instruction finetuning and RLHF lecture (Youtube)
- [Jason Wei] Scaling, emergence, and reasoning in large language models (slides)
- [Susan Zhang] Open Pretrained Transformers (Youtube)
- [Ameet Deshpande] How Does ChatGPT Work? (slides)
- [Yao Fu] Pretraining, Instruction Finetuning, Alignment, Specialization: On the Sources of Large Language Models' Abilities (Bilibili)
- [Hung-yi Lee] Dissecting How ChatGPT Works (Youtube)
- [Jay Mody] GPT in 60 Lines of NumPy (link)
- [ICML 2022] Welcome to the "Big Model" Era: Techniques and Systems to Train and Serve Bigger Models (link)
- [NeurIPS 2022] Foundational Robustness of Foundation Models (link)
- [Andrej Karpathy] Let's build GPT: from scratch, in code, spelled out. (video | code)
- [DAIR.AI] Prompt Engineering Guide (link)
- [邱锡鹏] Capability Analysis and Applications of Large Language Models (slides | video)
- [Philipp Schmid] Fine-tune FLAN-T5 XL/XXL using DeepSpeed & Hugging Face Transformers (link)
- [HuggingFace] Illustrating Reinforcement Learning from Human Feedback (RLHF) (link)
- [HuggingFace] What Makes a Dialog Agent Useful? (link)
- [张俊林] The Road to AGI: Technical Essentials of Large Language Models (LLMs) (link)
- [大师兄] ChatGPT/InstructGPT Explained in Detail (link)
- [HeptaAI] ChatGPT's Core: InstructGPT and PPO Reinforcement Learning from Instruction Feedback (link)
- [Yao Fu] How does GPT Obtain its Ability? Tracing Emergent Abilities of Language Models to their Sources (link)
- [Stephen Wolfram] What Is ChatGPT Doing … and Why Does It Work? (link)
- [Jingfeng Yang] Why did all of the public reproduction of GPT-3 fail? (link)
- [Hung-yi Lee] How ChatGPT Was (Probably) Made: The Socialization of GPT (video)
- [Keyvan Kambakhsh] A pure Rust implementation of a minimal Generative Pretrained Transformer (code)
Courses About LLMs
- [DeepLearning.AI] ChatGPT Prompt Engineering for Developers (homepage)
- [Princeton] Understanding Large Language Models (homepage)
- [OpenBMB] Open Course on Big Models (homepage)
- [Stanford] CS224N, Lecture 11: Prompting, Instruction Finetuning, and RLHF (slides)
- [Stanford] CS324: Large Language Models (homepage)
- [Stanford] CS25: Transformers United V2 (homepage)
- [Stanford Webinar] GPT-3 & Beyond (video)
- [李沐] A Close Reading of the InstructGPT Paper (Bilibili | Youtube)
- [陳縕儂] OpenAI InstructGPT: Learning from Human Feedback, the Predecessor of ChatGPT (Youtube)
- [李沐] HELM: A Holistic Evaluation of Language Models (Bilibili)
- [李沐] A Close Reading of the GPT, GPT-2, and GPT-3 Papers (Bilibili | Youtube)
- [Aston Zhang] The Chain of Thought Paper (Bilibili | Youtube)
- [MIT] Introduction to Data-Centric AI (homepage)
Opinions about LLMs
- A Stage Review of Instruction Tuning [2023-06-29] [Yao Fu]
- LLM Powered Autonomous Agents [2023-06-23] [Lilian]
- Why you should work on AI AGENTS! [2023-06-22] [Andrej Karpathy]
- Google "We Have No Moat, And Neither Does OpenAI" [2023-05-05]
- AI competition statement [2023-04-20] [petergabriel]
- My Worldview on Large Models [2023-04-23] [陆奇]
- Prompt Engineering [2023-03-15] [Lilian]
- Noam Chomsky: The False Promise of ChatGPT [2023-03-08] [Noam Chomsky]
- Is ChatGPT 175 Billion Parameters? Technical Analysis [2023-03-04] [Owen]
- Towards ChatGPT and Beyond [2023-02-20] [Zhihu] [欧泽彬]
- The Difficulties of Catching Up with ChatGPT, and Its Alternatives [2023-02-19] [李rumor]
- A Conversation with Megvii Research's Xiangyu Zhang: ChatGPT's Research Value May Be Even Greater [2023-02-16] [Zhihu] [旷视科技]
- Conjectures on Eight Technical Questions about ChatGPT [2023-02-15] [Zhihu] [张家俊]
- ChatGPT: Development History, Principles, Technical Architecture, and Industry Future [2023-02-15] [Zhihu] [陈巍谈芯]
- Twenty Observations on ChatGPT [2023-02-13] [Zhihu] [熊德意]
- ChatGPT: What I Saw, Heard, and Felt [2023-02-11] [Zhihu] [刘聪NLP]
- The Next Generation Of Large Language Models [2023-02-07] [Forbes]
- Large Language Model Training in 2023 [2023-02-03] [Cem Dilmegani]
- What Are Large Language Models Used For? [2023-01-26] [NVIDIA]
- Large Language Models: A New Moore's Law [2021-10-26] [Huggingface]
Other Awesome Lists
- LLMsPracticalGuide - A curated (and still actively updated) list of practical guide resources for LLMs.
- Awesome ChatGPT Prompts - A collection of prompt examples to be used with the ChatGPT model.
- awesome-chatgpt-prompts-zh - A Chinese collection of prompt examples to be used with the ChatGPT model.
- Awesome ChatGPT - A curated list of resources for ChatGPT and GPT-3 from OpenAI.
- Chain-of-Thoughts Papers - A trend started by "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models".
- Instruction-Tuning-Papers - A trend started by Natural-Instructions (ACL 2022), FLAN (ICLR 2022), and T0 (ICLR 2022).
- LLM Reading List - A paper & resource list of large language models.
- Reasoning using Language Models - A collection of papers and resources on reasoning with language models.
- Chain-of-Thought Hub - Measuring LLMs' reasoning performance.
- Awesome GPT - A curated list of awesome projects and resources related to GPT, ChatGPT, OpenAI, LLMs, and more.
- Awesome GPT-3 - A collection of demos and articles about the OpenAI GPT-3 API.
- Awesome LLM Human Preference Datasets - A collection of human preference datasets for LLM instruction tuning, RLHF, and evaluation.
- RWKV-howto - Possibly useful materials and tutorials for learning RWKV.
- ModelEditingPapers - A paper & resource list on model editing for large language models.
- Awesome LLM Security - A curation of awesome tools, documents, and projects about LLM security.
Other Useful Resources
- Arize-Phoenix - An open-source tool for ML observability that runs in your notebook environment. Monitor and fine-tune LLM, CV, and tabular models.
- Emergent Mind - The latest AI news, curated & explained by GPT-4.
- ShareGPT - Share your wildest ChatGPT conversations with one click.
- Major LLMs + Data Availability
- 500+ Best AI Tools
- Cohere Summarize Beta - A new endpoint for text summarization.
- chatgpt-wrapper - An open-source, unofficial Python API and CLI that lets you interact with ChatGPT.
- Open-evals - A framework extending OpenAI's Evals for different language models.
- Cursor - Write, edit, and chat about your code with a powerful AI.
- AutoGPT - An experimental open-source application showcasing the capabilities of the GPT-4 language model.
- OpenAGI - When LLM meets domain experts.
- HuggingGPT - Solving AI tasks with ChatGPT and its friends in Hugging Face.
- EasyEdit - An easy-to-use framework for editing large language models.
- chatgpt-shroud - A Chrome extension for OpenAI's ChatGPT that enhances user privacy by making it easy to hide and unhide chat history. Ideal for privacy during screen shares.
Other Papers
If you're interested in the field of LLMs, the milestone papers collected in Awesome-LLM are a helpful way to explore its history and state of the art. However, each direction of LLM research offers its own insights and contributions, which are essential to understanding the field as a whole. For detailed paper lists in various subfields, see the topics below (note that subfields may overlap):
- Analyses of different LLMs in different fields with respect to different abilities
- Hardware and software acceleration for LLM training and inference
- Using LLMs to do some really cool stuff
- Augmenting LLMs in different aspects, including faithfulness, expressiveness, and domain-specific knowledge
- Detecting LLM-generated text from text written by humans
- Aligning LLMs with human preferences
- Chain of thought (a series of intermediate reasoning steps) significantly improves the ability of large language models to perform complex reasoning; a prompt sketch follows this list.
- Large language models (LLMs) demonstrate an in-context learning (ICL) ability: learning from a few examples provided in the context.
- A good prompt is worth 1,000 words
- Finetuning a language model on a collection of tasks described via instructions
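The chain-of-thought and in-context-learning entries above describe prompting techniques rather than libraries, so a small illustration may help. This sketch contrasts a standard few-shot prompt with a chain-of-thought prompt, using the worked example from the original chain-of-thought paper; the strings are plain prompts you could send to any completion-style model.

```python
# Standard few-shot prompting (in-context learning): the exemplar shows
# only question/answer pairs, so the model tends to answer directly.
few_shot_prompt = """Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
Each can has 3 tennis balls. How many tennis balls does he have now?
A: The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought
6 more, how many apples do they have?
A:"""

# Chain-of-thought prompting: the same exemplar, but its answer spells out
# the intermediate reasoning steps, nudging the model to do the same.
cot_prompt = """Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis
balls. 5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought
6 more, how many apples do they have?
A:"""

print(cot_prompt)  # send either prompt to any text-completion LLM endpoint
```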