
Implemented projects spanning non-contextual word embeddings like Word2Vec, contextual word embeddings like ELMo, GPT, and BERT, and NLP tasks: sentence-level classification (sentiment analysis & toxic comment classification), token-level classification (POS tagging, NER tagging), and machine translation (MT)


khetansarvesh/NLP


$\color{cyan}{1.\ Representation\ Learning\ (pretraining) }$

We need to represent language mathematically, i.e. given a corpus, you need to convert it into numerical form. This mathematical representation is called an embedding, and the process is called representation learning. Why do this? Because computers understand only numbers, not text. We can do this in several ways:
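Whichever method you pick, the end product is the same: each token maps to an integer id and then to a vector. A toy sketch of that lookup, with random values standing in for the weights a real model (Word2Vec, BERT, etc.) would learn:

```python
import random

corpus = ["the cat sat", "the dog ran"]

# Build a vocabulary and assign each word an integer id.
vocab = sorted({w for sent in corpus for w in sent.split()})
word_to_id = {w: i for i, w in enumerate(vocab)}

dim = 4
random.seed(0)
# Each row of this matrix is one word's embedding. In a trained model
# these values are learned; random numbers here just show the mechanics.
embedding = [[random.random() for _ in range(dim)] for _ in vocab]

def embed(sentence):
    # Text -> ids -> vectors: the "numerical form" of the corpus.
    return [embedding[word_to_id[w]] for w in sentence.split()]

vectors = embed("the cat")
```

Real pipelines differ mainly in how the embedding matrix is produced (count-based, Word2Vec, contextual transformers), not in this lookup step.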

Since we have so many methods to convert a corpus into a numerical representation, which one should we use? It really depends on the data you are trying to convert:

  • If it is social media data, character embeddings may work better, since noisy spelling and slang break word-level vocabularies.
  • If your language is Chinese, word embeddings may work really badly: Chinese has no spaces between words, so identifying word boundaries is itself a hard problem. Character embeddings work really well here, and so do subword embeddings.
  • Languages like French and Arabic do have spaces, but what English expresses as two words ("so said") may be a single word in these languages, so word embeddings might not work well with them either.
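The granularity trade-off above can be illustrated with a toy tokenizer sketch (pure Python; the whitespace splitter is a stand-in for a real word tokenizer):

```python
def word_tokens(text):
    # Word-level: split on whitespace. This fails for languages like
    # Chinese, which do not delimit words with spaces.
    return text.split()

def char_tokens(text):
    # Character-level: always well defined, but sequences get long.
    return list(text)

english = "so said"
chinese = "你好世界"  # "hello world", written without spaces

print(word_tokens(english))   # ['so', 'said']
print(word_tokens(chinese))   # ['你好世界'] -- one giant "word"
print(char_tokens(chinese))   # ['你', '好', '世', '界']
```

Subword tokenizers (BPE, WordPiece) sit between these two extremes, which is why they are the default for modern LLMs.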

Finally, before you start building your own LLM from scratch, I would recommend reading this amazing blog from HF, which helps you understand how to find the right architecture and how big, scalable models are built!

$\color{cyan}{2.\ Downstream\ NLP\ (posttraining) }$

$\color{red}{2.A]\ Supervised\ Fine\ Tuning\ (SFT) }$

  • Non Generative (Classification) Tasks - Natural Language Understanding (NLU) Tasks

    • Sentence level classification Tasks
    • Token / word level classification tasks (also called sequence labeling / tagging tasks): a problem in which you classify each word/token in the corpus as something. It has been observed that a masked-language-model (MLM) based base model gives better results than a next-subword-prediction based base model on NLU tasks. There are numerous word-level classification tasks, some of which are
  • Generative Tasks / Natural Language Generation (NLG) Tasks / Text2Text Tasks / Sequence2Sequence (Seq2Seq or S2S) Tasks: a family of tasks in which we generate new sentences from input sentences. It has been observed that a next-subword-prediction based foundation model gives better results than an MLM-based foundation model on NLG tasks. These include tasks like

  • Instruction Tuning Task
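The MLM vs. next-subword-prediction distinction above comes down to how (input, target) training pairs are built from the same token sequence. A toy illustration (not any specific library's API):

```python
tokens = ["the", "cat", "sat", "on", "the", "mat"]

def causal_lm_pairs(toks):
    # Next-token objective (GPT-style): every prefix predicts the
    # following token, so the model only sees leftward context.
    return [(toks[:i], toks[i]) for i in range(1, len(toks))]

def mlm_pair(toks, mask_index):
    # Masked-LM objective (BERT-style): hide one token and predict it
    # from the full bidirectional context -- handy for tagging tasks.
    masked = toks[:mask_index] + ["[MASK]"] + toks[mask_index + 1:]
    return masked, toks[mask_index]

print(causal_lm_pairs(tokens)[1])  # (['the', 'cat'], 'sat')
print(mlm_pair(tokens, 2))  # (['the', 'cat', '[MASK]', 'on', 'the', 'mat'], 'sat')
```

Bidirectional context is why MLM-based models tend to do better on token-level NLU, while left-to-right models are the natural fit for generation.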

$\color{red}{2.B]\ Reinforcement\ Learning\ Based\ Fine\ Tuning}$

Above we saw how to fine-tune a foundation LLM for different downstream tasks using SFT; for all those tasks we can also fine-tune a foundation model using RL. There are two ways to use RL for fine-tuning:

  • Manual Reward Function Based RL : Here we will see how to fine-tune using RL if you 'can design' a good reward function for your downstream task.
  • (preferred) Automatic Reward Function Based RL : Here we will see how to fine-tune using RL if you 'cannot design' a good reward function for your downstream task, with the help of a preference dataset. There are multiple methods to do this :
    • Reinforcement Learning from Human Feedback (RLHF) : trains a separate reward model
    • (preferred) Direct Preference Optimization (DPO) : uses the base LLM itself as the reward model
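DPO's "the LLM is its own reward model" idea is visible in its loss: only log-probabilities from the policy and a frozen reference model are needed. A minimal sketch of the per-example loss from the DPO paper:

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    # All inputs are log-probabilities of whole responses.
    # The policy is rewarded for raising its chosen-vs-rejected log-ratio
    # relative to the frozen reference -- no separate reward model.
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    # loss = -log sigmoid(beta * margin)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Policy prefers the chosen answer more strongly than the reference does,
# so the loss is low; flipping the preference raises it.
good = dpo_loss(pi_chosen=-1.0, pi_rejected=-5.0, ref_chosen=-2.0, ref_rejected=-2.0)
bad = dpo_loss(pi_chosen=-5.0, pi_rejected=-1.0, ref_chosen=-2.0, ref_rejected=-2.0)
```

In practice you would batch this over a preference dataset and backpropagate through the policy's log-probs; `beta` controls how far the policy may drift from the reference.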

Important

It has been shown that it is better to first fine-tune an LLM on a task using SFT and then fine-tune on the same task using RL; this gives better outcomes. This notebook from Unsloth follows this recipe to convert Qwen3 from a non-reasoning model into a reasoning model.

$\color{cyan}{3.\ Non\ Agentic\ LLM\ Systems\ --LLMs\ without\ tool\ access}$

  • Since the LLM doesn't have tool access, it is not capable of fetching data 'on its own' to answer the query.
  • Hence, to answer queries, it has two options:
    • A. Use its own internal knowledge from training (this can lead to wrong answers, because the data it was trained on may now be outdated).
    • B. Use the context that the 'user provides' to the LLM. If this context is huge, then how your LLM goes through it becomes really important; there are many ways, more details available here.
  • Once you have the right context, using efficient prompting techniques also becomes really important.
    • Use dynamic few shots
    • Use dynamic prompts
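"Dynamic few shots" can be sketched as retrieving the stored examples most similar to the incoming query instead of hard-coding one fixed shot list. Token-overlap (Jaccard) similarity below is a cheap stand-in for embedding similarity; all names are illustrative:

```python
def overlap(a, b):
    # Jaccard similarity over lowercase word sets -- a stand-in for
    # cosine similarity between sentence embeddings.
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def dynamic_few_shots(query, example_bank, k=2):
    # Pick the k labeled examples closest to the query; these get
    # spliced into the prompt ahead of the query itself.
    ranked = sorted(example_bank,
                    key=lambda ex: overlap(query, ex["text"]),
                    reverse=True)
    return ranked[:k]

bank = [
    {"text": "the movie was great", "label": "positive"},
    {"text": "terrible customer service", "label": "negative"},
    {"text": "the movie was boring", "label": "negative"},
]
shots = dynamic_few_shots("was the movie great or boring", bank, k=2)
```

The same retrieve-then-fill pattern underlies dynamic prompts: the template stays fixed while the retrieved pieces change per query.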

$\color{cyan}{4.\ LLM\ Agents\ --LLMs\ with\ tool\ access}$

Now that we have given our LLM access to tools, it can use these tools to get real-time data. Tools can be classified into the following broad categories:

  • API tools (Structured Data) : simple wrappers over existing APIs, e.g. Twitter tools that return data in JSON format.
  • Web Tools (Unstructured Data) : plenty of websites on the internet do not provide an API to access their information, so to get information from such a website we need to go to its URL and extract the data somehow.
    • Web Search Tool (for Static Web Pages) : getting data from such pages is simple; it is like downloading the page and using it.
    • Web Agent Tool (for Dynamic Web Pages) : getting data from such pages is difficult, and the download-and-use method will miss information (to understand this better, read this blog). Hence people built web agents that can navigate the web just like humans do.
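Whatever the category, tool use reduces to the same mechanical pattern: the LLM emits a tool name plus arguments, and a dispatcher runs the matching function. A minimal sketch with stub tools (real frameworks add argument schemas, validation, and error handling):

```python
import json

# Stub tools -- stand-ins for a real API wrapper and a web fetcher.
def twitter_api(query):
    return {"source": "api", "results": [f"tweet about {query}"]}

def web_search(query):
    return {"source": "web", "results": [f"page about {query}"]}

TOOLS = {"twitter_api": twitter_api, "web_search": web_search}

def dispatch(tool_call_json):
    # The LLM's tool call arrives as JSON: {"tool": ..., "args": {...}};
    # we look up the function by name and invoke it with the args.
    call = json.loads(tool_call_json)
    return TOOLS[call["tool"]](**call["args"])

result = dispatch('{"tool": "web_search", "args": {"query": "llm agents"}}')
```

The frameworks listed below mostly differ in how they describe `TOOLS` to the model and how they loop the result back into the conversation.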

There are many frameworks you can use to build these LLM systems; a few good ones are DSPy || AutoGen || LangGraph || CrewAI

$\color{red}{4.A]\ Single\ Agentic\ System}$

A single-agent system does not mean that you make one single LLM call; it just means that you have one single agent, but that agent can be called multiple times.
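That "one agent, many calls" loop can be sketched as a simple ReAct-style while loop. The `fake_llm` below is a stub standing in for a real model call, and `lookup_population` is a hypothetical tool:

```python
def fake_llm(history):
    # Stub model: asks for a tool on the first call, then answers.
    if not any(msg.startswith("observation:") for msg in history):
        return "tool: lookup_population paris"
    return "final: Paris has about 2.1 million residents"

def lookup_population(city):
    return {"paris": "2.1 million"}[city]

TOOLS = {"lookup_population": lookup_population}

def run_agent(question, max_steps=5):
    # One agent, called repeatedly: each tool observation is appended to
    # the history until the model emits a final answer.
    history = [f"question: {question}"]
    for _ in range(max_steps):
        action = fake_llm(history)
        if action.startswith("final:"):
            return action.removeprefix("final:").strip()
        _, name, arg = action.split()
        history.append(f"observation: {TOOLS[name](arg)}")
    raise RuntimeError("agent did not finish")

answer = run_agent("How many people live in Paris?")
```

Here the single agent made two LLM calls (one tool call, one answer), which is exactly the distinction the paragraph above draws.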

$\color{red}{4.B]\ Multi\ Agentic\ System}$

When should you use a multi-agent system? When, even after applying all the context-optimization techniques we saw for the single-agent system, you still face the context-rot issue. In such situations, the idea is to use multiple agents, each with its own context and its own specific tools.
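A minimal sketch of that split: a router sends each query to a specialist agent that keeps its own context and tool set. All names here are hypothetical stubs; real systems often use an LLM as the router:

```python
class Agent:
    def __init__(self, name, tools):
        self.name = name
        self.tools = tools   # only this agent's tools
        self.context = []    # each agent accumulates its own context

    def handle(self, query):
        self.context.append(query)
        return f"{self.name} answered using {sorted(self.tools)}"

AGENTS = {
    "code": Agent("code_agent", {"run_tests", "read_repo"}),
    "research": Agent("research_agent", {"web_search"}),
}

def route(query):
    # Crude keyword router; the point is that each specialist's context
    # stays small because it only ever sees its own queries.
    key = "code" if "bug" in query or "test" in query else "research"
    return AGENTS[key].handle(query)

reply = route("fix the failing test in module X")
```

Because the research agent never saw the coding query, its context stays empty -- the isolation that mitigates context rot.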
