The MLnotes Newsletter

How to add new tokens to a transformer model vocabulary

Mehdi Allahyari
Jun 15, 2022

In this post, we will see how to expand the vocabulary of a transformer model by adding your own words or tokens.

Why do you need to expand the vocabulary?

Every language model trained for an NLP task has a fixed vocabulary. The vocabulary is the set of unique tokens (whole words or subword pieces) built from the text corpus the model was trained on. Therefore, dep…
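To make the idea of a fixed vocabulary concrete, here is a minimal sketch in plain Python. It is a toy illustration, not the method used later in this post: the vocabulary, the helper names `encode` and `add_tokens`, the example sentence, and the embedding size are all made up for this example. It also assumes whitespace tokenization with an `[UNK]` fallback, whereas real transformer tokenizers usually split unknown words into subword pieces. In the Hugging Face `transformers` library, the corresponding steps are `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`.

```python
import random

UNK = "[UNK]"  # placeholder id for words the vocabulary does not contain


def encode(text, vocab):
    """Map whitespace-separated words to ids; unknown words fall back to UNK."""
    return [vocab.get(word, vocab[UNK]) for word in text.lower().split()]


def add_tokens(new_tokens, vocab, embeddings, seed=0):
    """Append unseen tokens to vocab and grow the embedding table to match.

    Tokens already in the vocabulary are skipped.
    Returns the number of tokens actually added.
    """
    rng = random.Random(seed)
    dim = len(embeddings[0])
    added = 0
    for token in new_tokens:
        if token in vocab:
            continue
        vocab[token] = len(vocab)  # next free id
        # New rows start as small random vectors; they carry no meaning
        # until the model is fine-tuned on text containing the new token.
        embeddings.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
        added += 1
    return added


# Hypothetical toy vocabulary and a 4-dimensional embedding table.
vocab = {UNK: 0, "the": 1, "patient": 2, "has": 3}
embeddings = [[0.0] * 4 for _ in vocab]

before = encode("the patient has covid", vocab)  # "covid" maps to UNK
num_added = add_tokens(["covid", "the"], vocab, embeddings)
after = encode("the patient has covid", vocab)  # "covid" now has its own id

print(before, after, num_added, len(embeddings))
```

The point of the sketch is that adding a token is a two-part change: the tokenizer's mapping gains an entry, and the embedding table gains a matching row. If only the first part happens, the new id points past the end of the table.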
