The MLnotes Newsletter

Data Science Interview Challenge

Angelina Yang
Sep 21, 2023

Welcome to today's data science interview challenge! Today’s challenge is inspired by a Huggingface Transformer Lecture (2022 version) at Stanford! Relax!

A warm-up question 🤓:

See if you can tell me (without writing it down) what the code looks like to create a torch.tensor with the following contents:

Now tell me what the code looks like to compute the average of each row and of each column (using .mean()). What are the shapes of the results?

I usually don’t ask live-coding questions, but this one is straightforward, and you should be able to talk through it as you think. Have fun!

Now back to the basics:

Question 1: What does the tokenizer do for a language model?

Question 2: BERT was a groundbreaking model in the development of large language models. What does it attend to? How would you explain the model and its attention mechanism?

Source: the BERT paper.

Here are some tips for readers' reference:

Warm-up Question:

Is the following what you are envisioning?
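The exact contents aren’t reproduced here, so take this as a minimal sketch assuming a small 2×3 example tensor; the key point is that .mean(dim=1) averages across columns (one value per row) and .mean(dim=0) averages across rows (one value per column):

```python
import torch

# A small example tensor standing in for the contents shown in the lecture.
x = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

row_means = x.mean(dim=1)  # average of each row    -> shape (2,)
col_means = x.mean(dim=0)  # average of each column -> shape (3,)

print(row_means)  # tensor([2., 5.])
print(col_means)  # tensor([2.5000, 3.5000, 4.5000])
```

In general, for an (m, n) tensor the row means have shape (m,) and the column means have shape (n,); the dim you pass is the dimension that gets reduced away.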

Question 1:

Pretrained models are implemented along with tokenizers that are used to preprocess their inputs. The tokenizers take raw strings or lists of strings and output what are effectively dictionaries containing the model inputs.
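To make that concrete, here’s a minimal sketch using the transformers library (the bert-base-uncased checkpoint is just an illustrative choice):

```python
from transformers import AutoTokenizer

# Illustrative checkpoint; any pretrained model name works the same way.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Raw strings in, a dict-like batch of tensors out.
batch = tokenizer(
    ["Hello world!", "Tokenizers turn text into model inputs."],
    padding=True,
    return_tensors="pt",
)

print(batch.keys())        # e.g. input_ids, token_type_ids, attention_mask
print(batch["input_ids"])  # integer token IDs, padded to the same length
```

The model can then be called directly on that dictionary, e.g. model(**batch).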

Check the lecturer’s explanation below! (To jump to the answer, skip to roughly the 3-minute mark of the lecture.)
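Question 2:

One concrete talking point: each BERT layer computes scaled dot-product attention over the whole sequence, letting every token look at every other token (left and right context). Below is a minimal PyTorch sketch of that computation, with illustrative shapes and names (not code from the lecture):

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    weights = torch.softmax(scores, dim=-1)  # each row sums to 1
    return weights @ v, weights

# Toy sizes: batch of 1, sequence of 4 tokens, head dimension 8.
q = torch.randn(1, 4, 8)
k = torch.randn(1, 4, 8)
v = torch.randn(1, 4, 8)

out, attn = scaled_dot_product_attention(q, k, v)
print(out.shape)   # torch.Size([1, 4, 8])
print(attn.shape)  # torch.Size([1, 4, 4]) -- one weight per (query, key) pair
```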
