Data Science Interview Challenge
Welcome to today's data science interview challenge! Today’s challenge is inspired by Jeremy Howard’s recent talk about LLMs. The questions are open-ended and conversational, so relax!
Here you go:
Question 1: What LLM leaderboard(s) do you use to compare performance, or choose models?
Question 2: What’s the modern way of doing language model fine-tuning?
Here are some tips for readers' reference:
Question 1:
There are various benchmarks and leaderboards available in the field of LLMs. The purpose of this question is to start a discussion about how you use LLMs in your work or research. Of course, not all LLM engineers or data scientists rely on leaderboards, but it is good to understand what options are available for your specific use cases.
If you are interested in learning more about selecting a base model, you can refer to one of our previous posts on this topic.
How to Choose Base Model for Your LLM Application 🧐?
The field of Large Language Models (LLMs) is flourishing, with numerous models evolving day by day. If you want to develop an LLM application for production, which model should you choose? Should you prioritize the best-performing model on the market? GPT-4 undoubtedly stands out as a top contender.
The Hugging Face Open LLM Leaderboard is no doubt one of the most popular ones out there.
📐 The 🤗 Open LLM Leaderboard aims to track, rank and evaluate open LLMs and chatbots.
🤗 Submit a model for automated evaluation on the 🤗 GPU cluster on the "Submit" page! The leaderboard's backend runs the great Eleuther AI Language Model Evaluation Harness - read more details in the "About" page!
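If you are curious how those leaderboard numbers get produced, you can run the same Eleuther AI Language Model Evaluation Harness locally. Below is a minimal sketch, assuming a recent `lm-eval` release (v0.4+) that exposes `simple_evaluate`; the model name and task list are just illustrative placeholders, and exact argument names may differ in your installed version.

```python
# Minimal sketch: scoring a small open model with the Eleuther AI LM Evaluation Harness.
# Assumes `pip install lm-eval` (v0.4+); model and tasks below are illustrative only.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                      # Hugging Face transformers backend
    model_args="pretrained=EleutherAI/pythia-160m",  # small model for a quick smoke test
    tasks=["hellaswag", "arc_easy"],                 # benchmark tasks to score
    num_fewshot=0,                                   # zero-shot evaluation
    batch_size=8,
)

# `results["results"]` maps each task to its metrics (accuracy, normalized accuracy, ...).
for task, metrics in results["results"].items():
    print(task, metrics)
```

Running the harness yourself on a handful of candidate models is often more informative than leaderboard rank alone, since you can pick tasks that resemble your actual use case.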
Jeremy recommended another one called FastEval. It includes measurements of CoT (chain-of-thought) reasoning capabilities, which means it “uses a set of questions (depending on the task) and prompts the model to first explain its reasoning step-by-step and then output the answer. The reasoning itself is currently ignored and only the final answer is checked for correctness. For another leaderboard that focuses more on this, see here.”
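To make that CoT protocol concrete, here is a small, hypothetical sketch of the idea (not FastEval’s actual code): prompt the model to reason step by step, ignore the reasoning, and grade only the final answer. The `generate` callable stands in for whatever LLM call you happen to use.

```python
import re

def cot_prompt(question: str) -> str:
    # Ask the model to reason step by step, then emit a clearly marked final answer.
    return (
        f"Question: {question}\n"
        "Explain your reasoning step by step, then finish with a line "
        "of the form 'Final answer: <answer>'."
    )

def grade(question: str, gold_answer: str, generate) -> bool:
    """Score one CoT question: the reasoning is ignored, only the final answer is checked."""
    completion = generate(cot_prompt(question))  # `generate` is your LLM call (hypothetical)
    match = re.search(r"Final answer:\s*(.+)", completion, flags=re.IGNORECASE)
    if not match:
        return False  # no parseable final answer counts as incorrect
    prediction = match.group(1).strip().rstrip(".").lower()
    return prediction == gold_answer.strip().lower()

# Toy usage: note the garbled reasoning is ignored, only "Final answer: 8" is graded.
fake_generate = lambda prompt: "Some messy reasoning here... Final answer: 8"
print(grade("Alice has 2 bags of 4 apples. How many apples?", "8", fake_generate))  # True
```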
Question 2:
The modern way of doing language model fine-tuning is using something called