Building Knowledge Graphs: Traditional NER vs. LLMs
Knowledge graphs have become increasingly important for structuring information and enabling advanced querying and reasoning capabilities. But what's the best way to construct them? In this post, we'll compare traditional named entity recognition (NER) approaches with newer large language model (LLM) methods for knowledge graph construction.
Before we dive in: as always, valued readers, if there's anything on your mind that we might be able to help with, fill out the form below! 👇
The Basics of Knowledge Graphs
A knowledge graph is a network of entities and the relationships between them. It turns unstructured text into structured, machine-readable information that supports reasoning and querying. Knowledge graphs power many applications, including search engines, recommendation systems, and question answering.
Constructing a knowledge graph typically involves several steps:
Processing and cleaning text data
Extracting entities from the text
Identifying relationships between entities
Building the graph structure
Curating and refining the graph
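To make the steps above concrete, here is a deliberately toy end-to-end sketch in plain Python. The regex "entity extractor" and the co-occurrence "relations" are naive stand-ins for real models, and the example sentence is our own; a production pipeline would swap in proper NER and relation extraction at those steps.

```python
import re
from itertools import combinations

def clean(text):
    """Step 1: normalize whitespace in the raw text."""
    return re.sub(r"\s+", " ", text).strip()

def extract_entities(text):
    """Step 2: toy extractor -- runs of capitalized words.
    A real pipeline would use an NER model here."""
    return re.findall(r"[A-Z][a-z]+(?:\s[A-Z][a-z]+)*", text)

def build_kg(text):
    """Steps 3-5: naive 'relations' via sentence co-occurrence,
    stored as an adjacency dict keyed by entity."""
    graph = {}
    for sent in re.split(r"(?<=[.!?])\s+", clean(text)):
        ents = list(dict.fromkeys(extract_entities(sent)))  # dedupe, keep order
        for head, tail in combinations(ents, 2):
            graph.setdefault(head, []).append(("co_occurs_with", tail))
    return graph

kg = build_kg("Marie Curie worked with Pierre Curie in Paris.")
```

Even this crude version yields a queryable structure: `kg["Marie Curie"]` lists every entity she co-occurs with, and the curation step (step 5) would then prune or relabel those edges.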
Traditional NER Approaches
Traditional NER techniques have been around for decades and represent a mature research area. They include:
Rule-based approaches
Machine learning models like conditional random fields
Deep learning models like BiLSTMs
Pre-trained models and pipelines, such as spaCy's statistical models and BERT-based taggers
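As a rough illustration of the rule-based end of this spectrum, here is a minimal gazetteer matcher in plain Python. The lexicon entries are made-up examples; in practice the gazetteer would come from a curated terminology resource such as a medical or legal ontology.

```python
import re

# Hypothetical domain lexicon (illustrative entries, not a real dataset).
GAZETTEER = {
    "Mayo Clinic": "ORG",
    "ibuprofen": "DRUG",
    "aspirin": "DRUG",
}

def rule_based_ner(text):
    """Return (surface, label, start) hits, matching longest terms first
    so multi-word entries win over shorter overlapping ones."""
    hits, claimed = [], set()
    for term in sorted(GAZETTEER, key=len, reverse=True):
        for m in re.finditer(rf"\b{re.escape(term)}\b", text):
            span = range(m.start(), m.end())
            if claimed.isdisjoint(span):  # skip offsets already matched
                claimed.update(span)
                hits.append((term, GAZETTEER[term], m.start()))
    return sorted(hits, key=lambda h: h[2])
```

This is why rule-based systems score so well on precision and interpretability: every hit traces back to an explicit lexicon entry, but anything outside the lexicon is silently missed.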
Advantages:
Very precise, especially for well-defined domains
Transparent and interpretable
Computationally efficient
Disadvantages:
Less scalable across diverse datasets
Harder to adapt to new contexts
Require ongoing maintenance and retraining
LLM-Based Approaches
With the rise of large language models, many practitioners now use prompting techniques to extract entities and relationships directly from text.
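As a sketch of what such a prompting setup might look like: the template and JSON schema below are our own assumptions (not a standard), and the actual model call is omitted, so only the prompt construction and response parsing are shown.

```python
import json
import re

# Hypothetical prompt template; the JSON schema is our own choice.
PROMPT = """Extract entities and relationships from the text below.
Respond with JSON: {{"entities": [{{"name": ..., "type": ...}}],
"relations": [{{"head": ..., "relation": ..., "tail": ...}}]}}

Text: {text}"""

def build_prompt(text):
    return PROMPT.format(text=text)

def parse_llm_output(raw):
    """Pull the first JSON object out of the model's reply (models often
    wrap it in code fences or commentary), then read the expected keys."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    data = json.loads(match.group())
    return data["entities"], data["relations"]
```

The parsing step matters in practice: unlike a classical NER model, an LLM's output format is only as reliable as the prompt, so defensive extraction and schema validation are part of the cost of this approach.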
Advantages:
General and adaptable across domains
Strong contextual understanding
Quick to set up with minimal fine-tuning
Disadvantages:
Resource intensive and computationally expensive
Less transparent ("black box")
Dependent on training data quality/coverage
Choosing the Right Approach
Traditional NER approaches tend to work best for:
Domain-specific applications (medical, legal, financial)
Use cases requiring very high precision
Situations where errors are costly
LLM-based approaches are often a better fit for:
…
A Hybrid Approach
For many real-world applications, a hybrid approach combining traditional NER with LLMs may be ideal. In our sample implementation:
…
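The sample implementation itself is elided here, but as a rough, hypothetical sketch of how two passes could be combined (the function names and the injected `llm_extract` callable are illustrative assumptions, not the implementation discussed above):

```python
def hybrid_extract(text, gazetteer, llm_extract):
    """Pass 1: high-precision gazetteer rules claim what they can.
    Pass 2: an injected LLM extractor (any callable returning
    (name, label) pairs) handles whatever the rules missed."""
    found, rest = [], text
    for term, label in gazetteer.items():
        if term in rest:
            found.append((term, label, "rules"))
            rest = rest.replace(term, " ")  # don't re-extract via the LLM
    for name, label in llm_extract(rest):
        found.append((name, label, "llm"))
    return found
```

Tagging each entity with its source ("rules" vs. "llm") is a cheap way to keep the precision/recall tradeoff visible downstream, e.g. when curating the graph.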
Curious to delve deeper into this?
Join Professor Mehdi and me for a deep-dive discussion about:
which approach to choose,
how to use a hybrid approach, and
a step-by-step code walkthrough of a simple KG construction, in the video below! 👇
Knowledge graphs are a powerful way to structure information, but choosing the right construction approach is key. By understanding the tradeoffs between traditional NER and LLM-based methods, you can build more effective, efficient, and reliable knowledge graph systems.
Stay tuned as we continue exploring the development of knowledge-augmented AI systems to extract maximum value from unstructured data sources!
🛠️✨ Happy practicing and happy building! 🚀🌟
Thanks for reading our newsletter. You can follow us here: Angelina (LinkedIn or Twitter) and Mehdi (LinkedIn or Twitter).