NER (Named Entity Recognition)

What is NER?

Named Entity Recognition (NER) is a natural language processing (NLP) technique that identifies named entities in text and classifies them into predefined categories such as person names, organizations, locations, dates, and monetary values. It gives software a structured view of who and what a piece of text mentions.

NER extracts named entities from unstructured text, turning words into structured facts. A model that distinguishes “Apple” the organization from “apple” the fruit is doing exactly this. The capability is essential for building knowledge graphs and intelligent search.

NER Categories

Category	Examples
PERSON	“Elon Musk”, “Sundar Pichai”
ORG	“Google”, “Reliance Industries”
LOC	“Mumbai”, “Silicon Valley”
DATE	“January 15, 2024”, “Q3 2024”
MONEY	“$1.2 billion”, “₹50,000 crore”
PRODUCT	“iPhone 15”, “ChatGPT”

NER Implementation

1. Using spaCy (Production-Ready)

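import spacy

# Requires the model to be installed first:
#   python -m spacy download en_core_web_lg
nlp = spacy.load("en_core_web_lg")

# Adjacent string literals avoid embedding a newline and indentation in the text
text = (
    "Google CEO Sundar Pichai announced "
    "a $1 billion investment in India."
)

doc = nlp(text)

# Print each detected entity span with its predicted label
for ent in doc.ents:
    print(f"{ent.text:20} → {ent.label_}")

# Typical output (exact spans and labels can vary by model version):
# Google               → ORG
# Sundar Pichai        → PERSON
# $1 billion           → MONEY
# India                → GPE   (GPE = geopolitical entity: countries, cities, states)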
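
Label sets differ between models: the spaCy output above tags "India" as GPE, which is not in the category table, and CoNLL-style models such as dslim/bert-base-NER use PER rather than PERSON. The sketch below shows one way to normalize labels to the table's categories. The mapping and the `normalize_label` helper are illustrative assumptions, not part of spaCy or Transformers.

```python
# Hypothetical mapping from model-specific labels to the table's categories.
LABEL_MAP = {
    "PER": "PERSON",  # CoNLL-style models (e.g. dslim/bert-base-NER)
    "GPE": "LOC",     # spaCy: countries, cities, states
    "FAC": "LOC",     # spaCy: buildings, airports, highways (a judgment call)
}

def normalize_label(label: str) -> str:
    """Return the table category for a model label; unknown labels pass through."""
    return LABEL_MAP.get(label, label)

# (text, label) pairs as they might come out of either library
entities = [("Google", "ORG"), ("Sundar Pichai", "PERSON"), ("India", "GPE")]
normalized = [(text, normalize_label(label)) for text, label in entities]
print(normalized)
```

Keeping the mapping in one dictionary makes it easy to adjust when switching models or adding categories.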

2. Using Hugging Face Transformers (State-of-the-Art)

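from transformers import pipeline

# aggregation_strategy="simple" merges subword pieces into whole entities;
# without it the pipeline returns one entry per token (e.g. "Cup", "##ert", "##ino")
ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")

text = "Apple Inc. is headquartered in Cupertino, California."
entities = ner(text)

# Aggregated results expose the label under "entity_group"
for e in entities:
    print(f"{e['word']:15} → {e['entity_group']} ({e['score']:.2f})")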

Custom NER for Business Data

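# Fine-tuned NER for product mentions
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_name = "your-org/product-ner-v1"  # placeholder: replace with your fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# Minimal token-level decoder (a sketch): keep every token whose top label is not "O"
def decode_entities(outputs, inputs):
    label_ids = outputs.logits.argmax(dim=-1)[0].tolist()
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    labels = [model.config.id2label[i] for i in label_ids]
    return [(tok, lab) for tok, lab in zip(tokens, labels) if lab != "O"]

# Extract product mentions from reviews
def extract_products(review_text):
    inputs = tokenizer(review_text, return_tensors="pt", truncation=True)
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)
    return decode_entities(outputs, inputs)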
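
Token-classification models usually predict one tag per token in the BIO scheme (B- begins an entity, I- continues it, O is outside any entity), so per-token predictions still need to be merged into entity spans. The sketch below shows one way to do that grouping in plain Python. The `group_bio` helper and the `PRODUCT` tag names are assumptions for illustration, and the example works on whole-word tokens; subword pieces would need to be joined first.

```python
def group_bio(tokens, tags):
    """Merge per-token BIO tags into (entity_text, label) spans."""
    spans, current, label = [], [], None
    for token, tag in zip(tokens, tags):
        prefix, _, kind = tag.partition("-")
        if prefix == "B" or (prefix == "I" and kind != label):
            # A new entity starts (a stray I- tag is treated like B-)
            if current:
                spans.append((" ".join(current), label))
            current, label = [token], kind
        elif prefix == "I":
            # Continue the entity that is currently open
            current.append(token)
        else:
            # "O": close any open entity
            if current:
                spans.append((" ".join(current), label))
            current, label = [], None
    if current:
        spans.append((" ".join(current), label))
    return spans

tokens = ["The", "iPhone", "15", "beats", "my", "old", "Pixel"]
tags = ["O", "B-PRODUCT", "I-PRODUCT", "O", "O", "O", "B-PRODUCT"]
print(group_bio(tokens, tags))
# [('iPhone 15', 'PRODUCT'), ('Pixel', 'PRODUCT')]
```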

Use cases for scraped data:

Extract company names from news articles
Identify locations for geographic analysis
Find product mentions in social media
Build databases of people, companies, and connections
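
As a rough illustration of the last use case, the sketch below aggregates entities extracted from several documents into per-entity document counts and same-document co-occurrence counts, which can serve as nodes and edges of a simple knowledge graph. The `build_entity_index` helper and the sample data are hypothetical; the input is assumed to be (text, label) pairs from any NER step.

```python
from collections import Counter
from itertools import combinations

def build_entity_index(docs):
    """Count documents per entity and same-document co-occurrences.

    `docs` is a list of documents, each a list of (text, label) pairs.
    """
    doc_counts, links = Counter(), Counter()
    for entities in docs:
        # Deduplicate and sort so each pair has one canonical ordering
        unique = sorted(set(entities))
        doc_counts.update(unique)
        links.update(combinations(unique, 2))
    return doc_counts, links

docs = [
    [("Google", "ORG"), ("Sundar Pichai", "PERSON"), ("India", "LOC")],
    [("Google", "ORG"), ("India", "LOC")],
]
doc_counts, links = build_entity_index(docs)
print(doc_counts[("Google", "ORG")])                  # 2
print(links[(("Google", "ORG"), ("India", "LOC"))])   # 2
```

Co-occurrence within a document is only a weak signal of a real relationship, so in practice such counts are usually a starting point for further filtering.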
