Tagging Use Cases (TAG)
Module Purpose: Extract and manage tags for documents using NER, keyword extraction, and custom taxonomies. This module contains 8 use cases.
Use Case Quick Reference
UC-TAG-001: Extract Keywords
Overview
| Field | Value |
|-------|-------|
| ID | TAG-001 |
| Title | Extract Keywords |
| Actor | Tagging Service |
| Priority | P1 (MVP Phase 3) |
Description
Extract important keywords and phrases from document text using statistical methods.
Methods
| Method | Tool | Best For |
|--------|------|----------|
| TF-IDF | scikit-learn | Corpus-level importance |
| YAKE | yake | Unsupervised, fast |
| KeyBERT | keybert | Semantic keywords |
Steps
- Retrieve extracted text
- Run keyword extraction algorithm
- Filter by score threshold
- Deduplicate similar keywords
- Return top N keywords
Output
{
  "document_id": "doc_abc",
  "keywords": [
    {"keyword": "payment terms", "score": 0.85},
    {"keyword": "invoice", "score": 0.82},
    {"keyword": "net 30", "score": 0.78},
    {"keyword": "accounts payable", "score": 0.72}
  ]
}
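The steps above can be sketched roughly as follows. This is a dependency-free illustration, not the service's actual implementation: the TF-IDF scoring is hand-rolled, and `extract_keywords`, `tokenize`, `STOPWORDS`, the default threshold, and the plural-stripping deduplication are all assumptions made for the example. In practice the scoring step would likely be delegated to one of the tools in the Methods table.

```python
import math
import re
from collections import Counter

# Illustrative stopword list only; a real deployment would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "within"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def extract_keywords(document_id, text, corpus, threshold=0.1, top_n=5):
    """Score terms with TF-IDF, then filter, deduplicate, and keep the top N."""
    tokens = tokenize(text)
    if not tokens:
        return {"document_id": document_id, "keywords": []}

    tf = Counter(tokens)
    docs = [set(tokenize(d)) for d in corpus]
    n_docs = len(docs)

    scores = {}
    for term, count in tf.items():
        df = sum(1 for d in docs if term in d)
        idf = math.log((1 + n_docs) / (1 + df)) + 1  # smoothed IDF
        scores[term] = (count / len(tokens)) * idf

    # Normalize so the best term scores 1.0, then rank (ties broken alphabetically).
    top = max(scores.values())
    ranked = sorted(((s / top, t) for t, s in scores.items()), key=lambda p: (-p[0], p[1]))

    keywords, seen = [], set()
    for score, term in ranked:
        stem = term.rstrip("s")  # crude deduplication: "invoice" vs "invoices"
        if score < threshold or stem in seen:
            continue
        seen.add(stem)
        keywords.append({"keyword": term, "score": round(score, 2)})
    return {"document_id": document_id, "keywords": keywords[:top_n]}
```

Calling `extract_keywords("doc_abc", text, corpus)` returns a dictionary in the same shape as the Output example, limited here to single-word keywords.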
Acceptance Criteria
UC-TAG-002: Run Named Entity Recognition
Overview
| Field |
Value |
| ID |
TAG-002 |
| Title |
Run Named Entity Recognition |
| Actor |
Tagging Service |
| Priority |
P1 (MVP Phase 3) |
Description
Extract structured entities (names, orgs, dates, amounts) from document text.
Entity Types
| Entity | Label | Examples |
|--------|-------|----------|
| Person | PERSON | "John Smith", "Dr. Patel" |
| Organization | ORG | "Acme Corp", "HDFC Bank" |
| Date | DATE | "January 15, 2024", "Q1 2024" |
| Money | MONEY | "$5,000", "₹50,000" |
| Location | GPE | "Mumbai", "New York" |
| Email | EMAIL | "john@acme.com" |
| Phone | PHONE | "+1-555-1234" |
Output
{
  "document_id": "doc_abc",
  "entities": [
    {"text": "Acme Corporation", "label": "ORG", "start": 45, "end": 61},
    {"text": "January 15, 2024", "label": "DATE", "start": 120, "end": 136},
    {"text": "$5,000.00", "label": "MONEY", "start": 200, "end": 209}
  ]
}
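To illustrate the output shape, here is a minimal sketch that covers only the pattern-like entity types (EMAIL, PHONE, MONEY) using regular expressions. The patterns and the `extract_pattern_entities` name are assumptions for this example. PERSON, ORG, DATE, and GPE would normally come from a trained NER model, which this sketch does not attempt.

```python
import re

# Hypothetical patterns, deliberately simple; they are not exhaustive.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+\d{1,3}(?:-\d{2,4}){2,3}"),
    "MONEY": re.compile(r"[$₹]\d+(?:,\d+)*(?:\.\d+)?"),
}


def extract_pattern_entities(document_id, text):
    """Return regex-matched entities with character offsets into the text."""
    entities = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            entities.append(
                {"text": m.group(), "label": label, "start": m.start(), "end": m.end()}
            )
    entities.sort(key=lambda e: e["start"])  # document order
    return {"document_id": document_id, "entities": entities}
```

The `start` and `end` values are Python slice offsets, so `text[start:end]` reproduces the entity text, matching the convention in the Output example.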
Acceptance Criteria
UC-TAG-003: Apply Auto-Tags
Overview
| Field | Value |
|-------|-------|
| ID | TAG-003 |
| Title | Apply Auto-Tags |
| Actor | System |
| Priority | P1 (MVP Phase 3) |
Description
Automatically apply tags to documents based on extracted keywords, entities, and classification.
Tag Sources
| Source | Example Tags |
|--------|--------------|
| Document Type | type:invoice, type:contract |
| Category | category:finance, category:legal |
| Keywords | payment, agreement, quarterly |
| Entities | org:Acme Corp, person:John Smith |
| Rules | client:ABC (based on patterns) |
Steps
- Collect keywords from TAG-001
- Collect entities from TAG-002
- Apply classification tags from CLS-002/003
- Match against custom rules
- Deduplicate and normalize tags
- Store document-tag associations
Output
{
  "document_id": "doc_abc",
  "auto_tags": [
    {"tag": "type:invoice", "source": "classification", "confidence": 0.94},
    {"tag": "org:Acme Corp", "source": "ner", "confidence": 0.92},
    {"tag": "payment", "source": "keyword", "confidence": 0.85}
  ]
}
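One way the collection, rule matching, and deduplication steps could fit together is sketched below. Everything here is hypothetical: the function names, the shape of `classification` and `rules`, the substring-based rule matching, the default confidence of 1.0 for rules and for entities without a score, and the choice to keep the highest-confidence duplicate.

```python
# Assumed mapping from NER labels to tag prefixes; other labels are ignored.
ENTITY_PREFIXES = {"ORG": "org", "PERSON": "person"}


def normalize_tag(tag):
    """Collapse whitespace, lowercase the prefix, and keep the value's casing."""
    tag = " ".join(tag.split())
    if ":" in tag:
        prefix, value = tag.split(":", 1)
        return f"{prefix.strip().lower()}:{value.strip()}"
    return tag.lower()


def build_auto_tags(document_id, text, keywords, entities, classification, rules=()):
    """Merge tag candidates from all sources into one deduplicated list."""
    candidates = [
        (f"type:{classification['label']}", "classification", classification["confidence"])
    ]
    for kw in keywords:
        candidates.append((kw["keyword"], "keyword", kw["score"]))
    for ent in entities:
        prefix = ENTITY_PREFIXES.get(ent["label"])
        if prefix:
            # Entities without a score default to 1.0 (an assumption).
            candidates.append((f"{prefix}:{ent['text']}", "ner", ent.get("confidence", 1.0)))
    for rule in rules:
        # Simplest possible rule: case-insensitive substring match on the text.
        if rule["contains"].lower() in text.lower():
            candidates.append((rule["tag"], "rule", 1.0))

    best = {}
    for tag, source, confidence in candidates:
        tag = normalize_tag(tag)
        key = tag.lower()  # duplicates are detected case-insensitively
        if key not in best or confidence > best[key]["confidence"]:
            best[key] = {"tag": tag, "source": source, "confidence": confidence}

    auto_tags = sorted(best.values(), key=lambda t: -t["confidence"])
    return {"document_id": document_id, "auto_tags": auto_tags}
```

Storing the resulting document-tag associations is left out, since the storage layer is not described here.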
Acceptance Criteria
UC-TAG-004: Add Manual Tags
Overview
| Field | Value |
|-------|-------|
| ID | TAG-004 |
| Title | Add Manual Tags |
| Actor | User |
| Priority | P2 |
Description
Allow users to manually add tags to documents.
Steps
- User selects document(s)
- User enters tag(s) or selects from suggestions
- System validates tag format
- Tags are added with source="manual"
- Audit log entry created
{
  "document_id": "doc_abc",
  "tags": ["urgent", "client:XYZ", "project:2024-Q1"]
}
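The validation, tagging, and audit steps might look like the sketch below. The tag format rule (`TAG_RE`), the in-memory `store` and `audit_log`, and the `add_manual_tags` signature are assumptions chosen so the example accepts the sample tags above. The actual format rules are not specified in this document.

```python
import re
from datetime import datetime, timezone

# Assumed format: an optional lowercase "prefix:" followed by letters,
# digits, "-" or "_". No spaces or other punctuation.
TAG_RE = re.compile(r"(?:[a-z][a-z0-9_-]*:)?[A-Za-z0-9][A-Za-z0-9_-]*")


def add_manual_tags(store, audit_log, user_id, document_id, tags):
    """Validate tags, attach new ones with source="manual", and log the action."""
    invalid = [t for t in tags if not TAG_RE.fullmatch(t)]
    if invalid:
        raise ValueError(f"invalid tag format: {invalid}")

    existing = store.setdefault(document_id, {})
    added = []
    for tag in tags:
        if tag not in existing:  # re-adding an existing tag is a no-op
            existing[tag] = {"source": "manual"}
            added.append(tag)

    audit_log.append({
        "action": "tags_added",
        "user_id": user_id,
        "document_id": document_id,
        "tags": added,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return added
```

Rejecting the whole request when any tag is invalid keeps the audit entry simple; accepting the valid subset would be an equally reasonable design.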
Acceptance Criteria
UC-TAG-005: Remove Tags
Overview
| Field | Value |
|-------|-------|
| ID | TAG-005 |
| Title | Remove Tags |
| Actor | User |
| Priority | P2 |
Description
Allow users to remove tags from documents.
Acceptance Criteria
UC-TAG-006: Create Custom Tag
Overview
| Field | Value |
|-------|-------|
| ID | TAG-006 |
| Title | Create Custom Tag |
| Actor | Admin |
| Priority | P2 |
Description
Define new tags in the system taxonomy.
Tag Definition
| Field | Description |
|-------|-------------|
| name | Tag identifier (lowercase, no spaces) |
| display_name | Human-readable name |
| category | Grouping category |
| color | Display color (hex) |
| description | Usage description |
Output
{
  "tag": {
    "id": "tag_123",
    "name": "priority:high",
    "display_name": "High Priority",
    "category": "priority",
    "color": "#FF0000"
  }
}
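The field constraints in the Tag Definition table could be enforced with a small dataclass, as in the sketch below. The `TagDefinition` and `create_tag` names, the six-digit hex check, the sequential `tag_N` identifier, and the dictionary-backed taxonomy are assumptions for illustration only.

```python
import re
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TagDefinition:
    name: str
    display_name: str
    category: str
    color: str
    description: str = ""

    def __post_init__(self):
        # "lowercase, no spaces" from the Tag Definition table.
        if self.name != self.name.lower() or any(ch.isspace() for ch in self.name):
            raise ValueError("name must be lowercase with no spaces")
        # Assumes six-digit hex colors such as #FF0000.
        if not re.fullmatch(r"#[0-9A-Fa-f]{6}", self.color):
            raise ValueError("color must be a hex value such as #FF0000")


def create_tag(taxonomy, definition):
    """Register a new tag and return it in the shape of the Output example."""
    if definition.name in taxonomy:
        raise ValueError(f"tag already exists: {definition.name}")
    tag_id = f"tag_{len(taxonomy) + 1}"  # hypothetical ID scheme
    taxonomy[definition.name] = {"id": tag_id, **asdict(definition)}
    return {"tag": taxonomy[definition.name]}
```

For example, `create_tag({}, TagDefinition("priority:high", "High Priority", "priority", "#FF0000"))` yields a record like the one shown in the Output, with an added `description` field.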
Acceptance Criteria
UC-TAG-007: Manage Tag Taxonomy
Overview
| Field | Value |
|-------|-------|
| ID | TAG-007 |
| Title | Manage Tag Taxonomy |
| Actor | Admin |
| Priority | P3 |
Description
Organize tags into hierarchical categories and manage the tag structure.
Taxonomy Structure
tags/
├── type/
│   ├── invoice
│   ├── contract
│   └── report
├── category/
│   ├── finance
│   ├── legal
│   └── hr
├── org/
│   └── (dynamic entities)
└── custom/
    └── (user-defined)
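As a rough illustration, the tree above can be held as nested dictionaries, with helpers to add a node and list all paths. The `TAXONOMY`, `add_tag`, and `iter_paths` names and the slash-separated path form are assumptions for this sketch. How the taxonomy is actually stored is not specified here.

```python
# Nested-dict mirror of the tree above; leaves are empty dicts.
TAXONOMY = {
    "type": {"invoice": {}, "contract": {}, "report": {}},
    "category": {"finance": {}, "legal": {}, "hr": {}},
    "org": {},     # filled dynamically from extracted entities
    "custom": {},  # user-defined tags
}


def add_tag(taxonomy, path):
    """Create every node along a slash-separated path such as 'org/acme-corp'."""
    node = taxonomy
    for part in path.strip("/").split("/"):
        node = node.setdefault(part, {})
    return node


def iter_paths(node, prefix=""):
    """Yield every node path depth-first, in alphabetical order per level."""
    for name in sorted(node):
        path = f"{prefix}/{name}" if prefix else name
        yield path
        yield from iter_paths(node[name], path)
```

After `add_tag(TAXONOMY, "org/acme-corp")`, `list(iter_paths(TAXONOMY))` includes both `"type/invoice"` and `"org/acme-corp"`.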
Acceptance Criteria
UC-TAG-008: Suggest Tags Based on Similar Docs
Overview
| Field | Value |
|-------|-------|
| ID | TAG-008 |
| Title | Suggest Tags Based on Similar Docs |
| Actor | System |
| Priority | P3 |
Description
Suggest tags based on tags applied to similar documents.
Steps
- Find similar documents (via embeddings)
- Collect tags from similar documents
- Rank by frequency and similarity
- Return as suggestions
Output
{
  "document_id": "doc_abc",
  "suggested_tags": [
    {"tag": "quarterly-report", "similar_docs": 15, "confidence": 0.78},
    {"tag": "finance", "similar_docs": 23, "confidence": 0.85}
  ]
}
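The ranking step can be sketched with a brute-force nearest-neighbour search over embeddings. This is an illustration under stated assumptions: `suggest_tags`, the `indexed_docs` record shape, the `k` and `min_similarity` defaults, and the confidence formula (the similarity-weighted share of neighbours carrying a tag) are all hypothetical. A real system would probably query a vector index instead of scanning every document.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def suggest_tags(document_id, embedding, indexed_docs, existing_tags=(), k=10, min_similarity=0.5):
    """Rank tags from the k most similar documents by frequency and similarity."""
    scored = [
        (cosine(embedding, d["embedding"]), d)
        for d in indexed_docs
        if d["document_id"] != document_id
    ]
    scored.sort(key=lambda p: -p[0])
    neighbours = [(s, d) for s, d in scored[:k] if s >= min_similarity]

    weight = defaultdict(float)  # summed similarity per tag
    count = defaultdict(int)     # number of similar docs carrying the tag
    for sim, doc in neighbours:
        for tag in set(doc["tags"]) - set(existing_tags):
            weight[tag] += sim
            count[tag] += 1

    total = sum(s for s, _ in neighbours)
    suggestions = [
        {"tag": t, "similar_docs": count[t], "confidence": round(weight[t] / total, 2)}
        for t in weight
    ]
    suggestions.sort(key=lambda s: (-s["confidence"], s["tag"]))
    return {"document_id": document_id, "suggested_tags": suggestions}
```

Tags already on the document are excluded through `existing_tags`, so the suggestions only contain candidates the user has not applied yet.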
Acceptance Criteria