Medical NER Pipeline - Technology Stack
mindmap
  root((Technology Stack))
    NLP Foundation
      spaCy 3.7+
        Tokenization
        POS Tagging
        Dependency Parsing
        NER
      scispaCy
        en_core_sci_sm
        Medical Vocabulary
        Scientific Terms
      negspacy
        Negation Detection
        Scope Resolution
    Machine Learning
      BioBERT Models
        Disease Model
          dmis-lab/biobert
          BC5CDR corpus
        Chemical Model
          Drug recognition
        Gene Model
          Protein detection
      Hugging Face
        Transformers
        AutoTokenizer
        Pipeline API
      PyTorch
        GPU Support
        Tensor Operations
    Template System
      57,476 Terms
        Diseases 42K+
        Chemicals 5.2K
        Genes 10.2K
      Sources
        ICD-10
        SNOMED CT
        RxNorm
        LOINC
      Matching
        Exact Match
        Word Boundaries
    Data Processing
      pandas
        DataFrame Operations
        Data Transformation
      openpyxl
        Excel Output
        43 Columns
        Formatting
    Web Interface
      Streamlit 1.28+
        File Upload
        Text Input
        Real-time Processing
        Export Options
      Visualization
        Entity Highlighting
        Context Icons
        Color Coding
    Performance
      Processing Speed
        ~0.9 rows/sec
      Accuracy
        Entity: 96%
        Context: 93%
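
The Template System branch names exact matching on word boundaries as its matching strategy. The sketch below illustrates one way that could work using only Python's standard `re` module. The term dictionary, the labels, and the function names `build_pattern` and `match_terms` are hypothetical stand-ins; the pipeline's actual 57,476-term list and matching code are not shown in this diagram.

```python
import re

# Hypothetical sample terms; the real template list is not reproduced here.
TEMPLATE_TERMS = {
    "diabetes mellitus": "DISEASE",
    "diabetes": "DISEASE",
    "metformin": "CHEMICAL",
    "BRCA1": "GENE",
}


def build_pattern(terms):
    """Compile one case-insensitive pattern that matches any term as a whole word."""
    # Longest terms first, so "diabetes mellitus" wins over "diabetes".
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    # Lookarounds reject matches embedded in a longer word (e.g. "prediabetes").
    return re.compile(rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE)


def match_terms(text, terms=TEMPLATE_TERMS):
    """Return (matched_text, label, start, end) for each exact term match."""
    pattern = build_pattern(terms)
    lookup = {term.lower(): label for term, label in terms.items()}
    return [
        (m.group(0), lookup[m.group(0).lower()], m.start(), m.end())
        for m in pattern.finditer(text)
    ]


if __name__ == "__main__":
    sample = "Patient with diabetes mellitus on metformin; prediabetes ruled out."
    for hit in match_terms(sample):
        print(hit)
```

Sorting the alternation by length is a design choice made for this sketch: regex alternation takes the first alternative that matches, so placing multi-word terms first keeps them from being truncated to a shorter term that shares the same prefix.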