Medical NER Pipeline - Processing Pipeline Flow

mindmap
  root((5-Stage Processing Pipeline))
    Stage 1: Base NLP
      spaCy Processing
        Tokenization
        Sentence Segmentation
        POS Tagging
        Dependency Parsing
      scispaCy Enhancement
        Medical Vocabulary
        Scientific Terms
    Stage 2: Entity Extraction
      BioBERT Models
        Disease Extraction
          BC5CDR-disease
          Confidence Scoring
        Chemical Extraction
          BC5CDR-chem
          Drug Detection
        Gene Extraction
          BC5CDR-gene
          Protein Names
      Template Boosting
        57,476 Medical Terms
        Exact Match
        Fuzzy Match
      Hybrid Fusion
        Confidence Weighting
        Deduplication
    Stage 3: Context Classification
      Context Types
        Confirmed
          138 Patterns
          HAS, DIAGNOSED WITH
        Negated
          99 Patterns
          DENIES, NO EVIDENCE
        Uncertain
          48 Patterns
          POSSIBLE, SUSPECTED
        Historical
          82 Patterns
          HISTORY OF, PREVIOUS
        Family
          79 Patterns
          FAMILY HISTORY, MOTHER HAS
      Scope Reversal
        103 Patterns
        BUT, HOWEVER, YET
    Stage 4: Section Detection
      Clinical Sections
        Chief Complaint
        HPI
        PMH
        Medications
        Assessment
        Plan
    Stage 5: Output Generation
      Excel Output
        43 Columns
        Entity Details
        Context Info
      Streamlit UI
        Interactive Display
        Entity Highlighting
      JSON Export
        Structured Data