AI Products, Compliance, Content Strategy
Drug Safety Classification System (RAG Pipeline)
- Identified critical accuracy gaps in Tata 1MG's drug safety classification system.
- Led the design of a retrieval-augmented generation (RAG) pipeline to automate classification of drugs across six safety advisory categories: pregnancy risk, breastfeeding safety, driving risk, alcohol risk, liver impairment, and kidney impairment.
The Challenge
- Manual classification against official drug labels was only 33% accurate, meaning roughly two in three drugs had an incomplete or incorrect safety advisory visible to users.
- In a clinical context, an incorrect pregnancy or breastfeeding classification is a patient safety risk, not merely a content quality issue.
- Scale: thousands of drugs required classification across six advisory dimensions, making manual correction unsustainable.
The Approach
- Scoped and co-designed a RAG pipeline that retrieves the relevant section from the official drug label (SmPC), compares it against Tata 1MG's internal classification, and maps the correct advisory.
- Defined the evaluation rubric and accuracy benchmarking framework for the pipeline's output validation.
- Collaborated with engineering and medical affairs to establish the confidence threshold for automated vs human-reviewed classifications.
- Established the architecture for extension to side effects classification as a next phase.
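The retrieve, compare, and route steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the production pipeline: the keyword cues, the label names (`safe`, `unsafe`, `consult_doctor`), the `classify_stub` function standing in for the LLM call, and the `0.8` threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Hypothetical keyword cues used to locate the relevant label section.
SECTION_CUES: Dict[str, Tuple[str, ...]] = {
    "pregnancy": ("pregnan",),
    "breastfeeding": ("breast", "lactation"),
    "driving": ("driv", "machines"),
    "alcohol": ("alcohol",),
    "liver": ("hepatic", "liver"),
    "kidney": ("renal", "kidney"),
}


@dataclass
class Decision:
    category: str
    label: str          # advisory proposed by the pipeline
    confidence: float   # classifier confidence in [0, 1]
    changed: bool       # True if it differs from the internal classification
    route: str          # "auto" or "human_review"


def retrieve_section(label_sections: Dict[str, str], category: str) -> Optional[str]:
    """Return the text of the section with the most cue matches, or None."""
    best_text, best_hits = None, 0
    for heading, text in label_sections.items():
        blob = f"{heading} {text}".lower()
        hits = sum(blob.count(cue) for cue in SECTION_CUES[category])
        if hits > best_hits:
            best_text, best_hits = text, hits
    return best_text


def classify_stub(section_text: str) -> Tuple[str, float]:
    """Stand-in for the LLM call (assumption): returns (label, confidence)."""
    text = section_text.lower()
    if "contraindicated" in text:
        return "unsafe", 0.9
    if "no data" in text or "limited data" in text:
        return "consult_doctor", 0.5
    return "safe", 0.6


def review(
    label_sections: Dict[str, str],
    category: str,
    internal_label: str,
    classify: Callable[[str], Tuple[str, float]] = classify_stub,
    threshold: float = 0.8,
) -> Decision:
    """Retrieve the section, classify it, compare, and route by confidence."""
    section = retrieve_section(label_sections, category)
    if section is None:
        # Nothing retrieved: keep the internal label and send to a reviewer.
        return Decision(category, internal_label, 0.0, False, "human_review")
    label, confidence = classify(section)
    route = "auto" if confidence >= threshold else "human_review"
    return Decision(category, label, confidence, label != internal_label, route)


# Toy label content, invented for illustration only.
sections = {
    "4.6 Fertility, pregnancy and lactation": "Drug X is contraindicated during pregnancy.",
    "4.7 Effects on ability to drive": "No data on driving.",
}
pregnancy = review(sections, "pregnancy", internal_label="safe")
driving = review(sections, "driving", internal_label="safe")
alcohol = review(sections, "alcohol", internal_label="safe")
```

Passing the classifier in as a callable keeps the routing logic testable without a model, and sending retrieval misses to human review rather than accepting them automatically reflects the patient-safety framing of the project.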
Results
- Drug safety classification accuracy improved from 33% to 91% post-pipeline deployment.
- Six safety advisory categories covered: pregnancy, breastfeeding, hepatic impairment, renal impairment, driving and alcohol risk.
- Pipeline architecture extended to side effects classification.
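An accuracy figure of this kind is typically computed by comparing pipeline output against a reviewed gold set, both overall and per category. The sketch below shows one way to do that; the row format and the sample rows are assumptions made up for illustration and are not the project's data or its actual benchmarking framework.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Row = Tuple[str, str, str]  # (category, predicted_label, gold_label)


def accuracy_by_category(rows: Iterable[Row]) -> Dict[str, float]:
    """Share of predictions matching the gold label, per advisory category."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for category, predicted, gold in rows:
        total[category] += 1
        correct[category] += int(predicted == gold)
    return {category: correct[category] / total[category] for category in total}


def overall_accuracy(rows: Iterable[Row]) -> float:
    """Share of all predictions matching the gold label."""
    rows = list(rows)
    return sum(predicted == gold for _, predicted, gold in rows) / len(rows)


# Toy rows, invented for illustration only.
sample = [
    ("pregnancy", "unsafe", "unsafe"),
    ("pregnancy", "safe", "unsafe"),
    ("driving", "safe", "safe"),
    ("driving", "safe", "safe"),
]
per_category = accuracy_by_category(sample)
overall = overall_accuracy(sample)
```

Reporting per-category accuracy alongside the overall number helps show whether a high-risk category such as pregnancy lags behind the aggregate.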
Tech Stack
RAG (Retrieval-Augmented Generation), Python, LLM, ChatGPT, CMS, Confluence