AI Products, Compliance, Content Strategy

Drug Safety Classification System (RAG Pipeline)

  • Identified critical accuracy gaps in Tata 1MG's drug safety classification system.
  • Led the design of a retrieval-augmented generation (RAG) pipeline to automate the classification of drugs across six safety advisory categories: pregnancy risk, breastfeeding safety, driving risk, alcohol risk, liver impairment, and kidney impairment.

The Challenge

  • Manual classification against official drug labels was only 33% accurate, meaning roughly two in three drugs showed users an incomplete or incorrect safety advisory.
  • In a clinical context, an incorrect pregnancy or breastfeeding classification is a patient safety risk, not merely a content quality issue.
  • Scale: thousands of drugs required classification across six advisory dimensions, making manual correction unsustainable.

The Approach

  • Scoped and co-designed a RAG pipeline that retrieves the relevant section from the official drug label (Summary of Product Characteristics, SmPC), compares it against Tata 1MG's internal classification, and assigns the correct advisory.
  • Defined the evaluation rubric and accuracy benchmarking framework for the pipeline's output validation.
  • Collaborated with engineering and medical affairs to establish the confidence threshold for automated vs human-reviewed classifications.
  • Established the architecture for extension to side effects classification as a next phase.
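
The retrieve, compare, and route steps above can be illustrated with a minimal Python sketch. It is not the production pipeline: the keyword-overlap retriever stands in for whatever retrieval method was actually used, `classify` is a hypothetical placeholder for the LLM call, and the label values ("safe", "caution", "unsafe"), the route names, and the 0.5 threshold are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    category: str
    proposed: Optional[str]  # None when the case is sent to a human
    confidence: float
    route: str  # "auto_accept", "auto_update", or "human_review"


def retrieve_section(label_sections, query_terms):
    """Pick the label section with the highest share of query terms present.

    Stand-in for a real retriever (e.g. embedding search); returns the
    section text and a score in [0, 1] used here as a crude confidence.
    """
    best_text, best_score = "", 0.0
    for text in label_sections.values():
        words = set(re.findall(r"[a-z]+", text.lower()))
        score = len(words & query_terms) / len(query_terms)
        if score > best_score:
            best_text, best_score = text, score
    return best_text, best_score


def classify(section_text):
    """Hypothetical placeholder for the LLM call: a simple keyword rule."""
    text = section_text.lower()
    if "contraindicated" in text:
        return "unsafe"
    if "caution" in text:
        return "caution"
    return "safe"


def route(category, internal_label, label_sections, query_terms, threshold=0.5):
    """Compare the label-derived advisory with the internal one and route it."""
    section, confidence = retrieve_section(label_sections, query_terms)
    if confidence < threshold:
        # Weak retrieval: do not propose anything, ask a reviewer instead.
        return Decision(category, None, confidence, "human_review")
    proposed = classify(section)
    outcome = "auto_accept" if proposed == internal_label else "auto_update"
    return Decision(category, proposed, confidence, outcome)


# Toy label text, invented for this example.
sections = {
    "4.6": "Fertility, pregnancy and lactation: contraindicated during pregnancy.",
    "4.7": "Effects on ability to drive and use machines: may cause dizziness.",
}
print(route("pregnancy", "safe", sections, {"pregnancy", "fertility"}))
print(route("alcohol", "safe", sections, {"alcohol", "ethanol"}))
```

In this sketch a low retrieval score sends the case to human review without proposing a label, which mirrors the automated versus human-reviewed split described above; where the real threshold sits is a clinical decision, not something the code can determine.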

Results

  • Drug safety classification accuracy improved from 33% to 91% post-pipeline deployment.
  • Six safety advisory categories covered: pregnancy, breastfeeding, hepatic impairment, renal impairment, driving and alcohol risk.
  • Pipeline architecture extended to side effects classification.
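
As an illustration of how an accuracy figure like the one above could be benchmarked, the following Python sketch computes per-category and overall accuracy against a reviewed reference set. The actual evaluation rubric is not described here, so the record format (category, predicted label, reference label) and the sample rows are assumptions; the data is a toy example, not project results.

```python
from collections import defaultdict


def accuracy_by_category(records):
    """Return (per-category accuracy, overall accuracy).

    `records` is an iterable of (category, predicted, reference) tuples,
    where `reference` is the advisory confirmed by a reviewer.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for category, predicted, reference in records:
        total[category] += 1
        correct[category] += int(predicted == reference)
    if not total:
        raise ValueError("no records to evaluate")
    per_category = {c: correct[c] / total[c] for c in total}
    overall = sum(correct.values()) / sum(total.values())
    return per_category, overall


# Toy rows, invented for this example.
sample = [
    ("pregnancy", "unsafe", "unsafe"),
    ("pregnancy", "safe", "unsafe"),
    ("alcohol", "caution", "caution"),
    ("alcohol", "safe", "safe"),
]
per_category, overall = accuracy_by_category(sample)
print(per_category, overall)
```

Reporting accuracy per category as well as overall helps show whether a high headline number hides a weak category, which matters most for the pregnancy and breastfeeding advisories.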

Tech Stack

RAG (Retrieval-Augmented Generation), Python, LLM, ChatGPT, CMS, Confluence
