Pharmacy Prescription Text Extraction Dataset

#Text Recognition #Image Classification #Natural Language Processing #Healthcare Informatization #Intelligent Drug Management #Electronic Prescription Systems
  • 500 records
  • 1.5G
  • JPG
  • CC-BY-NC-SA 4.0
  • MOBIUSI INC
Updated: 2026-03-11

AI Analysis & Value Prop

In healthcare, digitizing pharmacy prescriptions is a key way to improve medication management efficiency and reduce human error. Existing solutions, however, are limited in text extraction accuracy and in handwritten text recognition, which reduces the practicality of electronic prescription systems. This dataset addresses those gaps by providing high-quality prescription images for improving the accuracy and reliability of text recognition systems.

The dataset contains prescription images collected in a variety of environments, covering different prescription formats and handwriting styles. Images were captured with high-definition scanners and professional photographic equipment to ensure clarity. Quality control consists of multiple rounds of manual annotation, cross-validation, and expert review; the annotation team comprises 20 professionals with pharmaceutical backgrounds. Preprocessing includes image enhancement, noise reduction, and text area detection. The final images are stored in JPG format and organized by prescription type and date.

The core advantage is annotation accuracy above 98%, with highly consistent labeling across both handwritten and printed text. The dataset also applies its own text differentiation and enhancement methods, which are reported to increase recognition accuracy by 15%. It targets the failure of existing systems to parse complex prescriptions accurately, and is intended to raise the automation level and efficiency of electronic pharmacies. Compared with similar datasets, our data quality is higher, and the dataset is distinguished by broad coverage of fonts and handwriting styles that is scarce elsewhere. It is extensible and versatile, and suits the development and optimization of a range of pharmacy information systems.
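The description names image enhancement and noise reduction as preprocessing steps but does not say which methods were used. As a rough illustration only, the sketch below uses two common stand-ins on a grayscale image held as a list of rows: linear contrast stretching for enhancement and a 3x3 median filter for noise reduction. The function names and the choice of techniques are assumptions, not the provider's pipeline, and text area detection is left out.

```python
from statistics import median_low


def contrast_stretch(img):
    """Linearly rescale pixel values so they span the full 0-255 range.

    Assumed stand-in for "image enhancement"; the listing does not
    specify the actual method.
    """
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Flat image: nothing to stretch, return a copy unchanged.
        return [row[:] for row in img]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in img]


def median_filter(img):
    """Apply a 3x3 median filter.

    Assumed stand-in for "noise reduction". Border pixels use only the
    neighbours that exist; median_low keeps results to real pixel values.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(median_low(window))
        out.append(row)
    return out


# A single bright speck on a uniform background is removed by the filter.
noisy = [
    [100, 100, 100],
    [100, 255, 100],
    [100, 100, 100],
]
denoised = median_filter(noisy)

# A low-contrast patch is spread across the full intensity range.
stretched = contrast_stretch([[50, 100], [150, 200]])
```

In practice these steps would run on decoded JPG pixel data through an imaging library; plain lists are used here only to keep the sketch self-contained.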

Dataset Insights

Sample Examples

0ac49b68**.jpg|1280*1708|322.62 KB
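The sample entry above appears to pack a file name, pixel dimensions, and file size into one pipe-delimited line. A minimal sketch of parsing that line follows, assuming the dimensions are given as width*height and that sizes may also appear in MB or GB; `SampleEntry` and `parse_sample_entry` are hypothetical names introduced for illustration, and the masked file name is kept exactly as shown.

```python
from dataclasses import dataclass


@dataclass
class SampleEntry:
    filename: str
    width: int      # assumed to be the first number in "1280*1708"
    height: int     # assumed to be the second number
    size_kb: float


def parse_sample_entry(line: str) -> SampleEntry:
    """Split a 'name|W*H|size unit' listing line into typed fields."""
    name, dims, size = (part.strip() for part in line.split("|"))
    width, height = (int(v) for v in dims.split("*"))
    value, unit = size.split()
    # Only KB appears in the listing; MB and GB handling is an assumption.
    factor = {"KB": 1, "MB": 1024, "GB": 1024 * 1024}[unit.upper()]
    return SampleEntry(name, width, height, float(value) * factor)


entry = parse_sample_entry("0ac49b68**.jpg|1280*1708|322.62 KB")
```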

Technical Specifications

Compliance Statement

Authorization Type: CC-BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike)
Commercial Use: Requires an exclusive subscription or authorization contract (charged monthly or per invocation)
Privacy and Anonymization: No PII, no real company names; simulated scenarios follow industry standards
Compliance System: Compliant with China's Data Security Law and the EU GDPR; supports enterprise data access logs

Cite this Work

@dataset{Mobiusi,
  title={Pharmacy Prescription Text Extraction Dataset},
  author={Mobiusi},
  url={https://www.mobiusi.com/datasets/2fc4f968af94f2da75abd9e6440a6808?dataset_scene_cate_type=3}
}

Using this in research? Please cite us.
