Independent Office Meeting Minutes Text Extraction Dataset

#Text Classification #Information Extraction #Natural Language Processing #Corporate Meeting Minutes #Administrative Document Management #Automated Report Generation
  • 500 records
  • 1.3G
  • TXT
  • CC-BY-NC-SA 4.0
  • MOBIUSI INC
Updated: 2026-02-04

AI Analysis & Value Prop

Meeting minutes are a core part of daily operations in modern enterprises, but extracting their key content is difficult: the records are information-dense, and downstream use requires structured data. Existing solutions rely largely on manual review, which is slow and error-prone. This dataset is built to support automated text information extraction so that meeting content can be classified and managed efficiently.

Data was collected in collaboration with multiple enterprises, using transcription devices and software to capture real meeting minutes in natural office environments. Annotation went through multiple rounds with consistency checks, and a team of experienced linguists reviewed the data for accuracy and reliability. More than five experts with a linguistic background took part in the annotation process. Preprocessing covered text cleaning, syntactic analysis, and semantic annotation. All data is stored in a uniform TXT format, with entries classified by embedded labels, so that records are easy to access and to use for model training.

Technical Specifications

Field               Type    Description
file_name           string  File name
meeting_topic       string  The main topic or agenda mentioned in the meeting record.
participants        string  The list of attendees in the meeting.
key_decisions       string  Important decisions or conclusions reached during the meeting.
action_items        string  To-dos or follow-up tasks assigned from the meeting.
meeting_location    string  The specific location where the meeting took place.
meeting_duration    string  The total duration of the meeting.
meeting_summary     string  A brief overview of the meeting content.
meeting_moderator   string  The person who was responsible for moderating the meeting.
related_documents   string  Documents or attachments related to the meeting content.
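
The field list above maps naturally onto a typed record. The following is a minimal Python sketch, not the provider's loader: the listing does not specify how labels are embedded in the TXT files, so the `field_name: value` line layout, the `MeetingRecord` class, the `parse_record` helper, and the sample text are all assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class MeetingRecord:
    # The ten fields from the specification; all are typed as string there.
    file_name: str = ""
    meeting_topic: str = ""
    participants: str = ""
    key_decisions: str = ""
    action_items: str = ""
    meeting_location: str = ""
    meeting_duration: str = ""
    meeting_summary: str = ""
    meeting_moderator: str = ""
    related_documents: str = ""


FIELD_NAMES = {f.name for f in fields(MeetingRecord)}


def parse_record(text: str) -> MeetingRecord:
    """Parse ASSUMED 'field_name: value' lines; other lines are ignored."""
    values = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as "1:30" survive.
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in FIELD_NAMES:
            values[key] = value.strip()
    return MeetingRecord(**values)


# Invented sample text, not taken from the dataset.
sample = """file_name: example_001.txt
meeting_topic: Quarterly budget review
participants: Speaker A; Speaker B
meeting_duration: 1:30
action_items: Speaker B to circulate the revised budget
"""

record = parse_record(sample)
print(record.meeting_topic)
```

Fields missing from a file default to an empty string, which keeps every record the same shape for downstream classification or extraction code. If the real files use a different labeling scheme, only `parse_record` would need to change.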

Compliance Statement

Authorization Type: CC-BY-NC-SA 4.0 (Attribution–NonCommercial–ShareAlike)
Commercial Use: Requires exclusive subscription or authorization contract (monthly or per-invocation charging)
Privacy and Anonymization: No PII, no real company names, simulated scenarios follow industry standards
Compliance System: Compliant with China's Data Security Law / EU GDPR / supports enterprise data access logs

Frequently Asked Questions

What types of text data does this dataset mainly include?
This dataset mainly includes text data from day-to-day meeting records of independent offices.
In which scenarios can this dataset be used?
This dataset can be used to improve extraction and classification of meeting record texts and is suitable for text analysis applications in everyday office scenarios.
What are the main benefits of using this dataset?
The main benefits are more automated processing of independent office meeting records and higher accuracy in text classification and information extraction.
How can this dataset be used for text classification research?
Researchers can use the meeting records provided in the dataset to train models and develop more accurate text classification algorithms.
What is the significance of this dataset for natural language processing?
For NLP researchers, this dataset aids in developing and testing new information extraction and classification models, optimizing natural language text analysis techniques.

Cite this Work

@dataset{Mobiusi2026,
  title={Independent Office Meeting Minutes Text Extraction Dataset},
  author={MOBIUSI INC},
  year={2026},
  url={https://www.mobiusi.com/datasets/27f0cf4da25a439cdea933b34fccf7be?cate=3},
  urldate={2026-02-04},
  keywords={Meeting Minutes Dataset, Text Extraction, Text Classification, Information Management, NLP Dataset},
  version={1.0}
}

Using this in research? Please cite us.
