Restaurant Onsite Call and Reminder Chinese Speech Recognition Audio Dataset

#Speech recognition #natural language processing #speech-to-text #Intelligent customer service #voice assistants #automatic speech recognition
  • 500 hours
  • 1.6 GB
  • WAV
  • CC-BY-NC-SA 4.0
  • MOBIUSI INC
Updated: 2026-02-04

AI Analysis & Value Prop

In the retail e-commerce industry, growing demand for personalized services and intelligent interaction has made speech recognition a common way to improve user experience. However, existing speech recognition systems perform poorly in noisy restaurant environments: accuracy is low, and they struggle to recognize multiple languages or dialects. This dataset targets speech recognition in complex dining environments and is intended to help improve the accuracy of automated systems.

Data collection uses professional recording equipment in restaurant scenarios at different times and under different conditions, so that diverse noise backgrounds are covered. Quality control includes multi-round manual annotation and consistency checks by a team of trained language experts and data analysts. Preprocessing includes noise reduction, audio segmentation, and volume leveling, among other steps; the data is categorized by scene and phoneme type and stored in WAV format.
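The description names volume leveling as a preprocessing step but does not say how it is done. As an illustration only, the sketch below shows one common approach, RMS normalization to a target level in dBFS. The function names, the -20 dBFS target, and the 16 kHz synthetic test tone are all assumptions made for this example and are not taken from the dataset's actual pipeline.

```python
import math


def rms(samples):
    """Root-mean-square level of float samples in the range [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_volume(samples, target_dbfs=-20.0):
    """Scale samples so their RMS matches target_dbfs, clipping to [-1, 1].

    target_dbfs is a hypothetical default, not a documented dataset setting.
    """
    current = rms(samples)
    if current == 0.0:
        return list(samples)  # pure silence: nothing to scale
    target_rms = 10 ** (target_dbfs / 20.0)  # convert dBFS to linear amplitude
    gain = target_rms / current
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


# Synthetic input: one second of a quiet 440 Hz tone at an assumed 16 kHz rate.
quiet = [0.01 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
leveled = level_volume(quiet)
print(round(20 * math.log10(rms(leveled)), 1))  # RMS level in dBFS after leveling
```

Hard clipping keeps the output in range but distorts loud peaks, so a production pipeline would more likely use a limiter or a loudness standard; the sketch only shows the basic gain computation.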

Dataset Insights

Sample Examples

  • 295f2b875794a935ae6ba3de4c0c782f.wav
  • 886d260f8d094f51b94ee51733c7331a.wav

Technical Specifications

Field                  | Type   | Description
file_name              | string | File name
duration               | string | Duration
audio_rate             | string | Audio sample rate
audio_channel          | string | Audio channel
speaker_gender         | string | The gender of the speaker in the recording.
speaker_age_group      | string | The age group of the speaker in the recording, e.g., child, teenager, adult, senior.
accent                 | string | The accent of the speaker in the recording, e.g., Standard Mandarin, Cantonese, Beijing Accent.
speech_emotion         | string | The emotional state expressed by the speaker in the recording, e.g., happy, sad, angry.
background_noise_level | string | The level of background noise in the recording, e.g., none, slight, or significant.
speech_rate            | float  | The speaker's rate of speech, expressed in words per minute (WPM).
clarity                | string | The clarity of the speech in the recording, e.g., very clear, clear, unclear.
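To show how the fields listed above might be used, the sketch below loads metadata records into a typed structure and filters them by label. It assumes the metadata ships as CSV with one column per field, which the page does not state. The file names and every value in SAMPLE_CSV are hypothetical, as are the helper names load_recordings and select.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Recording:
    """One metadata row; field names and types follow the specification table."""
    file_name: str
    duration: str
    audio_rate: str
    audio_channel: str
    speaker_gender: str
    speaker_age_group: str
    accent: str
    speech_emotion: str
    background_noise_level: str
    speech_rate: float
    clarity: str


# Hypothetical rows for illustration; not real dataset entries.
SAMPLE_CSV = """\
file_name,duration,audio_rate,audio_channel,speaker_gender,speaker_age_group,accent,speech_emotion,background_noise_level,speech_rate,clarity
example_001.wav,00:00:05,16000,mono,female,adult,Standard Mandarin,happy,slight,180.0,clear
example_002.wav,00:00:08,16000,mono,male,senior,Cantonese,angry,significant,150.5,unclear
"""


def load_recordings(csv_text):
    """Parse CSV text into Recording objects, converting speech_rate to float."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        Recording(**{**row, "speech_rate": float(row["speech_rate"])})
        for row in rows
    ]


def select(recordings, **criteria):
    """Return recordings whose fields equal every given keyword criterion."""
    return [
        r for r in recordings
        if all(getattr(r, key) == value for key, value in criteria.items())
    ]


recordings = load_recordings(SAMPLE_CSV)
noisy = select(recordings, background_noise_level="significant")
print([r.file_name for r in noisy])
```

Filtering on background_noise_level or accent in this way is one route to building noise-specific or accent-specific evaluation subsets, assuming the labels in the delivered metadata use consistent spellings.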

Compliance Statement

Authorization Type: CC-BY-NC-SA 4.0 (Attribution–NonCommercial–ShareAlike)
Commercial Use: Requires exclusive subscription or authorization contract (monthly or per-invocation charging)
Privacy and Anonymization: No PII, no real company names, simulated scenarios follow industry standards
Compliance System: Compliant with China's Data Security Law / EU GDPR / supports enterprise data access logs

Frequently Asked Questions

What application scenarios is this dataset suitable for?
This dataset is suitable for developing speech recognition applications for restaurant settings, such as onsite call and reminder systems, intelligent customer service, and voice assistants.
Does this dataset include audio in different languages or accents?
The metadata includes an accent field (e.g., Standard Mandarin, Cantonese, Beijing Accent); check the dataset's detailed description for which languages and accents are actually covered.
How can this dataset be used to improve the performance of a speech recognition system?
The diverse speech samples in this dataset can be used to train models to improve recognition accuracy and robustness.
What restaurant scenarios are covered by the audio samples in this dataset?
The audio samples cover onsite call and reminder scenarios recorded in restaurants at different times and under different conditions, with diverse noise backgrounds.
When using this dataset for research, do privacy issues need to be considered?
Yes, researchers must adhere to relevant privacy protection and data compliance requirements during use.

Cite this Work

@dataset{Mobiusi2026,
  title={Restaurant Onsite Call and Reminder Chinese Speech Recognition Audio Dataset},
  author={MOBIUSI INC},
  year={2026},
  url={https://www.mobiusi.com/datasets/9c0f01ef821d67cd860cd4b329d0e87a?dataset_scene_cate_type=4},
  urldate={2026-02-04},
  keywords={Chinese speech recognition, retail e-commerce speech data, restaurant speech recognition},
  version={1.0}
}

Using this in research? Please cite us.
