Accompanied Communication Chinese Voice Dialogue Dataset

#Natural Language Processing #Speech Synthesis #Sentiment Analysis #Smart Assistants #Social Robots
  • 500 hours
  • 1.5G
  • WAV
  • CC-BY-NC-SA 4.0
  • MOBIUSI INC
Updated: 2026-02-04

AI Analysis & Value Prop

As people rely more on voice assistants and smart devices, expectations for natural, fluent everyday speech interaction keep rising. Existing TTS systems, however, often struggle to express complex emotions and to handle varied conversational contexts. This dataset provides high-quality accompanied communication voice recordings intended to help address those limitations.

Recordings are collected with several types of microphones in home, office, and outdoor environments to keep the data diverse and representative. Quality control consists of multiple rounds of annotation and consistency checks carried out by a team of more than 50 experts in linguistics and voice processing. Before release, the audio is preprocessed (noise reduction, alignment, and format conversion) and stored in WAV format for easy retrieval and use.
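Because the recordings are delivered as WAV files, their basic properties can be checked before any training run. The sketch below uses only Python's standard `wave` module. The page does not state the sample rate, bit depth, or channel count of the files, so the 16 kHz mono 16-bit clip generated here is a stand-in assumption used purely to make the example runnable, and the helper names `describe_wav` and `make_silent_wav` are hypothetical.

```python
import io
import wave


def describe_wav(source):
    """Return basic properties of a WAV file (a path or a binary file-like object)."""
    with wave.open(source, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


def make_silent_wav(seconds=1.0, rate=16000):
    """Build a mono 16-bit silent WAV in memory (stand-in data, not a dataset file)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)      # assumed mono
        wav.setsampwidth(2)      # assumed 16-bit samples
        wav.setframerate(rate)   # assumed sample rate
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # With real data, pass the path of a downloaded clip instead.
    print(describe_wav(make_silent_wav(seconds=2.0)))
```

Running the same function over every downloaded file is a cheap way to confirm that the audio matches the `audio_rate` and `audio_channel` values reported in the metadata.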

Dataset Insights

Sample Examples

8a08a5a806b4f244040993e8c68a6fc6.wav


Technical Specifications

Field                    Type      Description
file_name                string    File name
duration                 string    Duration
audio_rate               string    Audio sample rate
audio_channel            string    Audio channel
speaker_id               string    A unique identifier for the speaker.
speaker_gender           string    Gender information of the speaker, such as male or female.
speaker_age              integer   The age of the speaker.
language                 string    The language used in the audio content, such as Chinese, English, etc.
accent                   string    Accent information of the speaker, such as American, British, etc.
emotional_tone           string    The emotional tone conveyed in the audio, such as happy, sad, etc.
background_noise_level   string    The level of background noise during the recording, such as low, medium, high.
speech_rate              float     The rate of speech in the audio, usually measured in words per second.
dialogue_turn            integer   The turn of the current segment within the dialogue.
transcription_accurate   boolean   Indicates whether the transcription text is accurate.
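The field list above maps naturally onto a typed record. The following is a minimal sketch, assuming the metadata can be read into one Python object per clip. The page does not say how the metadata is delivered, so the `ClipMetadata` class, the `usable_clips` helper, and every example value (file name, duration format, accent label) are hypothetical illustrations, not the provider's format.

```python
from dataclasses import dataclass


@dataclass
class ClipMetadata:
    """One metadata record, with field names and types taken from the table."""
    file_name: str
    duration: str
    audio_rate: str
    audio_channel: str
    speaker_id: str
    speaker_gender: str
    speaker_age: int
    language: str
    accent: str
    emotional_tone: str
    background_noise_level: str
    speech_rate: float
    dialogue_turn: int
    transcription_accurate: bool


def usable_clips(clips, allowed_noise=("low",), emotional_tone=None):
    """Keep clips with an accurate transcription and an acceptable noise level.

    If emotional_tone is given, keep only clips labeled with that tone.
    """
    selected = []
    for clip in clips:
        if not clip.transcription_accurate:
            continue
        if clip.background_noise_level not in allowed_noise:
            continue
        if emotional_tone is not None and clip.emotional_tone != emotional_tone:
            continue
        selected.append(clip)
    return selected


if __name__ == "__main__":
    # Illustrative record only; real values come from the dataset's metadata.
    example = ClipMetadata(
        file_name="example_clip.wav", duration="00:00:03", audio_rate="16000",
        audio_channel="mono", speaker_id="spk_001", speaker_gender="female",
        speaker_age=30, language="Chinese", accent="Mandarin",
        emotional_tone="happy", background_noise_level="low",
        speech_rate=3.5, dialogue_turn=1, transcription_accurate=True,
    )
    print(usable_clips([example], emotional_tone="happy"))
```

Filtering on `transcription_accurate` and `background_noise_level` first is a conservative default for TTS work, where noisy audio or mismatched text tends to degrade the synthesized voice.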

Compliance Statement

Authorization Type: CC-BY-NC-SA 4.0 (Attribution–NonCommercial–ShareAlike)
Commercial Use: Requires exclusive subscription or authorization contract (monthly or per-invocation charging)
Privacy and Anonymization: No PII, no real company names, simulated scenarios follow industry standards
Compliance System: Compliant with China's Data Security Law / EU GDPR / supports enterprise data access logs

Frequently Asked Questions

What is the Accompanied Communication Chinese Voice Dialogue Dataset?
It is an audio dataset designed for TTS (Text-to-Speech) training, intended to improve models' speech generation in everyday communication.
What are the application scenarios for the dataset?
It suits applications that need to generate everyday conversational speech, in particular voice assistants, smart customer service, and other voice interaction systems.
What are the main features of the dataset?
It covers a wide range of everyday conversational scenarios, and each clip carries speaker, emotion, noise-level, and dialogue-turn metadata.
How can the dataset be used to train TTS models?
The audio samples serve as training data, so that the model learns to generate natural, personalized conversational speech.
Why is the dataset important?
It provides realistic everyday conversational speech, which supports the development of more natural and reliable TTS models.
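One practical step when preparing the clips for TTS training is to keep each speaker entirely in either the training or the validation set, so that validation reflects voices the model has not seen. The sketch below shows one common way to do this with a stable hash of `speaker_id`. It is an illustration under stated assumptions, not a procedure documented by the provider: records are assumed to be plain dictionaries containing a `speaker_id` key, and the helper names and the 10 percent default are hypothetical.

```python
import hashlib


def assign_split(speaker_id, validation_percent=10):
    """Map a speaker to 'train' or 'validation' deterministically."""
    digest = hashlib.sha256(speaker_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable value in 0..99
    return "validation" if bucket < validation_percent else "train"


def split_by_speaker(records, validation_percent=10):
    """Group records so that all clips from one speaker land in the same split."""
    splits = {"train": [], "validation": []}
    for record in records:
        split = assign_split(record["speaker_id"], validation_percent)
        splits[split].append(record)
    return splits


if __name__ == "__main__":
    # Hypothetical records; only speaker_id matters for the split.
    demo = [
        {"file_name": "a.wav", "speaker_id": "spk_001"},
        {"file_name": "b.wav", "speaker_id": "spk_001"},
        {"file_name": "c.wav", "speaker_id": "spk_002"},
    ]
    result = split_by_speaker(demo)
    print({name: len(items) for name, items in result.items()})
```

Hashing the identifier rather than shuffling keeps the assignment reproducible across runs and across machines, at the cost of split sizes that only approximate the requested percentage.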


Cite this Work

@dataset{Mobiusi2026,
  title={Accompanied Communication Chinese Voice Dialogue Dataset},
  author={MOBIUSI INC},
  year={2026},
  url={https://www.mobiusi.com/datasets/e4500779e4d108d3ae88f6e208ac3451?dataset_task_cate_id=6},
  urldate={2026-02-04},
  keywords={TTS Dataset, Speech Synthesis, Audio Dialogue Dataset, Accompanied Communication},
  version={1.0}
}

Using this in research? Please cite us.
