Friday, 16 May 2025
  • My Feed
  • My Interests
  • My Saves
  • History
  • Blog
Subscribe
Capernaum
  • Finance
    • Cryptocurrency
    • Stock Market
    • Real Estate
  • Lifestyle
    • Travel
    • Fashion
    • Cook
  • Technology
    • AI
    • Data Science
    • Machine Learning
  • Health
    HealthShow More
    Eating to Keep Ulcerative Colitis in Remission 
    Eating to Keep Ulcerative Colitis in Remission 

    Plant-based diets can be 98 percent effective in keeping ulcerative colitis patients…

    By capernaum
    Foods That Disrupt Our Microbiome
    Foods That Disrupt Our Microbiome

    Eating a diet filled with animal products can disrupt our microbiome faster…

    By capernaum
    Skincare as You Age Infographic
    Skincare as You Age Infographic

    When I dove into the scientific research for my book How Not…

    By capernaum
    Treating Fatty Liver Disease with Diet 
    Treating Fatty Liver Disease with Diet 

    What are the three sources of liver fat in fatty liver disease,…

    By capernaum
    Bird Flu: Emergence, Dangers, and Preventive Measures

    In the United States in January 2025 alone, approximately 20 million commercially-raised…

    By capernaum
  • Sport
  • 🔥
  • Cryptocurrency
  • Data Science
  • Travel
  • Real Estate
  • AI
  • Technology
  • Machine Learning
  • Stock Market
  • Finance
  • Fashion
Font ResizerAa
CapernaumCapernaum
  • My Saves
  • My Interests
  • My Feed
  • History
  • Travel
  • Health
  • Technology
Search
  • Pages
    • Home
    • Blog Index
    • Contact Us
    • Search Page
    • 404 Page
  • Personalized
    • My Feed
    • My Saves
    • My Interests
    • History
  • Categories
    • Technology
    • Travel
    • Health
Have an existing account? Sign In
Follow US
© 2022 Foxiz News Network. Ruby Design Company. All Rights Reserved.
Home » Blog » Researchers from Dataocean AI and Tsinghua University Introduces Dolphin: A Multilingual Automatic Speech Recognition ASR Model Optimized for Eastern Languages and Dialects
AIMachine LearningTechnology

Researchers from Dataocean AI and Tsinghua University Introduces Dolphin: A Multilingual Automatic Speech Recognition ASR Model Optimized for Eastern Languages and Dialects

capernaum
Last updated: 2025-04-04 03:25
capernaum
Share
Researchers from Dataocean AI and Tsinghua University Introduces Dolphin: A Multilingual Automatic Speech Recognition ASR Model Optimized for Eastern Languages and Dialects
SHARE

Automatic speech recognition (ASR) technologies have advanced significantly, yet notable disparities remain in their ability to accurately recognize diverse languages. Prominent ASR systems, such as OpenAI’s Whisper, exhibit pronounced performance gaps when processing Eastern languages compared to Western counterparts. This discrepancy presents tangible challenges in multilingual regions, particularly those characterized by numerous dialects and linguistic variations, underscoring the necessity for sophisticated multilingual ASR systems tailored specifically to Eastern languages.

Researchers from Dataocean AI and Tsinghua University have introduced Dolphin, a comprehensive multilingual automatic speech recognition model built upon an extended Whisper architecture, optimized to accommodate a broader spectrum of Eastern languages and dialects. Dolphin effectively addresses key limitations identified in current multilingual ASR models by integrating both proprietary datasets and publicly accessible datasets. The model proficiently supports 40 Eastern languages from East Asia, South Asia, Southeast Asia, and the Middle East, as well as 22 distinct dialects of Chinese.

Dolphin employs a hybrid ASR approach combining Connectionist Temporal Classification (CTC) with attention-based mechanisms. Its architecture incorporates an E-Branchformer encoder and a Transformer decoder, substantially enhancing the model’s capability to interpret complex linguistic patterns across diverse languages. Dolphin also utilizes a dual-level language tokenization system, distinguishing general language codes from region-specific dialect tokens. This mechanism improves recognition accuracy and resolution, particularly for dialect-intensive languages such as Chinese. Additionally, Dolphin incorporates a 4× subsampling layer to efficiently reduce input sequence lengths, enhancing computational speed and training effectiveness without compromising recognition accuracy.

Experimental evaluations demonstrate Dolphin’s marked improvements in multilingual speech recognition accuracy relative to Whisper models. For instance, the Dolphin small model reduced the Word Error Rate (WER) by approximately 24.5% compared to the base model, with further incremental improvements observed in medium and large variants. Specifically, the Dolphin base model attained an average WER of 31.8%, notably outperforming Whisper’s large-v3 model, which recorded an average WER of 52.3% across the same evaluation benchmarks. Assessments conducted on dialect-focused datasets, including KeSpeech, confirmed Dolphin’s capability to consistently handle intricate linguistic variations, with performance enhancements correlating positively with increased model size.

The research team released the Dolphin base and small models publicly under the Apache 2.0 license, along with associated inference code. Dolphin’s training utilized an extensive dataset encompassing 21.2 million hours of audio recordings, incorporating 7.4 million hours derived from open datasets such as Common Voice, ReazonSpeech, and GigaSpeech2, thereby ensuring robustness and replicability.

In summary, Dolphin constitutes a significant advancement in multilingual ASR technology, systematically addressing prevailing limitations in Eastern language and dialect recognition through methodological data integration, refined architectural frameworks, and commitment to open-source dissemination. This work sets an influential benchmark for future developments in multilingual ASR research, advancing linguistic inclusivity and system generalization.


Check out the Paper, Dolphin-small-model and Dolphin-base-model. All credit for this research goes to the researchers of this project. Also, feel free to follow us on Twitter and don’t forget to join our 85k+ ML SubReddit.

🔥 [Register Now] miniCON Virtual Conference on OPEN SOURCE AI: FREE REGISTRATION + Certificate of Attendance + 3 Hour Short Event (April 12, 9 am- 12 pm PST) + Hands on Workshop [Sponsored]

The post Researchers from Dataocean AI and Tsinghua University Introduces Dolphin: A Multilingual Automatic Speech Recognition ASR Model Optimized for Eastern Languages and Dialects appeared first on MarkTechPost.

Share This Article
Twitter Email Copy Link Print
Previous Article This AI Paper Introduces FASTCURL: A Curriculum Reinforcement Learning Framework with Context Extension for Efficient Training of R1-like Reasoning Models This AI Paper Introduces FASTCURL: A Curriculum Reinforcement Learning Framework with Context Extension for Efficient Training of R1-like Reasoning Models
Next Article Cardano Price Can Clinch $1 As It Eyes Bounce From New Support Zone Cardano Price Can Clinch $1 As It Eyes Bounce From New Support Zone
Leave a comment

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Your Trusted Source for Accurate and Timely Updates!

Our commitment to accuracy, impartiality, and delivering breaking news as it happens has earned us the trust of a vast audience. Using RSS feeds, we aggregate news from trusted sources to ensure real-time updates on the latest events and trends. Stay ahead with timely, curated information designed to keep you informed and engaged.
TwitterFollow
TelegramFollow
LinkedInFollow
- Advertisement -
Ad imageAd image

You Might Also Like

A Guide to Mastering Serverless Machine Learning

By capernaum
Windsurf has launched homegrown AI models despite OpenAI deal
Data Science

Windsurf has launched homegrown AI models despite OpenAI deal

By capernaum
Is TikTok’s new meditation push a real safety play or just good PR optics?
Data Science

Is TikTok’s new meditation push a real safety play or just good PR optics?

By capernaum

The surprising true story behind that viral Apple App Store payment warning

By capernaum
Capernaum
Facebook Twitter Youtube Rss Medium

Capernaum :  Your instant connection to breaking news & stories . Stay informed with real-time coverage across  AI ,Data Science , Finance, Fashion , Travel, Health. Your trusted source for 24/7 insights and updates.

© Capernaum 2024. All Rights Reserved.

CapernaumCapernaum
Welcome Back!

Sign in to your account

Lost your password?