Google and African researchers start the WAXAL speech dataset
[QUOTE="Queen, post: 86093, member: 27"]
Voice tech finally stopped ghosting African languages after a massive open dataset tackled the data drought head-on.

Why voice tech kept failing locally
[LIST]
[*]Many devices choke on African languages.
[*]Over 2,000 tongues lack usable speech data.
[*]Sub-Saharan users get locked out of convenience.
[*]Data scarcity stayed the core blocker.
[/LIST]

What WAXAL brings to the table
[LIST]
[*]WAXAL rolled out as a large-scale speech dataset.
[*]Named after the Wolof word for "speak".
[*]Covers 21 African languages.
[*]Built to unlock inclusive voice systems.
[/LIST]

What the dataset actually contains
[LIST]
[*]Nearly two million recordings fuel the corpus.
[*]Total audio clears 11,000 hours.
[*]Roughly 1,250 hours are fully transcribed.
[*]Studio speech supports text-to-speech work.
[/LIST]

Who built it together
[LIST]
[*]Makerere University gathered language data.
[*]The University of Ghana supported 13 languages.
[*]Digital Umuganda led work on five languages.
[*]The African Institute for Mathematical Sciences added multilingual datasets.
[/LIST]

Studio and quality control work
[LIST]
[*]Media Trust helped produce clean voice recordings.
[*]Loud n Clear handled professional audio capture.
[*]Everyday speech is balanced with studio voices.
[*]Ethical collection stayed a priority.
[/LIST]

Data ownership and access rules
[LIST]
[*]Contributors keep rights to their data.
[*]Researchers worldwide get open access.
[*]Sharing does not erase local control.
[*]Collaboration stays balanced.
[/LIST]

Why it matters long term
[LIST]
[*]Voice tools can finally serve local users.
[*]Language preservation gains a digital backup.
[*]AI systems learn from real speech.
[*]The dataset is live under an open license.
[/LIST]
[/QUOTE]