Victoria Buboc

October 26, 2023

We launched v4 of Vatis Tech

The new version of Vatis Tech comes with 95.89% average accuracy and a +29.51% accuracy improvement over the previous version.

At Vatis, we understand the importance of accurate and efficient speech recognition technology in today's fast-paced world. Version 4 of Vatis is a major upgrade over our previous version, Version 3, and we're excited to share the results of our latest studies and tests with you.

However, it's important to note that the accuracy of speech recognition systems can vary based on several factors, such as the quality of the audio, the accent and speaking style of the speaker, background noise, and the vocabulary used. With this in mind, our evaluation used a variety of audio samples from 29 datasets in 7 key domains:

  • Call center,
  • Medical,
  • Legal,
  • Meeting,
  • Ads,
  • Business,
  • Media.

Each of these domains has a unique set of requirements for speech recognition technology, and Vatis v4 has been designed to meet these needs with the best possible performance.

The tests covered:

  • 29 datasets
  • 7 domains
  • 38,937 samples
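
The results in this article are reported as accuracy and Word Error Rate (WER). The article does not describe how Vatis computes these metrics, so the following is only a minimal sketch of the standard textbook definition: WER is the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognized text, divided by the number of reference words. The function names and the convention accuracy = 1 - WER are assumptions made for illustration, not the Vatis evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: whitespace tokenization and exact word matching; real
    evaluations usually normalize casing and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed convention: accuracy = 1 - WER, floored at 0."""
    return max(0.0, 1.0 - word_error_rate(reference, hypothesis))
```

For example, a hypothesis that drops one word from a six-word reference has one deletion, so its WER under this definition is 1/6.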

Accuracy in Key Domains:

1. Phone call:

Whether you're an individual who needs to transcribe a phone call or a large call center that handles hundreds of calls per day and needs to analyze that data, automated speech-to-text technology is the best solution for you.

Thanks to advancements in our model, V4 reaches 95.00% accuracy in the phone call domain, making it a reliable solution for a wide range of applications.

With a voice-to-text solution, a call center can:

  • increase productivity
  • improve long-term performance
  • reduce operating costs

In this study, we used a dataset of phone calls from a call center, containing both agent and client audio in Romanian. The results showed considerably better audio-to-text transcription accuracy than the previous version.

The Word Error Rate (WER) improvement compared to V3 is +64.96%.
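
The article does not state how the WER improvement figures are calculated. One common reading is a relative reduction: the drop in WER from the old version to the new one, expressed as a percentage of the old WER. The sketch below assumes that reading; the function name and the input values in the usage note are hypothetical and are not measurements from this study.

```python
def relative_wer_improvement(wer_old: float, wer_new: float) -> float:
    """Relative WER reduction in percent (assumed definition).

    A positive result means the new version makes fewer word errors;
    a negative result means it makes more.
    """
    if wer_old <= 0:
        raise ValueError("wer_old must be greater than 0")
    return (wer_old - wer_new) / wer_old * 100.0
```

With hypothetical inputs, a WER that falls from 0.20 to 0.05 is a relative improvement of about 75%, even though the absolute change is only 15 percentage points. This is why a relative WER improvement can look much larger than the corresponding change in accuracy.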

2. Medical:

An audio-to-text converter can play a crucial role in the medical industry by streamlining clinical documentation. With improved accuracy in version 4, Vatis helps ensure that critical medical information is captured correctly, reducing the risk of medical errors.

Vatis reaches 94.41% accuracy on a dataset of diagnosis recordings from doctors and assistants.

The WER improvement compared to V3 is +79.82%.

3. Legal:

Speed and accuracy are critical in courts, as they can have a significant impact on the outcome of a case. Manual transcription can be slow and prone to errors, making it difficult to provide timely and accurate information. Incomplete or incorrect transcription can also undermine trust in the transcript, which can compromise the integrity of the proceedings.

In such scenarios, automated speech recognition technology can provide a solution by transcribing large volumes of audio recordings in a short amount of time and with high accuracy.

Version 4 of Vatis has improved accuracy on legal transcripts, reaching 90.65% on a dataset of poor-quality audio recorded in Romanian courts with standard microphones.

The WER improvement compared to V3 is +43.75%.

4. Meeting:

Imagine having an accurate, good-quality transcript of every meeting. Sounds good, right?

To test a real-world situation, we used a dataset of online meetings recorded with laptop microphones, so the audio was noisy and of poor quality. The meetings contain technical and business terms.

The WER improvement compared to V3 is +29.98%.

5. Ads:

Converting audio to text can help advertising businesses and media monitoring companies by automating the transcription of audio content. With improved accuracy in version 4, Vatis makes it easier for businesses to analyze and understand audio data, leading to more effective advertising strategies.

Our dataset contains short audio clips from TV stations that mention brand names. This time, the audio was high quality, with background music.

The WER improvement compared to V3 is +68.33%.

6. Business:

Automated speech-to-text technology is becoming increasingly popular among businesses for transcribing audio from calls, meetings, and conferences. The technology allows companies to efficiently capture audio data and convert voice to text, providing a quick summary of the content.

This can be especially useful for businesses as it can help them to save time, reduce errors, and gain insights from their audio data more effectively.

With improved accuracy in version 4, Vatis makes it easier for businesses to analyze and understand audio data, leading to more informed decisions.

Dataset info: good-quality audio containing business terms.

The WER improvement compared to V3 is +30.32%.

7. Media:

The application of speech to text solutions in the media industry has grown significantly in recent years, as media organizations aim to improve the quality of their content by transcribing audio and video to text. This allows them to recognize and understand human speech and extract insights from their audio data more effectively.

Our dataset for the media industry included good-quality audio from Romanian TV and radio stations.

The WER improvement compared to V3 is +22.20%.

To recap: version 4 of Vatis Tech has demonstrated significant improvements in accuracy and word error rate, and showed the best accuracy rate for the Romanian language in each of the 7 domains described in this article, with 95.89% average accuracy.

From the call center to the medical industry, voice recognition technology is essential for businesses looking to increase productivity and make informed decisions.

With Vatis, companies can access accurate speech recognition technology, helping them to unlock the full potential of their audio data.

Try Vatis for free and experience the power of an online audio-to-text converter for yourself.

