Claudia Ancuta

October 31, 2023

A Comprehensive Guide to Captions

Chapter 1: Introduction to Captions

What Are Captions? Captions are text representations of the spoken words and essential sounds in video or audio content. They make the content accessible to people who are deaf or hard of hearing, and they also help people who are watching videos in a noisy environment or learning a new language.

Unlike subtitles, which primarily cater to audiences who don't understand the language in a video, captions cater to audiences who can't hear or properly discern the audio.

Use Case: Watching a news clip in a noisy environment without headphones. Captions would allow one to understand the content without needing to hear it.

Chapter 2: Quality of Captions

What Constitutes High-Quality Captions? Quality in captions involves:

  • Accuracy: Captions should precisely reflect the spoken words and convey essential sounds.
  • Synchronization: They should align with the audio, without lagging behind or racing ahead.
  • Completeness: Captions should run from the beginning to the end of the content.
  • Readability: Captions should be easily readable, using clear fonts and appropriate sizes.

Use Case: In an educational video, high-quality captions ensure that all students, including those with hearing impairments, get accurate and timely information.
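
One common way to put a number on the accuracy criterion above is word error rate (WER): the word-level edit distance between a trusted reference transcript and the captions, divided by the number of reference words. The sketch below is a minimal illustration, not a standard tool; the function name and the sample sentences are assumptions made for this example, and it does no punctuation normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from captions
                curr[j - 1] + 1,     # extra word in captions
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up sample: one dropped word ("the") and one wrong word ("word").
reference = "captions should precisely reflect the spoken words"
hypothesis = "captions should precisely reflect spoken word"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.2%}, accuracy: {1 - wer:.2%}")
```

An accuracy figure such as 95% roughly corresponds to a WER of 5%, so a measurement like this can help decide how much manual correction a caption file still needs.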

Chapter 3: Creating and Adding Captions

How to Add Captions? There are different ways to add captions:

  • Manual Entry: Using video editing software to manually type in captions synchronized with the audio.
  • Automated Software: Tools that automatically transcribe and sync the audio with text.
  • Professional Services: Hiring professionals who specialize in creating accurate captions.

Use Case: For a corporate webinar, using a combination of automated software transcription and manual adjustments for accuracy would be a practical approach.

Chapter 4: Formats and Types of Captions

Various file formats support captions:

  • SRT (SubRip Subtitle): One of the most common caption formats, compatible with most platforms.
  • VTT (Web Video Text Tracks): Common for web-based videos.
  • SCC (Scenarist Closed Captions): Used for broadcast media.
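
To make the difference between the first two formats concrete, the sketch below converts a small SRT sample to WebVTT. The two are close relatives: a WebVTT file starts with a `WEBVTT` header line and writes milliseconds after a period instead of a comma. The function name and the sample cues are assumptions made for this illustration, and the conversion ignores styling and positioning features.

```python
import re

# A made-up two-cue SRT file: index, timing line, caption text, blank line.
SRT_SAMPLE = """1
00:00:01,000 --> 00:00:04,000
Welcome to the webinar.

2
00:00:04,500 --> 00:00:07,250
[applause]
"""

# SRT timestamps use a comma before the milliseconds; WebVTT uses a period.
TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert basic SRT text to WebVTT."""
    lines = []
    for line in srt_text.strip().splitlines():
        # Only rewrite timing lines, so caption text is left untouched.
        if "-->" in line:
            line = TIMESTAMP.sub(r"\1.\2", line)
        lines.append(line)
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"


vtt_text = srt_to_vtt(SRT_SAMPLE)
print(vtt_text)
```

The numeric SRT indices are kept because WebVTT accepts them as optional cue identifiers.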

Types of Captions

  • Open Captions: Captions that are permanently part of the video.
  • Closed Captions: Captions that viewers can enable or disable.

Chapter 5: Services and Pricing


  • Automated services: the best automated systems can reach accuracy rates of up to 90-95% under optimal conditions (clear audio, standard accents, no background noise). Accuracy can drop significantly with background noise, multiple speakers talking over each other, heavy accents, or specialized terminology.
  • Professional services: experts skilled in crafting precise captions. They typically charge per minute of content and, because of the human labor involved, are more expensive than automated solutions.


  • Per Minute/Hour: many services charge per minute or hour of audio/video processed. This can range from as low as $0.01 per minute to several dollars per minute, depending on the service.
  • Subscription Plans: some platforms offer monthly or yearly subscription plans with a set number of hours included.
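
The per-minute model is simple arithmetic, and a short sketch can help compare options for a given video. The rates below are hypothetical values inside the range mentioned above, chosen only to illustrate the calculation; they are not quotes from any provider.

```python
from decimal import Decimal


def per_minute_cost(minutes: int, rate_per_minute: str) -> Decimal:
    """Cost of captioning `minutes` of content at a flat per-minute rate."""
    return (Decimal(rate_per_minute) * minutes).quantize(Decimal("0.01"))


# Hypothetical rates in dollars per minute (assumptions, not real prices).
AUTOMATED_RATE = "0.10"
PROFESSIONAL_RATE = "1.50"

webinar_minutes = 45
automated = per_minute_cost(webinar_minutes, AUTOMATED_RATE)
professional = per_minute_cost(webinar_minutes, PROFESSIONAL_RATE)

print(f"Automated:    ${automated}")
print(f"Professional: ${professional}")
```

`Decimal` is used instead of floating point so that currency amounts are not affected by binary rounding.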

Chapter 6: Benefits of Captions

  • Accessibility for Deaf and Hard of Hearing People: captions provide a text version of the audio content, making videos and other audio-visual content accessible to people who are deaf or hard of hearing.
  • Compliance with Regulations: many jurisdictions require captions for certain types of content, especially public broadcasts and educational materials, to ensure they are accessible to everyone.
  • Enhanced Learning and Comprehension: especially in educational settings, captions can aid in understanding and retaining information. They can be particularly helpful for complex topics or when unfamiliar jargon is used.
  • Multitasking and Noisy Environments: in places with a lot of background noise, such as gyms or cafes, or even in quiet settings like offices where people might not want to disturb others, captions allow viewers to follow along without needing sound.
  • Language Learning: for people learning a new language, captions in that language can help improve listening comprehension and vocabulary acquisition.
  • SEO Benefits: videos with captions are more likely to be picked up by search engines because the caption text can be indexed. This can drive more traffic to websites or platforms hosting the videos.
  • Wider Audience Reach: captions can expand the audience of a video. Not everyone watches videos with sound, especially on social media platforms where videos might autoplay on mute.
  • Clarification: if the audio quality is poor, or if speakers have strong accents that some viewers might find difficult to understand, captions can help clarify the spoken content.
  • Improved Engagement: studies have shown that videos with captions often have better engagement metrics, including longer watch times.
  • Flexibility for Viewers: some people simply prefer reading captions while listening, especially if they are visual learners or if they're watching content in a setting where they can't use headphones.

Chapter 7: Captions vs. Subtitles

  • Captions: Primarily for people with hearing impairments. They include all relevant audio cues.
  • Subtitles: Translations for audiences who speak a different language. They usually do not include non-dialogue audio cues.

Chapter 8: Legal Aspects of Captions

Various laws mandate captions to ensure accessibility.

In the European Union, several legislative instruments highlight the importance of captions (often referred to as "subtitles" in the context of language translation) to ensure accessibility:

  • The European Accessibility Act (EAA) of 2019 promotes an accessible internal market, ensuring services like TV have captions.
  • The 2016 Web Accessibility Directive requires public sector bodies in the EU to make web content, including videos, accessible in line with the WCAG guidelines.
  • The 2018 Audiovisual Media Services Directive (AVMSD) aims for uniformity in audiovisual media while advocating for disability accessibility.
  • The 2018 European Electronic Communications Code (EECC) emphasizes equal access for all, especially people with disabilities.

EU guidelines set the framework, but member states determine specific regulations. Entities in Europe should adhere to both EU directives and country-specific laws.

United States:

  • ADA (1990): Prohibits discrimination against individuals with disabilities; mandates effective communication methods like captioning.
  • CVAA (2010): Modernizes federal communication law, requires captioning for certain online video content.
  • Rehabilitation Act (1973), Section 508: Federal agencies must ensure electronic and information technology is accessible, including video captions.
  • FCC: Regulates caption quality and inclusion on TV.

Other Regions and Countries:

  • Australia: Broadcasting Services Act 1992 mandates captioning for TV.
  • Canada: Broadcasting Act ensures TV programming accessibility with CRTC regulating captioning.
  • India: Rights of Persons with Disabilities Act, 2016 pushes for accessible TV programs.
  • UK: Communications Act 2003 requires a set percentage of TV programs to have captions.

The legal mandates underscore the importance of inclusive communication and equal access.

Chapter 9: Conclusion

Captions are essential for making audio-visual content accessible to all, including those with hearing impairments. Their inclusion is not just a matter of legal compliance but also a sign of inclusivity and empathy towards a diverse audience. Whether you're a content creator, educator, or marketer, understanding and implementing high-quality captions can only enhance the reach and impact of your content.

This guide provides an overview of captions, but the world of accessibility is vast and ever-evolving. It's always good to stay updated with the latest trends and regulations to ensure everyone can access and enjoy your content.

If you need software to automatically transcribe your audio or video to text, you can try our captioning solution for free. The platform is easy to use, and every trial account comes with 60 minutes of free transcription, without any commitment. No credit card needed.

  1. Drag and drop your video.
  2. Choose the language you want for transcription.
  3. Choose speaker identification if your video has multiple speakers.
  4. Check your transcript.
  5. Edit your transcript inside the platform. Make sure to proofread your captions carefully to ensure that they are accurate and easy to read.
  6. Fine-tune timestamps for greater accuracy. 
  7. Download your transcript in SRT format. 
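
Proofreading and timestamp checks on the downloaded SRT file can be partly scripted. The sketch below parses SRT text and flags cues that overlap the previous cue or that may be too fast to read. It is a rough aid, not a feature of any platform: the 20 characters-per-second limit is an assumed threshold, and the sample cues and all function names are made up for this illustration.

```python
import re
from dataclasses import dataclass

CUE_TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)
MAX_CHARS_PER_SECOND = 20  # assumed readability threshold


@dataclass
class Cue:
    index: int
    start: float  # seconds
    end: float    # seconds
    text: str


def to_seconds(hours, minutes, seconds, millis) -> float:
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_srt(srt_text: str) -> list:
    """Parse well-formed SRT text into a list of Cue objects."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):  # blank line between cues
        lines = block.splitlines()
        match = CUE_TIMING.search(lines[1]) if len(lines) >= 3 else None
        if match is None:
            continue  # skip blocks that do not look like a cue
        parts = match.groups()
        cues.append(Cue(int(lines[0]), to_seconds(*parts[:4]),
                        to_seconds(*parts[4:]), " ".join(lines[2:])))
    return cues


def review_cues(cues: list) -> list:
    """Return human-readable warnings for cues that need a second look."""
    warnings = []
    for previous, cue in zip([None] + cues, cues):
        duration = cue.end - cue.start
        if duration <= 0:
            warnings.append(f"cue {cue.index}: end time is not after start time")
        elif len(cue.text) / duration > MAX_CHARS_PER_SECOND:
            warnings.append(f"cue {cue.index}: text may be too fast to read")
        if previous is not None and cue.start < previous.end:
            warnings.append(f"cue {cue.index}: overlaps cue {previous.index}")
    return warnings


SAMPLE = """1
00:00:00,000 --> 00:00:02,000
Hello and welcome.

2
00:00:01,500 --> 00:00:02,000
This cue starts early and packs in far too much text.
"""

warnings = review_cues(parse_srt(SAMPLE))
for warning in warnings:
    print(warning)
```

Any warning is only a prompt to review that cue in the editor; the final judgment on wording and timing still belongs to a human proofreader.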

Check out our video tutorial for step-by-step guidance: How to use speech-to-text for publishing media

Upload these subtitle files to YouTube, Facebook, Vimeo, CapCut, and other video players to make your videos instantly accessible with captions.

Curious about adding captions to an Instagram Reel? Take a look at this article:
