Upload a new file
From your account files page, press the green "New file" button. In the modal that pops up, press the "Press to upload" input, then select the file you want to upload from your computer.
(Optional) Upload a new file by drag and drop
There is also an easier way to upload a file. Once again, you need to be on your account files page. On this page, simply drag the file you want to upload and drop it. That's it!
Select the language
After you have successfully selected the file you want to upload, using either the drag-and-drop method or the button, you need to select the language of the file. We will assume it is in English, since this tutorial is about transcribing English audio or video. To do this, press the input below the "Select model for transcription" text. In the dropdown that appears, choose English.
Select the model
We are constantly developing new models for both new and existing languages. Model names are usually one or two descriptive words. For example, "General" refers to a model for common speech. "Legal" refers to a model that is strong on legal vocabulary, for lawyers' meetings or trials. "Medical" is a model specialized in medical terms, for example for medical exams, operating rooms, or even medical school courses. And so on. A model might also have v1, v2, v3, etc. attached to its name. That is the version of the model; usually, the higher the number, the better the model.
Please note that not all models have all of the features listed below.
- "Post-processing" switch: If enabled (i.e. the switch is green), this will automatically add punctuation (,.!?) and capitalization (the first letter of each sentence will be capitalized) to your transcript. It will add entity recognition (i.e. brand names, person names, national holidays, etc. will be capitalized). It will also add a numeral conversion layer, which will try, based on context, to rewrite numbers from words to digits (e.g. "thirteen" will become 13, or "two" will become 2). It will also remove disfluencies (e.g. grunts or non-lexical utterances such as "huh", "uh", "erm", "um", "hmm", etc.).
- "Speakers Diarization" switch: If enabled (i.e. the switch is green), a speaker recognition model will be applied to your transcript. This means that the paragraphs of the resulting transcript will be split based on which speaker is speaking at a given time.
- "Multiple Channels" switch: If enabled (i.e. the switch is green), a channel-based speaker recognition model will be applied to your transcript. This means that the paragraphs of the resulting transcript will be split based on the channels of the file. This is very useful for files that come from call centers, as these always have two channels: the client channel and the agent channel. Please note that you may use only one of the "Speakers Diarization" and "Multiple Channels" switches at a time.
- "Add words to your custom vocabulary" input: This is useful when you want to tell our models to watch out for certain words and, if they are found in your transcript, to keep them as they are (e.g. the words "ate" and "eight" sound the same. By writing "ate" in this input, you tell our model that when it hears either "ate" or "eight", you want to keep "ate"). Note that you can add multiple words.
- "Select a boost param for these words" selector/input: This is tied to the previous option. The higher this number is, the higher the chance that the word you want will be kept. Once again, we will use the words "ate" and "eight". When our model checks the sound of a word, it gives each candidate an accuracy. If the accuracy of the word the model would otherwise pick is lower than your boost param, then the word you added to the above input will be chosen instead. For example, our model gives "ate" an accuracy of 4.73 and "eight" an accuracy of 8.31. You add the word "ate" in the above input and give it a boost of 9. In this case, the model will choose the word "ate" with an accuracy of 4.73 over the word "eight" with an accuracy of 8.31. If you had set the boost to 6, then the model would have chosen the word "eight". Note that all words will have the same boost.
- "Save as default configuration" checkbox: If you check this box, then everything in the upload modal (from language to boost param) will be kept for the next time you upload a new file, so you won't have to set the switches again, re-add your custom vocabulary, etc.
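To make the boost param rule from the list above concrete, here is a minimal Python sketch of the decision as it is described in this tutorial. It is only an illustration, not the actual implementation of our models: the function name `pick_word` and its parameters are hypothetical, and the accuracy values are the ones from the "ate"/"eight" example.

```python
def pick_word(accuracies, custom_words, boost):
    """Illustrative only: choose which word to keep in the transcript.

    accuracies   -- dict mapping each candidate word to the accuracy the model gave it
    custom_words -- set of words added to the custom vocabulary
    boost        -- the single boost param shared by all custom words
    """
    # The word the model would pick on its own: the one with the highest accuracy.
    best = max(accuracies, key=accuracies.get)
    if best in custom_words:
        return best

    # Custom vocabulary words that are among the candidates the model heard.
    custom_candidates = [w for w in accuracies if w in custom_words]

    # If the model's own pick scores lower than the boost, prefer the custom word.
    if custom_candidates and accuracies[best] < boost:
        return max(custom_candidates, key=accuracies.get)
    return best


heard = {"ate": 4.73, "eight": 8.31}
print(pick_word(heard, {"ate"}, boost=9))  # boost 9 is above 8.31
print(pick_word(heard, {"ate"}, boost=6))  # boost 6 is below 8.31
```

With a boost of 9 the sketch keeps "ate", and with a boost of 6 it keeps "eight", matching the example in the boost param description.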
Send the file with its options
After choosing the options that you want or need, just press the green "Upload" button and wait for the file to be sent to our servers, where it will be transcribed automatically. After that, you can sit back and relax. Note that a file is usually transcribed in about a quarter of its length (e.g. if the file is 4 hours long, it will be done in about 1 hour).
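As a quick way to apply the "quarter of its length" rule of thumb, here is a small Python sketch. The function name and the `ratio` parameter are assumptions made for illustration, and the result is only a rough estimate, since actual transcription time can vary.

```python
def estimated_transcription_minutes(file_minutes, ratio=0.25):
    """Rough estimate: a file is usually transcribed in about a quarter of its length."""
    return file_minutes * ratio


# A 4-hour file (240 minutes) gives an estimate of about 60 minutes.
print(estimated_transcription_minutes(240))
```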