- A fully managed automatic speech recognition (ASR) service.
- Converts speech into text.
- It supports a wide variety of audio coding formats such as WAV, MP3, MP4, FLAC, AMR, AMR-WB, Ogg, and WebM.
- It can process batch and streaming transcriptions.
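As a rough illustration of what a batch transcription request might look like, the sketch below builds the keyword arguments for boto3's `start_transcription_job` call. The helper name `build_batch_job_request`, the job name, and the S3 URI are hypothetical. The set of accepted format strings is an assumption drawn from the list above, with AMR-WB files assumed to be submitted as `amr`.

```python
# Assumed MediaFormat values, based on the formats listed above.
SUPPORTED_FORMATS = {"wav", "mp3", "mp4", "flac", "amr", "ogg", "webm"}


def build_batch_job_request(job_name, media_uri, media_format, language_code="en-US"):
    """Return keyword arguments shaped for start_transcription_job (hypothetical helper)."""
    media_format = media_format.lower()
    if media_format not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported media format: {media_format}")
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "MediaFormat": media_format,
        "LanguageCode": language_code,
    }


# Hypothetical job name and bucket, for illustration only.
request = build_batch_job_request(
    "support-call-0001", "s3://example-bucket/calls/call-0001.mp3", "MP3"
)
print(request)

# With AWS credentials configured, the request could then be submitted with:
#   import boto3
#   boto3.client("transcribe").start_transcription_job(**request)
```

Building the arguments separately from the API call keeps the sketch runnable without credentials and makes the request easy to inspect before it is sent.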
Common Use Cases
- Transcribing customer calls
- Meeting transcription
- Closed captioning
- Generating metadata to create a searchable archive
- A confidence score is a value between 0 and 1 that indicates the probability that a given prediction is correct.
- Low-fidelity (lo-fi) is a term used to describe audio recordings that exhibit poor sound quality. The term high-fidelity refers to high-quality audio recordings.
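To make the confidence score concrete, here is a minimal sketch that flags words a reviewer may want to check. It assumes the transcript JSON stores each word under `results.items` with the score as a decimal string between 0 and 1. The sample data, the helper name `low_confidence_words`, and the 0.8 threshold are all made up for illustration.

```python
import json

# Hand-written sample in the assumed shape of a transcript JSON file.
SAMPLE = """
{
  "results": {
    "items": [
      {"type": "pronunciation", "start_time": "0.0", "end_time": "0.4",
       "alternatives": [{"confidence": "0.99", "content": "Hello"}]},
      {"type": "pronunciation", "start_time": "0.4", "end_time": "1.1",
       "alternatives": [{"confidence": "0.42", "content": "Transcribe"}]},
      {"type": "punctuation",
       "alternatives": [{"confidence": "0.0", "content": "."}]}
    ]
  }
}
"""


def low_confidence_words(transcript, threshold=0.8):
    """Return (word, score) pairs whose confidence falls below the threshold."""
    flagged = []
    for item in transcript["results"]["items"]:
        # Punctuation carries no meaningful score, so only spoken words are checked.
        if item.get("type") != "pronunciation":
            continue
        best = item["alternatives"][0]
        score = float(best["confidence"])
        if score < threshold:
            flagged.append((best["content"], score))
    return flagged


flagged = low_confidence_words(json.loads(SAMPLE))
print(flagged)
```

The threshold is a judgment call: a stricter value surfaces more words for manual review, while a looser one trusts more of the output.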
- Automatic content redaction
  - A process that censors sensitive information within the transcript output.
  - Replaces redacted information with the [PII] tag.
- Custom Vocabulary
  - Helps improve accuracy for content that has business-specific terms such as medical or legal terms.
- Vocabulary Filtering
  - Allows you to create a list of words to filter from the transcript.
  - Useful for blocking profanities.
- Multiple speaker recognition
  - Supports identifying up to a maximum of ten speakers.
- Capable of transcribing low-fidelity and high-fidelity audio files.
- Uses machine learning to provide punctuation and grammatical formatting, making the transcription output immediately usable.
- Adds a timestamp to every word, which makes the output easier to use for movie subtitles.
- Includes a confidence score for each result so you can pinpoint the sections that need further editing.
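The word-level timestamps are what make subtitle generation straightforward. The sketch below groups timed words into cues in the common SubRip (`.srt`) layout. The input tuples, the helper names, and the three-words-per-cue grouping are illustrative assumptions, not part of the service.

```python
def to_srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words, words_per_cue=3):
    """Group (text, start, end) tuples into numbered subtitle cues."""
    cues = []
    for i in range(0, len(words), words_per_cue):
        group = words[i:i + words_per_cue]
        start = to_srt_timestamp(group[0][1])   # start of the first word
        end = to_srt_timestamp(group[-1][2])    # end of the last word
        text = " ".join(word[0] for word in group)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


# Made-up words with start and end times in seconds.
words = [
    ("Welcome", 0.0, 0.45),
    ("to", 0.45, 0.6),
    ("the", 0.6, 0.75),
    ("show", 0.75, 1.2),
]
srt = words_to_srt(words)
print(srt)
```

A real subtitle pipeline would more likely break cues at sentence boundaries or pauses; a fixed word count is used here only to keep the example short.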
- Batch and streaming transcription usage is billed monthly at a rate of $0.0004 per second.
- Billed in 1-second increments.
- Minimum charge of 15 seconds per request.
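The pricing rules above can be worked through with a small calculation. This is a sketch using the figures quoted in this list. Rounding partial seconds up to the next whole second is an assumption about how the 1-second increments apply, and the function names are hypothetical.

```python
import math

RATE_PER_SECOND = 0.0004   # USD, as quoted above
MIN_BILLED_SECONDS = 15    # minimum charge per request


def billed_seconds(duration_seconds):
    """Round up to whole seconds (assumed) and apply the 15-second minimum."""
    return max(MIN_BILLED_SECONDS, math.ceil(duration_seconds))


def request_cost(duration_seconds):
    """Estimated charge in USD for one transcription request."""
    return billed_seconds(duration_seconds) * RATE_PER_SECOND


for duration in (4, 90.2, 3600):
    print(f"{duration} s -> billed {billed_seconds(duration)} s, "
          f"${request_cost(duration):.4f}")
```

Under these assumptions, a 4-second clip is charged as 15 seconds, while a one-hour recording is charged for 3,600 seconds, which comes to $1.44.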