TechTarget.com/searchcustomerexperience

https://www.techtarget.com/searchcustomerexperience/definition/speech-recognition

What is speech recognition?

By Paul Kirvan

Speech recognition, or speech-to-text, is the ability of a machine or program to identify words spoken aloud and convert them into readable text. Rudimentary speech recognition software has a limited vocabulary and might only identify words and phrases that are spoken clearly. More sophisticated software can handle natural speech, different accents and various languages.

Speech recognition draws on a broad array of research in computer science, linguistics and computer engineering. Many modern devices and text-focused programs include speech recognition functions to allow for easier or hands-free use. These functions differ from text-to-speech systems, in which the system analyzes text content and converts it into spoken audio.

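Speech recognition and voice recognition are two different technologies and shouldn't be confused. Speech recognition identifies the words being spoken, whereas voice recognition identifies the person who is speaking.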

How does speech recognition work?

Speech recognition systems use computer algorithms to process and interpret spoken words and convert them into text. A software program turns the sound a microphone records into written language that computers and humans can understand, following these four steps:

  1. Analyze the audio.
  2. Break it into parts.
  3. Digitize it into a computer-readable format.
  4. Use an algorithm to match it to the most suitable text representation.
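
The four steps above can be illustrated with a toy sketch. It is not how production speech recognition works: the "audio" is a synthetic tone, the only feature analyzed is a zero-crossing count, and the matching step picks the nearest of two reference templates. The sample rate, frame size and all function names are assumptions made for this illustration.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this toy example)


def tone(freq_hz, seconds=0.1):
    """Stand-in for recorded audio: a pure tone as floats in [-1, 1]."""
    n = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def split_frames(samples, size=200):
    """Break the audio into fixed-size parts (frames)."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def digitize(frame):
    """Quantize each sample to a signed 16-bit integer value."""
    return [round(s * 32767) for s in frame]


def zero_crossings(frame):
    """Analyze a frame: count sign changes, a crude cue for pitch."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))


def recognize(samples, templates):
    """Match each frame to the label whose template feature is closest."""
    labels = []
    for frame in split_frames(samples):
        feature = zero_crossings(digitize(frame))
        best = min(templates, key=lambda label: abs(templates[label] - feature))
        labels.append(best)
    return labels


# Reference features for two hypothetical "sounds": a low and a high tone.
templates = {
    "low": zero_crossings(digitize(tone(200)[:200])),
    "high": zero_crossings(digitize(tone(800)[:200])),
}

# A low tone followed by a high tone is labeled frame by frame.
labels = recognize(tone(200) + tone(800), templates)
print(labels)
```

Real systems replace the zero-crossing count with richer acoustic features and the nearest-template lookup with statistical or neural models, but the overall flow from audio to frames to a best-matching label is similar in spirit.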

Speech recognition software must adapt to the highly variable and context-specific nature of human speech. The software algorithms that process and organize audio into text are trained on different speech patterns, speaking styles, languages, dialects, accents and phrasings. The software also separates spoken audio from background noise that often accompanies the signal.
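
One simple way to picture separating speech from background noise is an energy threshold: frames that are much louder than the quietest frame are treated as speech. The sketch below assumes this technique purely for illustration; the source does not say which method any particular product uses. The frame size, threshold ratio and the synthetic test signal are all hypothetical choices.

```python
import math
import random


def rms(frame):
    """Root-mean-square energy of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def speech_frames(samples, frame_size=160, ratio=3.0):
    """Return indices of frames whose energy exceeds ratio x the noise floor."""
    frames = [samples[i:i + frame_size]
              for i in range(0, len(samples) - frame_size + 1, frame_size)]
    energies = [rms(f) for f in frames]
    # Assumption: at least one frame contains only background noise,
    # so the quietest frame approximates the noise floor.
    noise_floor = min(energies)
    return [i for i, e in enumerate(energies) if e > ratio * noise_floor]


def demo_signal(sample_rate=8000, frame_size=160):
    """Ten frames of low-level noise with a louder tone in frames 3 to 5."""
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    samples = [rng.uniform(-0.05, 0.05) for _ in range(10 * frame_size)]
    for i in range(3 * frame_size, 6 * frame_size):
        samples[i] += 0.5 * math.sin(2 * math.pi * 440 * i / sample_rate)
    return samples


print(speech_frames(demo_signal()))
```

A fixed ratio like this is fragile when the noise level changes over time, which is one reason practical systems rely on trained models rather than a single threshold.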

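To meet these requirements, speech recognition systems use two types of models: acoustic models, which represent the relationship between audio signals and the linguistic units of speech, and language models, which match sounds to likely word sequences to distinguish between words that sound similar.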

Types of speech recognition

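Speech recognition software can be either speaker-dependent or speaker-independent. Speaker-dependent software is trained on the voice of a specific user, whereas speaker-independent software is designed to recognize speech from any speaker.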

There are three types of speech recognition data. Each corresponds to the manner of input.

What applications use speech recognition?

Speech recognition systems have quite a few applications:

What are the features of speech recognition systems?

Good speech recognition programs let users customize them to their needs. The features that enable this include the following components:

What are the different speech recognition algorithms?

The power behind speech recognition features comes from a set of algorithms and technologies. They include the following:

Advantages of speech recognition

There are several advantages to using speech recognition software:

Disadvantages of speech recognition

While convenient, speech recognition technology still has some limitations:

Speech recognition evolution and future

Speech recognition is an evolving technology. It's one of the ways people can communicate with computers with little or no typing. A variety of communications-based business applications capitalize on the convenience and speed of spoken communication that this technology enables.

In the early days of speech recognition, the primary limiting factors were computer processing speed and memory size. Algorithms such as the hidden Markov model (HMM) had been developed and tested in the 1980s, but computers weren't powerful enough to handle compute-intensive automatic speech recognition (ASR). With faster microprocessors, cloud computing and enhanced automation of ASR technologies, those restrictions have largely disappeared.

Continued development of natural language processing (NLP) and large language models -- augmented by AI, machine learning and neural networks -- has dramatically improved ASR performance. Support for multiple languages, accents and unique speech characteristics, plus faster conversion speeds, makes speech recognition an increasingly valuable and viable tool.

Speech recognition programs have advanced greatly over 60 years of development, and they're still improving. As advanced generative AI systems like OpenAI's ChatGPT are widely adopted, they are likely to become closely intertwined with speech recognition technology.

20 Nov 2024
