Realization of a real-time speech recognition system in a home monitoring robot

Summary:

This paper describes the design of the speech recognition system for a home monitoring robot project. By running the DSP, DMA, and ARM Cortex-A8 in parallel and using double buffering, a real-time speech recognition system based on ATK is implemented on embedded Linux. Both the hardware and the software of the system are designed. On the hardware side, the composition of the speech recognition system is explained and the key parts of the schematic are provided. On the software side, a real-time speech recognition method is proposed and the application implementation flow is given. Finally, a speech recognition experiment was carried out with human speakers, and the real-time recognition rate reached 94.67%. The experiment verified the correctness of the system's software and hardware design.
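The double buffering mentioned above can be illustrated with a minimal sketch. The paper does not give its implementation, so everything here is an assumption made for illustration: the names `capture`, `recognise`, `run_double_buffered`, and `FRAME_SIZE` are hypothetical, plain Python threads stand in for the DMA capture and the recognizer, and a simple sum stands in for the real recognition step (no ATK calls are used). The idea shown is only that one buffer is filled while the other is processed.

```python
import queue
import threading

FRAME_SIZE = 4    # samples per buffer (hypothetical, kept tiny for illustration)
NUM_BUFFERS = 2   # double buffering: one buffer is filled while the other is processed


def capture(samples, free_q, filled_q):
    """Producer: stands in for DMA-driven audio capture (assumption)."""
    for start in range(0, len(samples), FRAME_SIZE):
        buf = free_q.get()                  # block until a buffer is free
        chunk = samples[start:start + FRAME_SIZE]
        buf[:len(chunk)] = chunk            # fill the buffer in place
        filled_q.put((buf, len(chunk)))     # hand the filled buffer to the consumer
    filled_q.put(None)                      # sentinel: no more audio


def recognise(free_q, filled_q, results):
    """Consumer: stands in for the recognizer; here it only sums each frame."""
    while True:
        item = filled_q.get()
        if item is None:
            break
        buf, n = item
        results.append(sum(buf[:n]))        # placeholder for real processing
        free_q.put(buf)                     # return the buffer for reuse


def run_double_buffered(samples):
    """Run capture and processing concurrently over two shared buffers."""
    free_q, filled_q = queue.Queue(), queue.Queue()
    for _ in range(NUM_BUFFERS):
        free_q.put([0] * FRAME_SIZE)
    results = []
    producer = threading.Thread(target=capture, args=(samples, free_q, filled_q))
    consumer = threading.Thread(target=recognise, args=(free_q, filled_q, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results
```

Because only two buffers circulate between the queues, the producer blocks whenever both are waiting to be processed. On real hardware the equivalent condition would have to be handled as an overrun rather than a wait.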

Speech is the most common form of communication for human beings and the most desirable way for humans and computers to communicate. Therefore, using voice to communicate with computers has become a hot topic of recent research. Computer understanding of speech is a fascinating and challenging topic in computer science.

In the 1990s, with the advent of the multimedia era, there was an urgent demand for speech recognition systems to move from the laboratory into practical use. Many developed countries, such as the United States, Japan, and South Korea, and well-known companies, such as IBM, Apple, AT&T, and NTT, invested heavily in the practical development of speech recognition systems. IBM developed the Chinese ViaVoice speech recognition system in 1997, and in the following year developed ViaVoice'98, which can recognize regional accents such as Shanghainese, Cantonese, and Sichuan dialects. Products such as voice recognition phones and voice recognition notepads have also appeared on the market, for example Voice Organizer from VPTC in the United States and Parrot from France.

Research on speech recognition in China started late but has developed very quickly in recent years and has closely followed the international level. The state also attaches great importance to it and has included the study of large-vocabulary speech recognition in the "863" plan. The Institute of Acoustics and the Institute of Automation of the Chinese Academy of Sciences, the Department of Electronic Engineering of Tsinghua University, Peking University, and other institutions have achieved high-level research results, such as the speaker-independent continuous speech dictation system and the Chinese speech human-machine dialogue system developed by the Institute of Automation of the Chinese Academy of Sciences, whose recognition rate or system response rate can exceed 90%. In view of the huge future market in China, other countries also attach great importance to the study of Chinese speech recognition. The United States, Singapore, and other places have gathered groups of scholars from the mainland, Taiwan, Hong Kong, and elsewhere, and their research results have reached a very high level.

1 System Design

This paper covers the design of the speech recognition system in the home monitoring robot project. The goal is a robot that can recognize speech and help care for family members with limited mobility. To realize the speech recognition system, a block diagram of its overall structure was designed, as shown in Figure 1.

Figure 1 System overall structure block diagram

1.1 Hardware Design

The functions studied and designed in this paper are applied to a mobile robot. The system therefore needs to be small, power-efficient, and easy to move, and it needs a friendly display interface that is convenient for home users to operate. The speech recognition part requires a processor for the speech recognition algorithm, a speech acquisition circuit, and a speech output circuit, as shown in Figure 2. The processor is mainly responsible for the arithmetic processing of the algorithm and is equivalent to the brain of the robot; the speech acquisition circuit collects the external sound signal and is equivalent to the ear of the robot; the speech output circuit outputs the spoken response and is equivalent to the mouth of the robot.

Figure 2 System hardware structure diagram

1) Speech recognition algorithm processor selection

Given the functional requirements of the system, the commonly used types of speech recognition chip are the single-chip microcomputer (MCU), the DSP, and the SoC (System on Chip). Because a common MCU has limited resources and a slow operation speed, it is not considered as the processor for speech recognition in this design. A DSP contains dedicated components for digital signal processing and offers strong computing power and high precision, but its current price is relatively high. In addition, given the characteristics of this system, the chosen chip must not only have strong computing power suitable for speech recognition but also provide features that a standalone DSP lacks, namely a good user interface and a file system (for storing the recognition maps). Texas Instruments' OMAP3530, a dual-core chip that combines an ARM Cortex-A8 core with a TMS320C64x+ DSP core, is a high-performance member of the OMAP35x family and meets the functional requirements of the system design.

2) Voice codec chip selection

Choosing a suitable speech processing chip is very important for the robot. Because the system uses several power supplies that need to be managed, TI's TPS65930 is well suited as the hardware platform for the audio codec function of the speech recognition part. The chip integrates power management, an ADC, embedded power control (EPC), and a full-featured audio codec. It meets all of the system's power management and audio codec requirements, saves PCB space, and reduces the wiring complexity of a multi-supply hardware design.

3) Circuit design

Because the design is used on a mobile robot, it requires voice input, recognition processing, and voice output. Voice input is acquired with a microphone (sound sensor) and its peripheral circuits. Voice output uses a power amplifier together with a speaker. The schematic of the voice part of the design is shown in Figure 3.

Figure 3 Voice input schematic
