Speech Processing

Crying is an infant's only tool of communication. It is the first sign of life after birth and a part of normal behaviour, and it can be considered a biological alarm system. Infants cry for a variety of reasons, including hunger, pain and sadness [CRYING_INFANT]. All crying sounds seem similar, and often only a mother can distinguish the cry of her own child from that of other infants [Step by step]. Infants cry for the same reason that adults talk: to let others know about their problems and needs. The cry is the only signal that reflects certain states of the infant, such as need or pain [MARKOV MODELS]. Because infant cries seem uniform and indistinguishable to most people, an automated classification system is useful for reaching a reliable decision about why an infant is crying. In general, the cry alerts caregivers to a need. It carries a great deal of information about the infant's needs or health, and this information can be analysed to produce useful results. Determining the exact cause of crying is difficult because crying is the infant's only means of expression. If the infant has a disorder, the cry may differ [ICEICE022]. Various studies have analysed the acoustic characteristics of infant cries for different applications and purposes [IEEEI2012 SUBMISSION]. The research presented here deals with five cry types: hunger, colic, sadness, unhappiness and pain. The system described analyses a crying sound in order to classify the cry event into one of these five types. The sounds are recorded and then correlated. The recordings are fed to their respective processing blocks, each of which determines the reason for the cry [Full text].

The spectrogram was the only tool used for the analysis of infant cries in the 1960s and 1970s. From the beginning of the 1960s, various studies were carried out to find relationships between diseases and cry signal characteristics. Most of these studies relied on human analysis in the time or frequency domain. A spectrogram, which at that time was produced by an analog device, plots time on the x-axis and frequency on the y-axis. It is a visual representation of how the frequency spectrum of a sound or other signal varies with time. The complete study 'Pathological cries, stridors and coughs in infants' in the 1980s was based purely on spectrograms, from which the authors deduced various diseases in infants. Spectrography was also used by Michelson in the 1990s to define healthy and unhealthy cry types. Many attributes of an infant cry can be obtained from a spectrogram, such as the length of the cry, its spectral components and the shape of its melody. Spectrography is still a general tool for the analysis of infant cries, but it contains no information about the exact phase of the signal it represents.
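
The time-frequency picture described above can be sketched digitally with a short-time Fourier transform. The example below is only an illustration in Python using the standard library, not the analog procedure of the early studies; the sample rate, frame length, hop size and the synthetic two-tone test signal are all assumptions chosen so that the result is easy to check. Note that only magnitudes are kept, which is why a spectrogram carries no phase information.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0..N/2 for one frame (phase is discarded)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags

def spectrogram(signal, frame_len=64, hop=32):
    """Return one magnitude spectrum per Hann-windowed frame (time runs along the list)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    columns = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        columns.append(dft_magnitudes(frame))
    return columns

# Assumed synthetic test signal: 500 Hz for the first half, 1000 Hz for the second.
SAMPLE_RATE = 8000
FRAME_LEN = 64
tone = [math.sin(2 * math.pi * (500 if t < 400 else 1000) * t / SAMPLE_RATE)
        for t in range(800)]

spec = spectrogram(tone, frame_len=FRAME_LEN, hop=32)
bin_hz = SAMPLE_RATE / FRAME_LEN  # frequency spacing between DFT bins
# Strongest frequency in each time frame: the "melody" a reader would trace by eye.
peak_hz = [max(range(len(col)), key=col.__getitem__) * bin_hz for col in spec]
print(peak_hz[0], peak_hz[-1])
```

Each entry of `spec` is one vertical slice of the spectrogram; plotting the slices side by side with time on the x-axis and frequency on the y-axis gives the familiar image.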
In 1996, Schonweiler et al. classified the melody shapes of crying into six categories: rising, falling, flat, rising-falling, falling-rising and glottal positive. In 1999, Moller and Schonweiler used subjective methods for the analysis of infant cries: nurses who worked with infants every day were consulted, since they were best placed to report subjective impressions. Schonweiler et al. further found that the duration of the cry signal differs between healthy infants and infants with hearing problems.
In 2002, Wermke et al. published a study on the development of melodies in which both the shape of the melody and the intensity of the cry were investigated together. They found that melody shapes are simple in 8-9 week old infants, whereas older infants produce doubled or tripled melody shapes. In 2003, Rothganger reported changes in the duration and fundamental frequency of crying and babbling signals: the fundamental frequency increased for crying and decreased for babbling, while duration increased for both [VARALLAYAY_2007 PERIODIC]. Lind and Wermke used special hardware and software to analyse the cries of a child from birth until the age of 3 months. They divided the cry signals into two groups according to a duration threshold of 0.8 s and compared the groups, but found no significant difference [MELODY SHAPE].
In 2007, Varallyay introduced a new approach to the objective analysis of infant cries, which allows the cries to be classified automatically. He classified the melody of crying into 77 different categories. In another study, Varallyay et al. found that infants at two months of age have shorter and simpler melodies, while later the duration increases and the melody shapes become more complex [all papers related to biomedical]. Cohen and Zmora developed a system based on the study of the hunger and pain cries of healthy infants [EMBEC02].

In the present study we detect emotions in speech. Emotions are prominent elements that are always present in the human mind. Specifically, we deduce the reason for an infant's crying by means of speech processing.
Speech processing is an application of digital signal processing to the analysis and processing of speech signals, which are handled in digital representation. Signal analysis is done in two ways: time domain analysis and frequency domain analysis. In the time domain, the signal is analysed with respect to time. The Fourier transform converts the signal from the time domain to the frequency domain, where many properties of the signal are easier to understand [ELECTRICAL VIEW]. The main task of speech analysis is to extract features and parameters from the speech signal. Speech is the primary means of human communication. It has multilayered temporal-spectral variations that convey words, expression, intention, state of health, gender, age and emotion, so speech signals convey more information than the spoken words alone. Emotions and health are conveyed by changes in vocal fold vibration, vocal tract resonance, duration and vocal tract spectrum. The speech signal is usually analysed using spectral features instead of directly using its waveform. There are two reasons for this. One is that the speech signal is considered to be reproducible by summing sinusoidal waves whose amplitude and phase change slowly. The other is that the features critical for the perception of speech by the human ear are mainly contained in the spectral information (with the phase information not usually playing a key role). Most of the information in a speech signal is conveyed within a bandwidth of 4 kHz; speech energy above 4 kHz mainly conveys audio quality and sensation. Infant cries are analysed using several of these signal processing techniques. A mother plays an important role in taking care of a child, and she usually knows why her baby is crying.
Today, however, mothers often work as much as fathers and have less time to care for their families, while improved technology has automated many tasks. Analysis of infant cries by speech processing is an automatic method that helps a mother take care of her child within a busy schedule.
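
The two kinds of analysis mentioned above, time domain and frequency domain, can be compared on a single signal. The following is a minimal sketch in Python (standard library only) under stated assumptions: the 440 Hz test tone, the 8000 Hz sample rate and the function names are illustrative and do not come from the study. It computes two common time-domain features (RMS level and zero-crossing rate) and one frequency-domain feature (the strongest DFT component).

```python
import cmath
import math

def rms(signal):
    """Time domain: root-mean-square level of the signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))

def zero_crossing_rate(signal):
    """Time domain: fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(signal) - 1)

def dominant_frequency(signal, sample_rate):
    """Frequency domain: frequency of the largest-magnitude DFT bin (DC excluded)."""
    n = len(signal)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):
        mag = abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        if mag > best_mag:
            best_bin, best_mag = k, mag
    return best_bin * sample_rate / n

# Assumed test signal: a 440 Hz sine sampled at 8000 Hz for 400 samples.
SAMPLE_RATE = 8000
signal = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE + 0.1) for t in range(400)]

level = rms(signal)
zcr = zero_crossing_rate(signal)
peak = dominant_frequency(signal, SAMPLE_RATE)
# A pure tone crosses zero twice per cycle, so zcr * SAMPLE_RATE / 2 roughly estimates
# its frequency; the DFT peak gives the same answer directly in the frequency domain.
print(level, zcr * SAMPLE_RATE / 2, peak)
```

For a single tone both views agree, but for a harmonic signal such as a cry the frequency-domain view separates the individual components, which is one reason spectral features are preferred over the raw waveform.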

Infant crying falls within the most sensitive range of the human auditory system. Because the cry is generated under the control of the central nervous system, it reflects the neuropsychological integrity of the infant and is very useful for detecting risks or health problems [iceice]. An infant cries differently depending on whether he is hungry, in pain or sad [varallayay acoustic review]. The cry contains important information about the child's state of health. Investigations of infant crying fall into three main categories: the reason for crying, the development of crying, and the connection between diseases and crying [varallayay forum acquisition]. The infant cry is a biological alarm that alerts caregivers to the needs and wants of the infant. It can be interpreted as a message of urgency or distress. The sound is nature's way of ensuring that parents attend to the baby as quickly as possible. Because the sound is perceived as an alarm, it is very frustrating not to be able to figure out what is wrong and soothe the baby. Parents, especially first-time parents, begin to question their ability to cope if the child frequently cannot be comforted. A great deal of information can be obtained from the cry, including the reason why the infant is crying. The infant cry is produced by the human voice-production system and is composed of three components: voiced sound, resonances and radiation. The sound is harmonic, containing the fundamental frequency and its harmonics [G.Varallyay, 2007]. A normal infant cry is characterized by an average fundamental frequency (f0) of 450 Hz, ranging between 400 Hz and 600 Hz, by symmetrically overlapped harmonics, and by cry durations between 1 and 1.5 s on average [Maria et al.,2012]. All crying signals seem similar, but a mother can often understand the reason behind her infant's crying, and the reason can also be identified by automatic methods.
Infants cry for the same reasons that adults talk: to convey their problems to their parents [Martin Herman, Audrey Le, 2007]. Different cry origins, such as pain, hunger and sadness, exhibit different crying patterns. From around the age of 2 months it is possible to distinguish between different emotions in an infant's cry based on its sound. For people who are not experienced in child care it is very difficult to tell why an infant is crying, so automatic detection methods have been developed [satohetal07].
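
Since the cry is described above as a harmonic sound with a fundamental frequency of around 450 Hz, one simple way to estimate f0 is autocorrelation: a periodic signal correlates most strongly with itself when shifted by exactly one period. The sketch below, in Python with the standard library only, is an illustration under assumptions and not necessarily the method used in the cited studies. The synthetic signal (a 450 Hz fundamental plus two weaker harmonics), the 9000 Hz sample rate and the 200-1000 Hz search range are all chosen for the example.

```python
import math

def estimate_f0(signal, sample_rate, f_min=200.0, f_max=1000.0):
    """Estimate the fundamental frequency from the strongest autocorrelation lag."""
    min_lag = int(sample_rate / f_max)  # shortest period considered
    max_lag = int(sample_rate / f_min)  # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Assumed synthetic "cry": fundamental at 450 Hz with harmonics at 900 and 1350 Hz.
SAMPLE_RATE = 9000
cry = [math.sin(2 * math.pi * 450 * t / SAMPLE_RATE)
       + 0.5 * math.sin(2 * math.pi * 900 * t / SAMPLE_RATE)
       + 0.25 * math.sin(2 * math.pi * 1350 * t / SAMPLE_RATE)
       for t in range(900)]

f0 = estimate_f0(cry, SAMPLE_RATE)
# Range quoted in the text for a normal cry: 400 Hz to 600 Hz.
in_normal_range = 400.0 <= f0 <= 600.0
print(f0, in_normal_range)
```

On real recordings the estimate would be computed frame by frame, and the resulting sequence of f0 values over time is the cry melody discussed earlier.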

An infant cry results from the interaction between the control exerted by different areas of the brain, respiratory control and vocal fold vibration. At an early stage, the cry is believed to be the result of respiratory action. In the first stage of cry production, an external or internal stimulus (hunger, pain, etc.) initiates the cry in the infant's brain. In the next stage, the brain's command is translated into a series of commands sent through the nervous system to the speech and respiratory organs, which create the acoustic signal at the physiological level. The process continues with the expulsion of air from the lungs through the vocal tract and the nasal tract. The cry is then radiated as sound, which can be recorded for analysis [iceice022].

Diagram [varallyay_biomed]

An infant cry consists of cry sounds and inspiration. The cry sound is the part of the expiratory phase of respiration that carries sound. The sound is phonation produced by the larynx, which contains the vocal cords (or folds) and the glottis, the opening between the folds. The larynx performs three functions: swallowing, breathing and voice production. During breathing the glottis is fully open, and during voice production it is closed. When air is forced through the closed vocal cords, the increased air velocity through the constriction causes a drop in air pressure, and the vocal cords open and close rapidly at a rate of 250 to 450 Hz in normal healthy newborns. An infant has three modes of cry: the basic cry, or phonation, at the fundamental frequency (f0); the high-pitched cry, or hyperphonation, in the frequency range of 1000-2000 Hz; and the noisy or turbulent cry [assessment _cry]. The crying model consists of three levels of central processing of the muscles that contribute to the source and filters of crying: the upper, middle and lower processors. The upper processor determines the state of mind. The middle processor involves the infant's vegetative states, such as swallowing, coughing, digestion and crying. The lower processor controls the muscle groups, including the subglottal, supraglottal, glottal and facial muscles, which are coordinated in the act of crying [varallyay2007_periodic polytechnic]. From this discussion it can be concluded that the infant cry contains important information about the health of the child, and that suitable analysis of the cry can recover this information in order to detect the child's state of health and the emotion or reason behind the crying.

The main aim of infant cry analysis is to improve child-caregiver communication. It lets the infant communicate non-verbally more clearly, and in turn caregivers are better able to meet the infant's needs, leading to less frustration for both. The analysis has both advantages and disadvantages. The purpose of this research is to detect why an infant is crying, in other words to detect the emotion: whether he is hungry, in pain or sad. Once the reason is detected, it is much easier for the mother or caregiver to meet the infant's needs quickly and calm him. The state of health of the child can also be assessed by analysing the cry, so that treatment can be given as quickly as possible and the child does not suffer from a disease for a long time. In some cases an infant cries excessively over a long period, i.e. 3 months or more; this kind of excessive crying is known as colic and can also be detected with this approach. Crying is the infant's natural way of interacting, and with automatic analysis it is not necessary to sit beside the child 24/7 to observe him. Automatic methods may also help infants with cognitive disabilities. There are disadvantages as well. Even the best automatic systems sometimes make errors. If there is noise or some other sound in the room (e.g. the television or a kettle boiling), the number of errors will increase and the exact reason for crying may not be detected. Infant cry analysis works best if the microphone is close to the sound source (e.g. in a nearby phone, or a microphone worn on the body). More distant microphones (e.g. on a table or wall) will tend to increase the number of errors.
Recording infant cries is difficult, because sometimes infants have to be made to cry for the purpose of analysis, and these procedures make them irritable. The automatic methods can also give wrong results when background noise dominates the sound of crying, so background noise has to be minimized to keep errors low.
The present work concerns the analysis of infant cries, which plays an important role in determining a child's state of health and emotions. Today men and women play an equal role in working life. Women also work to support themselves, so they cannot always devote as much time to their infants. They cannot sit beside the child 24/7, and as a result they sometimes do not know the reason behind their own baby's crying. To address this problem, automatic methods have been introduced that help parents to learn the reason for an infant's crying and the associated emotion, so that they can provide what the infant needs. By analysing the cry we can assess the child's state of health or identify a disease, so that treatment can be given quickly. The infant cry contains relevant information that helps to differentiate between healthy and pathological crying, and the information extracted from recordings of the cry waveform can be used to support a diagnosis [Infant cry]. To make the whole process easier and faster, we developed a method that automatically identifies the reason for crying and the emotion of the infant by comparing the infant's crying sound with crying sounds in a database labelled with the reason for crying. Such automatic methods for infant cry analysis are increasingly important and have wide scope, because they make it easy to obtain the reason for crying and use this information to meet the infant's needs.
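
The comparison against a labelled database described above can be sketched as a nearest-match search using correlation. The example below is a hypothetical illustration in Python (standard library only), not the implementation used in this work: the function names, the choice of Pearson correlation as the similarity measure, and the short feature vectors are all assumptions, and the numbers are made-up placeholders rather than measured data.

```python
import math

def normalized_correlation(a, b):
    """Pearson correlation between two equal-length feature vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    dev_a = [x - mean_a for x in a]
    dev_b = [y - mean_b for y in b]
    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    return numerator / denominator if denominator else 0.0

def classify_cry(features, database):
    """Return the label whose best-matching reference correlates most with the input."""
    def best_match(label):
        return max(normalized_correlation(features, ref) for ref in database[label])
    return max(database, key=best_match)

# Placeholder reference contours (illustrative values only, not real cry data).
database = {
    "hunger": [[1.0, 2.0, 3.0, 4.0, 3.0, 2.0]],
    "pain": [[4.0, 4.0, 1.0, 1.0, 4.0, 4.0]],
    "sadness": [[3.0, 2.0, 1.0, 1.0, 2.0, 3.0]],
}

# A new, unlabelled contour that resembles the "hunger" reference.
query = [1.1, 2.2, 2.9, 4.1, 2.8, 2.1]
label = classify_cry(query, database)
print(label)
```

In practice each vector would hold features extracted from a recording, such as an f0 contour or spectral measurements, and the database would contain many references per cry type.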


MATLAB is a high-level language developed by MathWorks and used as an interactive environment for numeric computation, visualization and programming. With MATLAB, data can be analysed, algorithms developed, and new models and applications created. It can help to reach a solution faster than spreadsheets or traditional languages such as C/C++, and it makes it easier to explore new approaches. MATLAB is used for a range of applications that includes signal processing, image and video processing, and computational biology [Mathworks]. Its users come from various backgrounds in engineering, science and technology. MATLAB was developed in the late 1970s by Cleve Moler, then chairman of the computer science department at the University of New Mexico. Applications are built around the MATLAB language, and MATLAB is most commonly used by typing code in the Command Window or by executing text files containing MATLAB code. MATLAB can also be interfaced with other languages: it can call functions or subroutines written in the C programming language, and libraries written in Java, Perl or .NET can be called directly from MATLAB [Wikipedia]. Audio and speech processing has become increasingly important in people's everyday lives, and digital processing is now the method of choice for handling audio and speech. The present research covers representing, playing and plotting sound signals in MATLAB. Sound signals can be represented as vectors, and any mathematical operation that can be applied to a vector can be applied to a sound signal. An audio vector can be loaded and saved in the same way as any other MATLAB variable, processed, added, plotted, and so on. New sounds can also be created by writing new code, scripts and functions. Signal processing provides many ways of modifying signals, whether to enhance the quality of a signal, to transform it for communications, or just to create sound effects or computer music. MATLAB offers a convenient way of writing audio and speech processing code and gives accurate results efficiently. It is generally regarded as a fourth-generation programming language.
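
The idea that a sound is just a vector of samples is not specific to MATLAB. As a language-neutral illustration, the sketch below uses Python's standard library rather than the MATLAB code of this work; the tone frequencies, the 8000 Hz sample rate, the helper names and the temporary file name are all assumptions made for the example. It creates two tones, mixes them with ordinary element-wise arithmetic, saves the result as a WAV file and loads it back.

```python
import math
import os
import struct
import tempfile
import wave

SAMPLE_RATE = 8000  # assumed sample rate in Hz

def sine(freq_hz, seconds, amplitude=1.0):
    """Return a sine tone as a plain list of float samples."""
    n = int(SAMPLE_RATE * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE) for t in range(n)]

def save_wav(path, samples):
    """Write floats in [-1, 1] as 16-bit mono PCM."""
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack("<%dh" % len(pcm), *pcm))

def load_wav(path):
    """Read a 16-bit mono WAV file back into floats and return (samples, rate)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return [v / 32767 for v in values], rate

# Vector operations on sound: add two tones sample by sample and halve the level.
tone_a = sine(450, 0.5)
tone_b = sine(900, 0.5, amplitude=0.5)
mix = [0.5 * (x + y) for x, y in zip(tone_a, tone_b)]

path = os.path.join(tempfile.gettempdir(), "cry_demo.wav")
save_wav(path, mix)
restored, rate = load_wav(path)
print(len(restored), rate)
```

The samples read back differ from the originals only by the small rounding error of 16-bit quantization.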

Source: Essay UK - http://www.essay.uk.com/free-essays/science/speech-processing.php
