Every parent has been in this position: the baby cries and cries, and we don’t know why. Is it hungry? Is it tired? Ana Laguna, a data scientist at BBVA Data & Analytics, asked herself the same questions when her first child was born. She looked for an app that would help her understand why her baby was crying, but had no luck. Then it occurred to her that she could translate her baby’s crying, and understand what it needed at any time, by using the artificial intelligence techniques she uses every day at work.
Thus her project was conceived, and she began making audio recordings of her newborn crying. By the time her son was four months old, she had compiled a total of 65 recordings. "Among the many questions expectant parents ask themselves is ‘Will I understand him or her?’ After looking for apps that would help me understand my newborn and not finding anything satisfactory, I set myself the challenge of trying to translate babies’ crying," explains Ana Laguna.
"I wasn't able to start recording him until I was more comfortable and I started to identify why he was crying," she adds. Then she began tagging the recordings, assigning each one a reason for the crying, in order to build a library of audio files in which an algorithm could identify patterns that might be useful to future parents. "The algorithm is an artificial neural network that learns what you teach it," she explains.
"I had worked in the area of automatic translation in the past, and the crying of a baby is just another oral mode of human communication. Also, if Jane Goodall can understand chimpanzees, why not try translating a newborn using an AI algorithm?" says Ana.
With her 65 audio files, she was able to determine that her baby cried for four basic reasons: because it was hungry, was tired, was in some kind of pain, or wanted affection. The differences showed up in the graphical patterns she produced for the project, called spectrograms, which plot how the energy at each frequency of a sound changes over time.
The graphical patterns that Ana Laguna produced for the project.
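For readers curious about how a spectrogram is obtained from a recording, the following is a minimal sketch and not the code used in Ana's project. It assumes NumPy is available, uses a synthetic 500 Hz tone in place of a real recording, and the function name `spectrogram` and its parameters (`frame_size`, `hop`) are illustrative choices.

```python
import numpy as np

def spectrogram(signal, sample_rate, frame_size=256, hop=128):
    """Return (freqs, times, power) for a 1-D audio signal using a short-time FFT."""
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_size)  # taper each frame to reduce spectral leakage
    n_frames = 1 + (len(signal) - frame_size) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_size] * window
        for i in range(n_frames)
    ])
    # Power at each frequency bin for every frame.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_size, d=1.0 / sample_rate)
    # Time (in seconds) at the centre of each frame.
    times = (np.arange(n_frames) * hop + frame_size / 2) / sample_rate
    return freqs, times, power.T  # power has shape (n_freqs, n_frames)

# Synthetic stand-in for a recording: one second of a 500 Hz tone (an assumption).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 500 * t)

freqs, times, power = spectrogram(tone, sample_rate)
# Frequency with the highest average power across all frames.
peak_hz = freqs[power.mean(axis=1).argmax()]
```

Each column of `power` describes one short slice of time and each row one frequency band. Plotted as an image, this grid gives the kind of pattern that a neural network can be trained to tell apart.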
After this preliminary study, Ana concluded that the audio signals were valid — that there were differences that allowed one to distinguish between the types of audio, and that her model's accuracy was acceptable. However, the size of the dataset was inadequate. "For an algorithm of this type, the number of available audio files was limited," she says. Furthermore, having data from only a single baby isn’t sufficient for the purposes of making broader conclusions, since the dataset would be skewed. In short, the project is in its infancy. "It’s still in diapers, almost literally," says Ana.
Consequently, her primary focus now is to obtain more data and, if possible, data that is already tagged. Why is more data the priority? "Data is a rough stone waiting to be polished. Obviously, the end result will differ depending on whether the raw material is a diamond or a quartz," says Ana. She emphasizes that data tagging is very important. "This tagging," she explains, "should be done by the child's parents," so that a motive can be attributed to each kind of crying and the set of known patterns can grow. Ana plans to enlarge the dataset with recordings of additional crying babies.
"I am also interested in variables such as nationality or gender, because how a baby cries might vary depending on, for example, where its parents come from and on cultural aspects present in its mother's accent or intonation. So an algorithm that has been trained using recordings of babies from other countries might not be as exact as one trained only on Spanish children," she points out. These features belong to prosody, the branch of linguistics that analyzes and formally represents elements of oral expression such as accent, stress, and intonation.
Having more audio recordings in the database will facilitate re-evaluating the model with new data. Ana also plans to continue experimenting with other next-generation techniques like machine learning to improve the algorithm. And from there, she’ll be able to create an app or algorithm that everyone can use.
Ana asserts that "this kind of app isn't only useful for parents; it's for anyone who takes care of infants," whether family members or professional caregivers. In the future, it might even be possible to identify variations in crying that indicate, for example, the type of pain the baby is experiencing, or that support the early diagnosis of specific diseases.
Data for the common good
Ana works under the auspices of a project called 'Data 4 Social Good', which aims to develop social projects where data science can be applied. A collaboration form on the project website lets parents submit audio recordings of their babies. "With their submission, they have to include a tag of the reason why the baby is crying, the parents’ nationalities, and the sex of the baby." Currently, Ana receives an average of 10 recordings a day.
Because the study’s aim is to identify the basic needs of newborns, the audio files included in the analysis should capture babies’ crying only up to six months of age. "I stopped by the fourth month, because by month five I noticed that there were differences in the audio signal," Ana says.
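The submission requirements described above can be pictured as a simple record plus a filter. This is only an illustrative sketch: the class `CrySubmission`, its field names, and the English tag labels are hypothetical. It also assumes that the baby's age is known for each recording (the article does not say the form collects it) and that the six-month limit is inclusive.

```python
from dataclasses import dataclass
from typing import Tuple

# The four reasons identified in the preliminary study (labels are illustrative).
ALLOWED_TAGS = {"hungry", "tired", "pain", "affection"}
MAX_AGE_MONTHS = 6  # assumption: the six-month limit is inclusive

@dataclass(frozen=True)
class CrySubmission:
    """Hypothetical record for one recording submitted by a parent."""
    audio_path: str
    tag: str                              # reason for crying, chosen by the parents
    parent_nationalities: Tuple[str, ...]
    baby_sex: str
    baby_age_months: float                # assumed to be known for each recording

def is_usable(sub: CrySubmission) -> bool:
    """Keep only fully tagged recordings of babies within the age limit."""
    return (
        sub.tag in ALLOWED_TAGS
        and len(sub.parent_nationalities) > 0
        and bool(sub.baby_sex)
        and 0 <= sub.baby_age_months <= MAX_AGE_MONTHS
    )

# Made-up example entries, not real data.
submissions = [
    CrySubmission("cry_001.wav", "hungry", ("Spanish", "Spanish"), "male", 2),
    CrySubmission("cry_002.wav", "tired", ("Spanish", "French"), "female", 9),  # too old
    CrySubmission("cry_003.wav", "unknown", ("Spanish",), "female", 3),         # untagged
]
usable = [s for s in submissions if is_usable(s)]
```

In this made-up list only the first entry passes the filter. The second is excluded by the age limit and the third because its tag is not one of the four known reasons.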
"We are definitely going to exceed 500 recordings,” Ana declares. "With 1,000 we would have a sufficient database to develop a reliable model," she concludes.