By pagewriter
February 5, 2023
Earth Species Project (ESP) is a nonprofit dedicated to decoding non-human communication, with the ultimate goal of communicating with other species. CEO Katie Zacarian believes that artificial intelligence* can help make this goal a reality.
The first step toward this goal is to recognize patterns in animal communication and to use machine learning systems to analyze that data. To determine the potential meaning behind those patterns, scientists then match the communication with corresponding behavior. ESP is applying this approach to birds, dolphins, primates, elephants, and honeybees.
The organization believes that marine mammals may be the key to the first breakthrough, since much of their communication is acoustic. If successful, this could lead to two-way conversations with animals…just like Doctor Dolittle! It could also open up new ways of thinking about our relationship with other species on Earth. For example, should we ask whales to dive out of the way of boats when doing so changes their feeding habits? Or should boats change course?
These questions may have answers sooner than we think thanks to Earth Species Project’s pioneering research. By understanding what animals say, and being able to communicate back, we are entering a new era of relationship building between humans and animals, one with huge implications for us all. It’s an exciting time for ESP as it continues its mission and brings us closer than ever to understanding the languages of other species that inhabit our planet.
* During the early years of the Cold War, an array of underwater microphones monitoring for sounds of Russian submarines captured something otherworldly in the depths of the North Atlantic.
The haunting sounds came not from enemy craft, nor aliens, but humpback whales, a species that, at the time, humans had hunted almost to the brink of extinction. Years later, when environmentalist Roger Payne obtained the recordings from U.S. Navy storage and listened to them, he was deeply moved. The whale songs seemed to reveal majestic creatures that could communicate with one another in complex ways. If only the world could hear these sounds, Payne reasoned, the humpback whale might just be saved from extinction.
When Payne released the recordings in 1970 as the album Songs of the Humpback Whale, he was proved right. The album went multi-platinum. It was played at the U.N. General Assembly, and it inspired Congress to pass the 1973 Endangered Species Act. By 1986, commercial whaling was banned under international law. Global humpback whale populations have risen from a low of around 5,000 individuals in the 1960s to 135,000 today.
For Aza Raskin, the story is a sign of just how much can change when humanity experiences a moment of connection with the natural world. “It’s this powerful moment that can wake us up and power a movement,” Raskin tells TIME.
Raskin’s focus on animals comes from a very human place. A former Silicon Valley wunderkind, he invented infinite scroll in 2006, the feature that became a mainstay of so many social media apps. He founded a streaming startup called Songza that was eventually acquired by Google. But Raskin gradually soured on the industry after realizing that technology, which had such capacity to influence human behavior for the better, was mostly being leveraged to keep people addicted to their devices and spending money on unnecessary products. In 2018, he co-founded the Center for Humane Technology with his friend and former Google engineer Tristan Harris, as part of an effort to ensure tech companies were shaped to benefit humanity, rather than the other way around. He is perhaps best known for coining, alongside scholar Renée DiResta, the phrase “freedom of speech is not freedom of reach.” The phrase became a helpful way for responsible technologists, lawmakers and political commentators to distinguish between the constitutional freedom of users to say whatever they like and the privilege of having it amplified by social media megaphones.
Raskin is talking about whale song because he is also the co-founder and president of the Earth Species Project, an artificial intelligence (AI) nonprofit that is attempting to decode the communication of animals, from humpback whales to great apes to crows. The jury is out on whether it will ever truly be possible to accurately “translate” animal communication into anything resembling human language. Meaning is socially constructed, and animal societies are very different from ours.
Despite the seemingly insurmountable challenges the group is facing, the project has made at least some progress, including an experimental algorithm that can purportedly detect which individual in a noisy group of animals is "speaking."
A second algorithm reportedly can generate mimicked animal calls to "talk" directly to them.
"It is having the AI speak the language," Raskin told The Guardian, "even though we don’t know what it means yet."
AI-powered analysis of animal communication draws on datasets of both bioacoustics, the recording of individual organisms, and ecoacoustics, the recording of entire ecosystems, according to experts. In October 2022, ESP published the first publicly available benchmark for measuring the performance of machine learning algorithms in bioacoustics research. The system—known as BEANS (the BEnchmark of ANimal Sounds)—uses 10 datasets of various animal communications and establishes a baseline for machine learning classification and detection performance.
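To make the classification side of such a benchmark concrete, here is a minimal sketch of a bioacoustic baseline: turn each audio clip into log-spectrogram features, train a simple classifier, and report accuracy on held-out clips. The clips below are synthetic tones rather than the actual BEANS datasets, and the feature pipeline and model are illustrative choices, not ESP's published code.

```python
# Minimal sketch of a bioacoustics classification baseline, in the spirit of a
# benchmark like BEANS. The "calls", labels and model are synthetic placeholders.
import numpy as np
from scipy.signal import spectrogram
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
SR = 16_000          # sample rate (Hz)
N_PER_CLASS = 40     # toy clips per pretend "species"

def synth_clip(freq_hz, dur_s=1.0):
    """Generate a noisy tone as a stand-in for an animal call."""
    t = np.arange(int(SR * dur_s)) / SR
    return np.sin(2 * np.pi * freq_hz * t) + 0.5 * rng.standard_normal(t.size)

# Two pretend call types with different dominant frequencies.
clips = [synth_clip(f) for f in [500] * N_PER_CLASS + [2000] * N_PER_CLASS]
labels = np.array([0] * N_PER_CLASS + [1] * N_PER_CLASS)

def features(clip):
    """Log-power spectrogram, averaged over time, as a simple feature vector."""
    _, _, sxx = spectrogram(clip, fs=SR, nperseg=512)
    return np.log(sxx + 1e-9).mean(axis=1)

X = np.stack([features(c) for c in clips])
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.25, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("baseline accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```

A real benchmark run would swap the synthetic clips for the published recordings and report both classification and detection metrics per dataset.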
The datasets being studied in various efforts to decode animal communication include recordings from a range of species, including birds, amphibians, primates, elephants, and insects such as honeybees. Communication from domesticated cats and dogs is being studied, too. Yet experts note that communication among cetaceans—whales, dolphins and other marine mammals—is especially promising.
“Cetaceans are particularly interesting because of their long history—34 million years as a socially learning, cultural species,” Zacarian explained. “And because—as light does not propagate well underwater—more of their communication is forced through the acoustic channel.”
Researchers maintain that bioacoustics and AI-powered analysis of animal communication can significantly advance ecological research and conservation efforts.
For instance, in 2021, researchers used audio recordings to identify a new population of blue whales in the Indian Ocean. “Each blue whale population has a distinct vocal signature, which can be used to distinguish and monitor different ‘acoustic populations’ or ‘acoustic groups’,” the research team explained in a Nature article detailing the discovery.
Moreover, listening to ecosystems and decoding animal communication can help ecologists gauge the health of the natural environment, experts say. This includes, for instance, developing a better understanding of how disruptive human activity like noise pollution or logging affects animal populations. In Costa Rica, for example, audio recordings were recently used to evaluate the development and health of reforested areas of the rainforest.
“By monitoring the sounds that are coming from nature, we can look for changes in social structure, transmission of cultural information or physiological stress,” Zacarian stated.
AI analysis of animal communication has also been used to help establish marine animal protection zones. Off the West Coast of the United States, for example, researchers have used AI to analyse marine communication recordings as well as shipping route data to create “mobile marine protected areas” and predict potential collisions between animals and ships.
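As a rough illustration of how acoustic detections and shipping data might be combined, the sketch below flags the segments of a shipping lane that pass close to recent whale detections as candidates for a temporary slow-down or protection zone. The coordinates, the 10 km threshold and the local grid are all invented for the example; operational systems fuse far more data than this.

```python
# Toy "mobile protected area" logic: flag lane segments near recent detections.
import numpy as np

rng = np.random.default_rng(5)

# Recent whale detections from hydrophones (x, y in km on a local grid).
whales = rng.uniform(0, 100, size=(15, 2))

# A straight shipping lane sampled every kilometre at y = 40 km.
lane = np.c_[np.linspace(0, 100, 101), np.full(101, 40.0)]

# Distance from every lane point to its nearest detection.
dists = np.linalg.norm(lane[:, None, :] - whales[None, :, :], axis=2).min(axis=1)

flagged = lane[dists < 10.0]
print(f"{len(flagged)} of {len(lane)} lane kilometres lie within 10 km of a recent detection")
```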
“Understanding what animals say is the first step to giving other species on the planet ‘a voice’ in conversations on our environment,” said Kay Firth-Butterfield, the World Economic Forum’s head of AI and machine learning.
“For example, should whales be asked to dive out of the way of boats when this fundamentally changes their feeding or should boats change course?”
There are ethical concerns that researchers are confronting, too. This includes, most notably, the possibility of doing harm by establishing two-way communication channels between humans and animals—or animals and machines.
“We’re not quite sure what the effect will be on the animals and whether they even want to engage in some conversations,” said Karen Bakker, a University of British Columbia professor who studies digital bioacoustics. “Maybe if they could talk to us, they would tell us to go away.”
Researchers are taking steps to address and mitigate the concerns about harm and animal exploitation. ESP, for instance, is working with its partners to develop a set of principles to guide its research and ensure it always supports conservation and animal wellbeing.
“We are not yet sure what all the real-world applications of this technology will be,” Zacarian stated. “But we think that unlocking an understanding of the communications of another species will be very significant as we work to change the way human beings see our role, and as we figure out how to co-exist on the planet.”
See: https://pagetraveler.com/humans-may-be-shockingly-close-to-decoding-the-language-of-animals/
See: https://bigthink.com/life/artificial-intelligence-animal-languages/
Understanding animal vocalisations has long been the subject of human fascination and study. Various primates give alarm calls that differ according to predator; dolphins address one another with signature whistles; and some songbirds can take elements of their calls and rearrange them to communicate different messages. But most experts stop short of calling it a language, as no animal communication system meets all the criteria that define human language.
Until recently, decoding has mostly relied on painstaking observation. But interest has burgeoned in applying machine learning to deal with the huge amounts of data that can now be collected by modern animal-borne sensors. “People are starting to use it,” says Elodie Briefer, an associate professor at the University of Copenhagen who studies vocal communication in mammals and birds. “But we don’t really understand yet how much we can do.”
Briefer co-developed an algorithm that analyses pig grunts to tell whether the animal is experiencing a positive or negative emotion. Another, called DeepSqueak, judges whether rodents are in a stressed state based on their ultrasonic calls. A further initiative – Project CETI (which stands for the Cetacean Translation Initiative) – plans to use machine learning to translate the communication of sperm whales.
Yet ESP says its approach is different, because it is not focused on decoding the communication of one species, but all of them. While Raskin acknowledges there will be a higher likelihood of rich, symbolic communication among social animals – for example primates, whales and dolphins – the goal is to develop tools that could be applied to the entire animal kingdom. “We’re species agnostic,” says Raskin. “The tools we develop… can work across all of biology, from worms to whales.”
The “motivating intuition” for ESP, says Raskin, is work that has shown that machine learning can be used to translate between different, sometimes distant human languages – without the need for any prior knowledge.
This process starts with the development of an algorithm to represent words in a physical space. In this many-dimensional geometric representation, the distance and direction between points (words) describes how they meaningfully relate to each other (their semantic relationship). For example, “king” has a relationship to “man” with the same distance and direction that “woman” has to “queen”. (The mapping is not done by knowing what the words mean but by looking, for example, at how often they occur near each other.)
It was later noticed that these “shapes” are similar for different languages. And then, in 2017, two groups of researchers working independently found a technique that made it possible to achieve translation by aligning the shapes. To get from English to Urdu, align their shapes and find the point in Urdu closest to the word’s point in English. “You can translate most words decently well,” says Raskin.
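A toy version of that alignment idea can be sketched in a few lines: treat each “language” as a cloud of points with the same internal geometry, learn a rotation from a small seed dictionary, and then translate unseen words by nearest-neighbour lookup in the rotated space. The vectors below are random stand-ins rather than real word embeddings, and the seed-dictionary-plus-Procrustes recipe is only one simple way to do the alignment; the 2017 work Raskin refers to also showed it can be done without seed pairs.

```python
# Toy embedding alignment: "language B" is a hidden rotation of "language A",
# and an orthogonal Procrustes fit on a few anchor pairs recovers the mapping.
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(1)
dim, vocab = 16, 50

# "Language A" embeddings, and "language B" as a noisy rotation of the same shape.
emb_a = rng.standard_normal((vocab, dim))
true_rot, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
emb_b = emb_a @ true_rot + 0.01 * rng.standard_normal((vocab, dim))

# Learn the alignment from a small seed dictionary (first 20 word pairs)...
R, _ = orthogonal_procrustes(emb_a[:20], emb_b[:20])

# ...then "translate" an unseen word by rotating it and taking the nearest neighbour.
query = emb_a[35] @ R
nearest = np.argmin(np.linalg.norm(emb_b - query, axis=1))
print("word 35 in A maps to word", nearest, "in B")   # expect 35
```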
ESP’s aspiration is to create these kinds of representations of animal communication – working on both individual species and many species at once – and then explore questions such as whether there is overlap with the universal human shape. We don’t know how animals experience the world, says Raskin, but there are emotions, for example grief and joy, it seems some share with us and may well communicate about with others in their species. “I don’t know which will be the more incredible – the parts where the shapes overlap and we can directly communicate or translate, or the parts where we can’t.”

Dolphins use clicks, whistles and other sounds to communicate. But what are they saying? Photograph: ALesik/Getty Images/iStockphoto
He adds that animals don’t only communicate vocally. Bees, for example, let others know of a flower’s location via a “waggle dance”. There will be a need to translate across different modes of communication too.
The goal is “like going to the moon”, acknowledges Raskin, but the idea also isn’t to get there all at once. Rather, ESP’s roadmap involves solving a series of smaller problems necessary for the bigger picture to be realised. This should see the development of general tools that can help researchers trying to apply AI to unlock the secrets of species under study.
For example, ESP recently published a paper (and shared its code) on the so-called “cocktail party problem” in animal communication, in which it is difficult to discern which individual in a group of the same animals is vocalising in a noisy social environment.
“To our knowledge, no one has done this end-to-end detangling [of animal sound] before,” says Raskin. The AI-based model developed by ESP, which was tried on dolphin signature whistles, macaque coo calls and bat vocalisations, worked best when the calls came from individuals that the model had been trained on; but with larger datasets it was able to disentangle mixtures of calls from animals not in the training cohort.
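ESP’s model is a learned, end-to-end neural separator; the snippet below is not that model, only a classical illustration of the underlying problem. Two synthetic “calls” are mixed onto two channels and independent component analysis pulls them back apart; real single-channel recordings of overlapping animals are far harder, which is part of why ESP turned to learned separation.

```python
# Classical cocktail-party illustration: blind source separation with FastICA.
import numpy as np
from sklearn.decomposition import FastICA

sr = 8_000
t = np.arange(2 * sr) / sr
call_1 = np.sin(2 * np.pi * 440 * t)          # steady tone, pretend "whistle"
call_2 = np.sign(np.sin(2 * np.pi * 3 * t))   # slow square wave, pretend "pulses"
sources = np.c_[call_1, call_2]

mixing = np.array([[1.0, 0.6],
                   [0.4, 1.0]])
mixed = sources @ mixing.T                    # two "microphones", each hearing both animals

ica = FastICA(n_components=2, random_state=0)
recovered = ica.fit_transform(mixed)          # estimated sources, up to scale and order

# Check each recovered track against the originals via correlation.
for i in range(2):
    corrs = [abs(np.corrcoef(recovered[:, i], sources[:, j])[0, 1]) for j in range(2)]
    print(f"recovered source {i}: best match correlation = {max(corrs):.2f}")
```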
Another project involves using AI to generate novel animal calls, with humpback whales as a test species. The novel calls – made by splitting vocalisations into micro-phonemes (distinct units of sound lasting a hundredth of a second) and using a language model to “speak” something whale-like – can then be played back to the animals to see how they respond. If the AI can identify what makes a random change versus a semantically meaningful one, it brings us closer to meaningful communication, explains Raskin. “It is having the AI speak the language, even though we don’t know what it means yet.”
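The sketch below shows the shape of that pipeline in miniature, on synthetic audio: quantise short spectrogram frames into a small inventory of discrete units (standing in for the “micro-phonemes”), fit a simple bigram model over the unit sequence, and sample a new sequence that follows the learned statistics. ESP’s actual system relies on far richer audio tokenisers and language models; every parameter here is an illustrative guess.

```python
# Highly simplified "micro-phoneme" pipeline: quantise -> model -> sample.
import numpy as np
from scipy.signal import spectrogram
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
sr = 16_000
t = np.arange(5 * sr) / sr
# A frequency-swept tone standing in for a whale vocalisation.
audio = np.sin(2 * np.pi * (200 + 150 * np.sin(2 * np.pi * 0.5 * t)) * t)

# 1. Frame the audio and quantise frames into a small inventory of units.
_, _, sxx = spectrogram(audio, fs=sr, nperseg=256)
frames = np.log(sxx + 1e-9).T                      # (time, freq) feature frames
units = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(frames)

# 2. Fit a bigram "language model" over the unit sequence.
counts = np.ones((8, 8))                           # add-one smoothing
for a, b in zip(units[:-1], units[1:]):
    counts[a, b] += 1
probs = counts / counts.sum(axis=1, keepdims=True)

# 3. Sample a novel unit sequence that follows the learned statistics.
seq = [int(units[0])]
for _ in range(30):
    seq.append(int(rng.choice(8, p=probs[seq[-1]])))
print("generated unit sequence:", seq)
```

In the real project, sequences like these would be resynthesised into audio and played back to the animals to gauge their response.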

Hawaiian crows are well known for their use of tools but are also believed to have a particularly complex set of vocalisations. Photograph: Minden Pictures/Alamy
A further project aims to develop an algorithm that ascertains how many call types a species has at its command by applying self-supervised machine learning, which does not require any labelling of data by human experts to learn patterns. In an early test case, it will mine audio recordings made by a team led by Christian Rutz, a professor of biology at the University of St Andrews, to produce an inventory of the vocal repertoire of the Hawaiian crow – a species that, Rutz discovered, has the ability to make and use tools for foraging and is believed to have a significantly more complex set of vocalisations than other crow species.
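One simple, label-free way to ask “how many call types are there?” is to embed each call, cluster the embeddings for a range of candidate repertoire sizes, and keep the size that scores best on an internal criterion such as the silhouette score. The sketch below does exactly that on synthetic two-dimensional “calls”; it is not ESP’s self-supervised method, just an illustration of the repertoire-counting step.

```python
# Unsupervised repertoire-size estimate: cluster call embeddings, pick best k.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(3)

# 60 synthetic "call embeddings" drawn from three hidden call types.
centres = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 6.0]])
calls = np.vstack([c + rng.standard_normal((20, 2)) for c in centres])

best_k, best_score = None, -1.0
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(calls)
    score = silhouette_score(calls, labels)
    if score > best_score:
        best_k, best_score = k, score

print(f"estimated number of call types: {best_k} (silhouette {best_score:.2f})")
```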
Rutz is particularly excited about the project’s conservation value. The Hawaiian crow is critically endangered and only exists in captivity, where it is being bred for reintroduction to the wild. It is hoped that, by taking recordings made at different times, it will be possible to track whether the species’s call repertoire is being eroded in captivity – specific alarm calls may have been lost, for example – which could have consequences for its reintroduction; that loss might be addressed with intervention. “It could produce a step change in our ability to help these birds come back from the brink,” says Rutz, adding that detecting and classifying the calls manually would be labour intensive and error prone.
Meanwhile, another project seeks to understand automatically the functional meanings of vocalisations. It is being pursued with the laboratory of Ari Friedlaender, a professor of ocean sciences at the University of California, Santa Cruz. The lab studies how wild marine mammals, which are difficult to observe directly, behave underwater and runs one of the world’s largest tagging programmes. Small electronic “biologging” devices attached to the animals capture their location, type of motion and even what they see (the devices can incorporate video cameras). The lab also has data from strategically placed sound recorders in the ocean.
ESP aims to first apply self-supervised machine learning to the tag data to automatically gauge what an animal is doing (for example whether it is feeding, resting, travelling or socialising) and then add the audio data to see whether functional meaning can be given to calls tied to that behaviour. (Playback experiments could then be used to validate any findings, along with calls that have been decoded previously.) This technique will be applied to humpback whale data initially – the lab has tagged several animals in the same group so it is possible to see how signals are given and received. Friedlaender says he was “hitting the ceiling” in terms of what currently available tools could tease out of the data. “Our hope is that the work ESP can do will provide new insights,” he says.
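Schematically, that pipeline has two stages: infer a behavioural state from the tag data, then cross-tabulate detected call types against those states to see which calls travel with which behaviours. The sketch below runs both stages on invented data; the features, labels and “call types” are all made up, and ESP intends to use self-supervised learning rather than the supervised classifier shown here.

```python
# Toy tag-to-behaviour classifier plus behaviour-vs-call-type cross-tabulation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
n = 400

# Invented tag features: [dive depth, movement intensity]; behaviour 1 = "feeding".
behaviour = rng.integers(0, 2, size=n)
tag_feats = np.c_[behaviour * 30 + rng.normal(0, 5, n),     # feeding dives deeper
                  behaviour * 2 + rng.normal(0, 0.5, n)]    # and moves more

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(tag_feats[:300], behaviour[:300])
pred = clf.predict(tag_feats[300:])
print("behaviour-classification accuracy:", (pred == behaviour[300:]).mean())

# Invented call detections: call type 2 happens mostly during feeding.
call_type = np.where(behaviour == 1,
                     rng.choice(3, size=n, p=[0.1, 0.1, 0.8]),
                     rng.choice(3, size=n, p=[0.45, 0.45, 0.1]))
table = np.zeros((2, 3), dtype=int)
for b, c in zip(behaviour, call_type):
    table[b, c] += 1
print("behaviour x call-type counts:\n", table)   # feeding row dominated by call 2
```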
But not everyone is as gung ho about the power of AI to achieve such grand aims. Robert Seyfarth is a professor emeritus of psychology at the University of Pennsylvania who has studied social behaviour and vocal communication in primates in their natural habitat for more than 40 years. While he believes machine learning can be useful for some problems, such as identifying an animal’s vocal repertoire, there are other areas, including the discovery of the meaning and function of vocalisations, where he is sceptical it will add much.
The problem, he explains, is that while many animals can have sophisticated, complex societies, they have a much smaller repertoire of sounds than humans. The result is that the exact same sound can be used to mean different things in different contexts, and it is only by studying the context – who the calling individual is, how they are related to others, where they fall in the hierarchy, who they have interacted with – that meaning can hope to be established. “I just think these AI methods are insufficient,” says Seyfarth. “You’ve got to go out there and watch the animals.”

A map of animal communication will need to incorporate non-vocal phenomena such as the “waggle dances” of honey bees. Photograph: Ben Birchall/PA
There is also doubt about the concept – that the shape of animal communication will overlap in a meaningful way with human communication. Applying computer-based analyses to human language, with which we are so intimately familiar, is one thing, says Seyfarth. But it can be “quite different” doing it to other species. “It is an exciting idea, but it is a big stretch,” says Kevin Coffey, a neuroscientist at the University of Washington who co-created the DeepSqueak algorithm.
Raskin acknowledges that AI alone may not be enough to unlock communication with other species. But he refers to research that has shown many species communicate in ways “more complex than humans have ever imagined”. The stumbling blocks have been our ability to gather sufficient data and analyse it at scale, and our own limited perception. “These are the tools that let us take off the human glasses and understand entire communication systems,” he says.
See: https://www.theguardian.com/science...telligence-really-help-us-talk-to-the-animals
Animals have developed their own ways of communicating over millions of years, while human speech—and, therefore, language—has long been argued to have been impossible before the arrival of anatomically modern Homo sapiens about 200,000 years ago (or, per a fossil discovery from 2017, about 300,000 years ago), on the grounds that only modern humans possess the permanently descended larynx thought to be needed to produce the full range of speech sounds. This line of thinking became known as laryngeal descent theory**, or LDT.
A review paper published in 2019 in Science Advances (https://www.science.org/doi/10.1126/sciadv.aaw3916) aims to tear down the LDT completely. Its authors argue that the anatomical ingredients for speech were present in our ancestors much earlier than 200,000 years ago. They propose that the necessary equipment—specifically, the throat shape and motor control that produce distinguishable vowels—has been around for as long as 27 million years, since humans and Old World monkeys (baboons, mandrills, and the like) last shared a common ancestor.
In any case, decoding and ultimately communicating with non-human species is extremely difficult and it may need to wait until the advent of the quantum computer for us to be able to have a chat with our dog, cat or horse, let alone the honey bee or a blue whale.
Hartmann352
** Laryngeal descent theory takes its name from the descent of the larynx away from the oral and nasal cavities in humans or other mammals, which can occur either temporarily during vocalization (dynamic descent) or permanently during development (permanent descent).
It has been known since the nineteenth century that adult humans are unusual in having a descended larynx. In most mammals, the resting position of the larynx is directly beneath the palate, at the back of the oral cavity, and the epiglottis (a flap of cartilage at the top of the larynx) can be inserted into the nasal passage to form a sealed respiratory passage from the nostrils to the lungs. In humans, in contrast, the larynx descends away from the palate during infancy, and adults can no longer engage the larynx into the nasal passages. This trait was once thought to be unique to humans and to play a central role in our ability to speak.
See: https://link.springer.com/referenceworkentry/10.1007/978-3-319-16999-6_3348-1