Audio Beehive Monitoring Based on IoT-AI Techniques: A Survey and Perspective

In beekeeping, monitoring beehives plays an important role in keeping bee colonies healthy and in limiting adverse events within the colonies. A large number of studies have sought to improve beehive monitoring beyond the traditional manual methods. Most importantly, the recent application of artificial intelligence (AI) technologies has significantly improved the monitoring process. These new methods, however, have not yet been investigated or applied in Vietnam. To understand the use of AI-based technologies in the automatic monitoring of beehives, this paper provides a survey of beehive monitoring systems based on audio data and AI techniques. Opportunities and perspectives for applying these techniques to audio-based beehive monitoring in Vietnam are also discussed.


Introduction
The honey bee (Apis mellifera) is of crucial value to the agricultural industry around the world. Along with commercial products such as honey, beeswax, and royal jelly for human consumption, bees play an essential part in pollinating plants. It is estimated that roughly one-third of all crops grown for human consumption are pollinated by honey bees (Klein et al., 2007). In recent years, many factors have led to a decline of honey bee colonies, which calls for continuous and extensive monitoring to investigate the factors that may negatively affect the life cycle of bees.
Monitoring beehives to gather information about the honey bees' status, activities, and behavior is one of the most important tasks in beekeeping. It is not simply a check of the beehive for pollen, nectar, and brood quantities, but also an inspection for pests or diseases such as viruses, bacteria, fungi, and mite infestations. Regular checks are critical for finding and preventing these problems. In practice, this task is usually performed manually, and manual inspection has several drawbacks that limit its effectiveness. It is time-consuming, and it depends strongly on the experience and knowledge of beekeepers. Several symptoms may not be recognized by amateur or inexperienced beekeepers. It is also impossible for people to monitor beehives continuously and so promptly detect problems in a colony's status. Opening the hive for an internal inspection creates stress and panic in bees and can disrupt the life cycle of the colony, and frequent disturbances add to this stress. In many production areas, the hives may be located far away from human habitats, resulting in additional transportation costs. Thus, modern technology-based methods that reduce these disadvantages of traditional monitoring are encouraged. The general idea of these new approaches is to use a system of electronic devices attached to each beehive to collect data. The collected data are then analyzed and processed to obtain important information that allows beekeepers to monitor beehives remotely without interrupting the life cycles of the honey bee colonies.
Among several types of data collected from bee colonies, the buzzing sound plays a vital role in automated beehive monitoring. This is because bee buzzing carries information on colony behavior and phenology. For example, honey bees emit specific sounds when they are exposed to stressors such as predatory mites, failing queens, and airborne toxicants (Bromenshenk et al., 2009). In practice, many beekeepers listen to their hives to find out the state of their honey bee colonies. They can recognize the changes in buzzing sounds produced by stressed colonies. However, they may not always be able to determine the exact causes of the changes without hive inspections. The new technology-based methods are very helpful and effective in these situations. They can provide the beekeepers information about the state of the hives so that they can take precautionary measures to restore the hives' health. By these methods, IoT-based beehive monitoring systems are designed to capture large amounts of useful information about the behaviors of the bees. They consist of several physical devices embedded in a hive with sensors, actuators, electronics, software, and a network connection that enable recording of the buzzing sounds made by the honey bees, changing them into electrical signals, and then passing them to a programmable logic controller. Since the data collected at this stage is raw, extraction algorithms are needed to extract critical information from these raw data before applying artificial intelligence (AI) algorithms for each specific problem related to monitoring beehives. It is well documented in the literature that analyzing bee buzzing data is an effective way to monitor the health of beehives. A large number of studies have been carried out on this topic. Thorough descriptions of bee sounds and their characteristics were given in papers by Bencsik et al. (2007) and . 
Dietlein (1985) analyzed the acoustic signals produced by bees and gave definitions of frequency bands. A simple method based on acoustic technology to monitor beehives was presented by Vancata (1995). By using computational analysis of the acoustic vibrations captured with accelerometers, Ramsey et al. (2017) presented a method to identify a honey bee swarming event. The audio signals of honey bee swarms were also investigated by Schultz et al. (2008). A distributed audio monitoring system for apiaries was developed by Mezquida & Martinez (2009). Recently, AI algorithms have emerged as effective methods for recognizing and classifying beehive audio samples. For example, Kulyukin et al. (2018) studied deep learning and machine learning methods for audio beehive monitoring. Their results confirmed that AI algorithms could be used in real electronic beehive monitors. As in many other fields, these algorithms could revolutionize automatic beehive monitoring, and they have attracted growing interest in recent times. To the best of our knowledge, the research and development of audio-based beehive monitoring systems have not been considered in Vietnam. As a result, these new technologies have not been applied in any Vietnamese bee farms. This paper aims to provide a survey of audio-based beehive monitoring techniques that could be useful for researchers or beekeepers who want to apply advanced technologies to their work. While there are other types of data to consider, such as image data, the systems needed to collect them are more demanding. In addition, a camera for capturing bee images is much more expensive than sound sensors, and the humid air and high water vapor conditions in Vietnam may lead to unclear images unless these devices are custom-designed.
Therefore, we restricted this paper to the studies related to audio-based monitoring systems and the corresponding AI algorithms. The opportunities and perspectives of the applications of these technologies are also considered and discussed.

Materials and Methods
Among the three methods used to investigate the state of knowledge in a particular field, namely, "Delphi", "meta-analysis", and "content analysis" (Li & Cavusgil, 1995), we used the content analysis method in this study because it can categorize and analyze references in a systematic way. Applying this method involved two general steps. The first step was to determine the survey sources and the procedures for searching the relevant documents. The second step was to identify the categories for classifying and analyzing the references from the first step. The following keywords were used in the first step: monitoring beehive, honey bee monitoring, audio beehive monitoring, audio-based recognition, and bee sound analysis, among others.
In addition, we focused mainly on journal papers; however, other sources cited in these papers, such as proceedings, books, and PhD/MSc dissertations, were also taken into account.

IoT-based hardware
The first step of any data collection project is to design and install a system to collect the data from beehives. This problem has been addressed by many researchers. Gil-Lebrero et al. (2017) provided an overview of systems designed for monitoring beehives from the early years of the twentieth century to recent years. It can be seen from this overview that the monitoring systems have greatly progressed thanks to the development and applications of the IoT technologies. An IoT-based monitoring system is made up of embedded devices that can acquire data from different sensors, process them, and communicate with a computer with a cloud database. According to the discussion in the paper by Zganks (2019), two main architectures of an IoT-based monitoring service can be designed in which the crucial point is the IoT module and its capacity for processing power and memory. The first one corresponds to the IoT module with low processing power and memory, i.e., its function is only to capture the bee signals in the beehive and transmit them over a wireless connection to a backend server in a cloud (or on the edge), where the AI algorithms take part. This architecture has a variant where the IoT module can carry out feature extraction and transmit feature vectors instead of audio signals. The second one corresponds to the case when the IoT module has enough processing power and memory to run AI algorithms locally in the beehive. The second architecture requires a more complex design and unlimited wireless connectivity, and thus, has a higher cost.
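To make the difference between streaming raw audio and streaming feature vectors concrete, the following minimal sketch compares the wireless payload of the two variants of the first architecture. All numbers are illustrative assumptions rather than values from the cited studies: 16 kHz mono audio with 16-bit samples, and one vector of 13 coefficients (stored as 4-byte floats) every 10 ms.

```python
def raw_audio_bytes_per_second(sample_rate_hz: int, bytes_per_sample: int = 2) -> int:
    """Payload when the IoT module streams uncompressed mono PCM audio."""
    return sample_rate_hz * bytes_per_sample


def feature_bytes_per_second(hop_ms: float, n_coeffs: int = 13, bytes_per_value: int = 4) -> float:
    """Payload when the IoT module sends one feature vector per analysis frame."""
    frames_per_second = 1000.0 / hop_ms
    return frames_per_second * n_coeffs * bytes_per_value


# Assumed settings (hypothetical): 16 kHz sampling, 10 ms frame hop.
raw = raw_audio_bytes_per_second(16_000)
feat = feature_bytes_per_second(hop_ms=10.0)
print(f"raw audio: {raw} B/s, feature vectors: {feat:.0f} B/s, ratio: {raw / feat:.1f}x")
```

Under these assumptions, sending features instead of audio reduces the payload several-fold, at the price of running feature extraction on the IoT module.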
A general structure of a beehive monitoring system that meets several significant requirements, including (1) using a minimally invasive method, (2) being operational in remote areas for long periods, and (3) allowing real-time monitoring, was described in detail by Gil-Lebrero et al. (2017). The authors also presented descriptions of the hardware and software of the wireless node. Although the system was designed for capturing temperature and humidity data, it can be considered a general structure for systems capturing other data, such as sounds, by replacing the temperature and humidity sensors with sound sensors in the adapter board. Several specific systems based on this general structure have been designed to collect bee sound data. For example, the Raspberry Pi Model B (2016), one of a family of single-board computers approximately the size of a credit card developed by the non-profit Raspberry Pi Foundation, was used in a range of published papers (Schurischuster et al., 2016; Antonio, 2017; Crawford, 2017). A multi-sensor electronic beehive monitoring system called BeePi has also been quite popular (Kulyukin et al., 2016; Amlathe, 2018; Kulyukin, 2019). Each BeePi monitor consists exclusively of off-the-shelf components: a Raspberry Pi computer, a miniature camera, a microphone splitter, four microphones connected to the splitter, a solar panel, a temperature sensor, a battery, and a hardware clock (Kulyukin et al., 2018). Another hardware platform for bee sound data acquisition was used by Cecchi et al. (2019) and Qandour et al. (2014).

Feature extraction
Feature extraction plays an essential role in machine learning pipelines. After being extracted from the raw data, the important features will be fed to analytical algorithms to detect specific problems related to the beehives. This step is performed by using the artificial intelligence (AI) algorithms, which are described below.
There are several approaches used to extract meaningful features for ML models in bee sound analysis. Mel-frequency cepstral coefficients (MFCCs) (Davis & Mermelstein, 1980) appear most frequently in the literature as features extracted from audio samples (Amlathe, 2018; Kulyukin et al., 2018; Zganks, 2018; Nolasco et al., 2019; Zganks, 2019). The pipeline of the MFCC implementation is depicted in Figure 1. Nolasco et al. (2019) considered MFCCs, Mel spectrograms, and the Hilbert-Huang transform (HHT) (Huang & Schen, 2005) as feature representations for ML algorithms that recognize the presence of the queen. Experiments were conducted to test various scenarios by combining those extractors as the inputs for traditional models and convolutional neural networks. The best result for traditional models was obtained with a combination of MFCCs and HHTs. Zganks (2018) proposed an approach to recognize bee swarm activities by applying MFCCs as an extractor. The results showed that the approach could be applied in a real-life setting. In another paper by Zganks (2019), cepstral mean normalization was applied to the features extracted by using MFCCs. Linear predictive coding (LPC) was also compared with MFCCs, and the results showed that the models based on MFCCs performed better. The study by Amlathe (2018) used a combination of MFCCs, chroma STFT, Mel spectrograms, and Tonnetz, with 193 features in total, as inputs for ML models to classify bee buzzing, cricket chirping, and ambient noise. This work was extended by Kulyukin et al. (2018), in which 34 features were extracted (i.e., temporal features of raw data, MFCCs, and magnitudes of the DFT) and reduced to 13 features by applying principal component analysis (PCA). Feature selection techniques were also applied to improve the performance of standard ML models. Peak frequency (PF) and the mean of the short-time Fourier transform (STFT) were used in an ML model to detect Varroa mites (Qandour et al., 2014).
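As an illustration of the MFCC pipeline discussed above, the following is a minimal NumPy sketch of the commonly described steps (pre-emphasis, framing and windowing, power spectrum, Mel filterbank, logarithm, discrete cosine transform), followed by cepstral mean normalization. It is not the implementation used in any of the cited studies; the frame length, hop, FFT size, number of filters, and the synthetic test tone are all assumptions chosen for the example.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters equally spaced on the Mel scale, shape (n_filters, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank


def dct_matrix(n_out, n_in):
    """Unnormalized DCT-II basis, shape (n_out, n_in)."""
    n = np.arange(n_in)
    k = np.arange(n_out)[:, None]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))


def mfcc(signal, sample_rate, frame_len=400, hop=160, n_fft=512, n_filters=26, n_coeffs=13):
    """Return an array of shape (n_frames, n_coeffs); needs len(signal) >= frame_len."""
    signal = np.asarray(signal, dtype=float)
    # 1. Pre-emphasis boosts the high-frequency part of the spectrum.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # 2. Split into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # 3. Power spectrum of each (zero-padded) frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # 4. Mel filterbank energies, then logarithm (small constant avoids log(0)).
    log_energies = np.log(power @ mel_filterbank(n_filters, n_fft, sample_rate).T + 1e-10)
    # 5. DCT decorrelates the log energies; keep the first n_coeffs coefficients.
    return log_energies @ dct_matrix(n_coeffs, n_filters).T


# Synthetic stand-in for a hive recording (NOT real data): 1 s, 250 Hz tone plus noise.
sample_rate = 16_000
rng = np.random.default_rng(0)
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 250.0 * t) + 0.1 * rng.normal(size=t.size)

coeffs = mfcc(signal, sample_rate)
# Cepstral mean normalization: subtract the per-coefficient mean over time.
normalized = coeffs - coeffs.mean(axis=0)
print(coeffs.shape)
```

In practice an audio library would normally be used instead of hand-written filterbanks; the sketch is only meant to show what each stage of the pipeline computes.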
Besides using ML for recognition, bee behaviors can also be estimated by applying audio signal analysis and audio processing techniques. Ferrari et al. (2008) proposed a method to detect swarming in advance by analyzing spectrograms. The low-frequency components of the sound produced by honey bees increase before a swarming event. Michelsen et al. (1986) analyzed the frequency of waggle dance sounds and found a correlation between bees' dancing and the harmonics around 320 Hz. Brundage (2012) proposed a method to assess the productivity of beehives by analyzing the frequency of bee sounds near the entrance of the beehive. Terenzi et al. (2019) investigated the performance of wavelet transforms (WT) as a feature extraction technique for honey bee sound classification in comparison to the STFT, MFCCs, and HHTs. They concluded that WT-based approaches performed better than the others and that the WT coefficients can be used in ML algorithms.
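Frequency-domain analyses of this kind often reduce to asking how much of the signal power lies in a given band. The sketch below, which is only an illustration and not the procedure of any cited study, computes that fraction with an FFT. The band edges (200 to 300 Hz) and the synthetic tones are assumptions made for the example.

```python
import numpy as np


def band_energy_ratio(signal, sample_rate, f_low, f_high):
    """Fraction of the total spectral power that falls inside [f_low, f_high] Hz."""
    signal = np.asarray(signal, dtype=float)
    windowed = signal * np.hanning(len(signal))   # Hann window limits spectral leakage
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    in_band = (freqs >= f_low) & (freqs <= f_high)
    return power[in_band].sum() / power.sum()


# Synthetic signals (NOT real hive recordings): a 250 Hz tone and a 600 Hz tone, each with noise.
sample_rate = 8_000
rng = np.random.default_rng(0)
t = np.arange(sample_rate) / sample_rate
low_tone = np.sin(2 * np.pi * 250.0 * t) + 0.05 * rng.normal(size=t.size)
high_tone = np.sin(2 * np.pi * 600.0 * t) + 0.05 * rng.normal(size=t.size)

low_tone_ratio = band_energy_ratio(low_tone, sample_rate, 200.0, 300.0)
high_tone_ratio = band_energy_ratio(high_tone, sample_rate, 200.0, 300.0)
print(f"250 Hz tone: {low_tone_ratio:.3f}, 600 Hz tone: {high_tone_ratio:.3f}")
```

Applied to consecutive segments of a recording, such a ratio gives a simple time series that could be inspected for shifts in the low-frequency content.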

Recognition of the bee queen's absence
The queen bee plays a very important role in a colony. Her main functions in the colony are to release pheromones and produce eggs. Checking whether or not the queen bee is healthy and capable of laying eggs is perhaps one of the most critical tasks for beekeepers. The absence of the queen bee means that there are no more eggs or larvae, i.e., there are no younger bees that could replace the older ones. As a consequence, the life of the whole colony will be exposed to danger, since a worker bee only lives up to 40 days and its work is usually taken over by another bee after that (Tautz, 2008). That is to say, the lack of a healthy queen bee is extremely unfavorable, and this absence should be detected as soon as possible. The timely detection of the queen's absence will lead to an accurate understanding of the cause of the absence as well as an appropriate intervention to introduce a new queen bee to the colony. This is a major concern in several studies in the literature. Acoustic methods based on spectrographic analysis and a Kohonen self-organising map (SOM) artificial neural network have been used to investigate the queenless state (Howard et al., 2013). The results showed that queenless hives exhibit bulges in the frequency spectrum, relative to normal activity, in the relatively low-frequency range between 165 Hz and 285 Hz. However, using a SOM to classify the data from different hives was less successful. Robles-Guerrero et al. (2017) used a logistic regression (LR) model for honey bee buzz recognition. The sound from two bee colonies was recorded and analyzed in the Mel frequency domain. The first was a healthy colony with a queen, and the second was a hive with no queen and a reduced population. The results showed that the two conditions could be classified with high classification rates using a reduced set of features. However, the authors also noted that data from more hives need to be analyzed in order to confirm these results. A support vector machine (SVM) with a Gaussian kernel was used by Cejrowski et al. (2018) to classify data from different states of the beehive. The SVM was also applied by Nolasco (2018) and Nolasco et al. (2019), in that case with a radial basis function kernel. In these studies, the SVM generalized to unseen hives with better accuracy than the convolutional neural networks (CNN). However, the CNN achieved good results in a hive-dependent scenario, which indicates the feasibility of applying deep learning methods to this problem.
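To illustrate the kind of binary classifier mentioned above, the following minimal sketch trains a logistic regression model by batch gradient descent and evaluates it on held-out samples. It is not the model of any cited study. The four-dimensional feature vectors are synthetic and well separated by construction, and the learning rate, iteration count, and train/test split are assumptions.

```python
import numpy as np


def train_logistic_regression(X, y, lr=0.1, n_iter=2000):
    """Batch gradient descent on the mean cross-entropy loss; returns (weights, bias)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probability of class 1
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b


def predict(X, w, b):
    """Class 1 when the linear score is positive (probability above 0.5)."""
    return (X @ w + b > 0).astype(int)


rng = np.random.default_rng(0)
# Synthetic stand-ins for per-recording feature vectors (NOT real hive data).
queenright = rng.normal(loc=-1.0, scale=0.5, size=(100, 4))   # label 0
queenless = rng.normal(loc=1.0, scale=0.5, size=(100, 4))     # label 1
X = np.vstack([queenright, queenless])
y = np.concatenate([np.zeros(100), np.ones(100)])

# Hold out 50 shuffled samples for testing.
order = rng.permutation(len(y))
train, test = order[:150], order[150:]
w, b = train_logistic_regression(X[train], y[train])
accuracy = float(np.mean(predict(X[test], w, b) == y[test]))
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

Because training and test samples here come from the same synthetic distribution, the score says nothing about generalization to unseen hives, which is the harder setting highlighted in the studies above.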

Recognition of swarming
Swarming, a natural process in the life of a honey bee colony, is one of the activities in a beehive that requires an immediate response from beekeepers. It occurs when a large group of honey bees leaves an established colony and flies off to start a new colony in a new hive. Since honey bee swarms may contain several hundred to several thousand worker bees, a few drones, and an old queen, swarming influences both the current status of the beehive and the future productivity of the colony. Several studies have been carried out to predict the moment of swarming by identifying the main factors that could lead a hive to swarm (Seeley & Tautz, 2001; Visscher & Seeley, 2007; Rittschof & Seeley, 2008; Kridi et al., 2014). An overview of the factors connected with swarming and the related studies can be found in the paper of Ferrari et al. (2008), in which a method for predicting the swarming period was proposed based on labeling the bees' sounds. Zganks (2018) used a hidden Markov model (HMM) to separate bee swarm activity from routine daily activity based on the open audio data provided by the Open Source Beehives Project ("OSBeehives", 2020). The SVM algorithm, which was used to detect the queenless status, can also be applied to detect swarming in honey bee colonies, as stated by Cejrowski et al. (2018). A system with Gaussian mixture models (GMM) and HMMs with different topologies has also been proposed to analyze swarm classification performance (Zganks, 2019). The evaluation of these methods for monitoring bee swarm activity from the captured audio signal showed promising results. However, there is a difference between the GMM and HMM approaches in the case of a medium topology that needs to be analyzed further to obtain better classification performance. The identification of the honey bee swarming process by analyzing the time course of hive vibrations with principal component analysis (PCA) was introduced by Bencsik et al. (2007).
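PCA appears in several of the surveyed studies, both for analyzing hive vibrations and for reducing feature sets. The following minimal sketch computes PCA via the singular value decomposition. The data are random and purely illustrative; the shapes (34 input features reduced to 13 components) merely mirror the feature counts reported for Kulyukin et al. (2018) and do not reproduce that study.

```python
import numpy as np


def pca_fit(X, n_components):
    """Return (mean, components, explained variance ratio) for the top components."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    return mean, vt[:n_components], explained[:n_components]


def pca_transform(X, mean, components):
    """Project centered data onto the principal directions."""
    return (X - mean) @ components.T


rng = np.random.default_rng(0)
# Synthetic feature matrix (NOT real data): 200 samples, 34 correlated features
# generated from 5 hidden factors plus a little noise.
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 34))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 34))

mean, components, explained = pca_fit(X, n_components=13)
Z = pca_transform(X, mean, components)
print(Z.shape, f"variance retained: {explained.sum():.3f}")
```

The projected features are mutually uncorrelated, which is one reason PCA is a common preprocessing step before classifiers such as LDA or SVM.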

Recognition of beehive sounds
The first fundamental task of any audio-based beehive monitoring technology is to recognize bee sounds and discriminate them from non-bee sounds that might be captured. The non-bee sounds are usually related to the environment and events occurring in the surroundings of the hive, such as urban sounds, rain, or other animals like crickets. If a system fails to distinguish bee sounds from these non-bee sounds, all subsequent data analysis for specific problems will also fail. Therefore, automatically detecting sounds produced by bees and distinguishing them from unrelated external sounds in the audio data captured inside beehives is the main goal of a number of studies. Amlathe (2018) used four machine learning algorithms to train classification models that detect bee sounds in audio samples, namely the LR algorithm, the K-nearest neighbor (KNN) algorithm, the random forest (RF) algorithm, and the SVM algorithm. According to this study, the best accuracy for the three-way classification problem (classifying manually labeled audio samples into one of three categories: bee buzzing, cricket chirping, or ambient noise) was achieved using the random forest algorithm, while the best accuracy for the day classification problem (classifying audio samples into one of six time categories in a day) was achieved using the KNN classifier. The other methods also achieved high accuracy. This study was extended to compare the performance of these four machine learning methods with several CNNs, i.e., the RawConvNet (a CNN that classifies raw audio samples) and three deeper SpectConvNets (CNNs that classify audio spectrogram images from audio datasets) with different combinations of fully connected and convolution layers (Kulyukin et al., 2018). Their experiments indicated that the RawConvNet could perform better than the SpectConvNets, while these SpectConvNets outperformed the four machine learning models. That is to say, adding more layers may not necessarily improve classification performance on the audio datasets considered in this study. The authors also discussed the main trade-off between the ConvNets and the machine learning methods, which was between feature engineering and training time. The performance of the SVM and the CNN for beehive sound recognition was also compared using recordings from the Open Source Beehives Project (Nolasco & Benetos, 2018). These experiments showed that the SVM led to slightly better results and that less overfitting occurred for the SVM, which may indicate better generalization capabilities than the CNN. This result is in accordance with the findings in another paper by Nolasco et al. (2019).
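As an illustration of the simplest of the classifiers named above, the following sketch implements a K-nearest neighbor classifier for a three-way problem with the same class names (bee buzzing, cricket chirping, ambient noise). The two-dimensional features are synthetic clusters invented for the example, and the value k = 5 is an assumption; nothing here reproduces the cited experiments.

```python
import numpy as np

LABELS = ["bee buzzing", "cricket chirping", "ambient noise"]


def knn_predict(X_train, y_train, X_query, k=5):
    """Label each query row by majority vote among its k nearest training rows."""
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest]
    # bincount needs non-negative integer labels; ties go to the lowest label.
    return np.array([np.bincount(row).argmax() for row in votes])


rng = np.random.default_rng(0)
# Hypothetical cluster centers, one per class (NOT derived from real recordings).
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])


def sample(n_per_class):
    X = np.vstack([c + rng.normal(scale=0.7, size=(n_per_class, 2)) for c in centers])
    y = np.repeat(np.arange(len(centers)), n_per_class)
    return X, y


X_train, y_train = sample(60)
X_test, y_test = sample(20)
predicted = knn_predict(X_train, y_train, X_test, k=5)
accuracy = float(np.mean(predicted == y_test))
print(f"accuracy on synthetic data: {accuracy:.2f}; first prediction: {LABELS[predicted[0]]}")
```

With real audio, the rows of X_train would be feature vectors such as MFCCs, and the choice of k and of the distance measure would need to be validated on held-out recordings.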

Recognition of pest infestations
Another application of AI techniques in monitoring beehives based on audio data is to detect pest infestations. This line of research emerged only recently, from a study in which the authors focused on developing a system that could recognize the different acoustic characteristics produced by a healthy colony and a colony infested with Varroa mites (Qandour et al., 2014). Within the last few decades, the global spread of the parasitic mite Varroa destructor, better known as the Varroa mite, has resulted in the loss of millions of honey bee (Apis mellifera) colonies, and it is undoubtedly the biggest mortal threat to honey bees now (Martin, 2001). Many studies have been carried out to find efficient methods to detect Varroa mites as soon as possible. However, most of these studies were based on image data (Ramírez et al., 2012; Schurischuster et al., 2016). In the work of Qandour et al. (2014), the authors used the PCA algorithm to reduce the features obtained in the feature extraction step. This reduced feature set was then passed to a linear discriminant analysis (LDA) algorithm and the SVM algorithm. The results showed that the LDA led to the highest accuracy, while the SVM resulted in a slightly improved computational time. In general, both methods were capable of distinguishing between infested and healthy beehive sounds.
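For readers unfamiliar with LDA, the following minimal sketch shows a two-class Fisher linear discriminant in NumPy. It is a generic textbook formulation with equal class priors, not the configuration used by Qandour et al. (2014). The three-dimensional feature vectors stand in for PCA-reduced audio features and are entirely synthetic.

```python
import numpy as np


def fit_lda(X0, X1):
    """Two-class Fisher discriminant; returns (projection vector, decision threshold)."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix.
    Sw = (np.cov(X0, rowvar=False) * (len(X0) - 1)
          + np.cov(X1, rowvar=False) * (len(X1) - 1))
    w = np.linalg.solve(Sw, mu1 - mu0)
    # With equal priors, the boundary lies midway between the projected class means.
    threshold = w @ (mu0 + mu1) / 2.0
    return w, threshold


def predict_lda(X, w, threshold):
    """Return 1 (second class) when the projection exceeds the threshold, else 0."""
    return (X @ w > threshold).astype(int)


rng = np.random.default_rng(0)
# Synthetic stand-ins for reduced feature vectors (NOT real hive data).
healthy_train = rng.normal(loc=0.0, scale=0.7, size=(100, 3))    # label 0
infested_train = rng.normal(loc=2.0, scale=0.7, size=(100, 3))   # label 1
healthy_test = rng.normal(loc=0.0, scale=0.7, size=(100, 3))
infested_test = rng.normal(loc=2.0, scale=0.7, size=(100, 3))

w, threshold = fit_lda(healthy_train, infested_train)
X_test = np.vstack([healthy_test, infested_test])
y_test = np.concatenate([np.zeros(100, dtype=int), np.ones(100, dtype=int)])
accuracy = float(np.mean(predict_lda(X_test, w, threshold) == y_test))
print(f"accuracy on synthetic data: {accuracy:.2f}")
```

The classifier is cheap to train and evaluate, which makes this family of methods a plausible candidate for low-power IoT modules, although that suitability would have to be confirmed on real recordings.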

Opportunities for the application of AI techniques in audio-based beehive monitoring in Vietnam
In Vietnam, beekeeping provides a significant source of income, improving the lives of producers as well as making an important contribution to Vietnam's export turnover. Each year, Vietnam produces over 55,000 tons of honey and more than 1,000 tons of beeswax, of which about 85-90% is exported. Vietnam is ranked 6th in the world and 2nd in Asia in honey exports. However, it should also be pointed out that beekeeping in Vietnam still has several limitations, especially in the use of modern technologies. The production process is based purely on manual methods, including manual inspections, leading to low productivity and quality. In this section, we discuss some opportunities and further perspectives for applying AI techniques to improve the quality of beekeeping and honey production in Vietnam.

Applying the available methods to monitor honey beehives
As discussed in the previous section, AI algorithms have been successfully applied to solve several important problems related to the non-invasive monitoring of beehives based on the analysis of sound generated within the beehives. These problems involve the recognition of bee sounds, the recognition of the queen's absence, and the recognition of swarming. In several studies, not only the methods but also the designs of the systems have been described in detail (Cejrowski et al., 2018; Kulyukin et al., 2018). Moreover, the devices and sensors recording the bee sounds are reasonably priced and suitable for the conditions of production in Vietnam. Therefore, these results can be applied to develop AI-based systems for classifying beehive states from bee sounds in bee farms in Vietnam. This will provide beekeepers with a continuous and autonomous analysis of their beehives. The architecture of our proposed approach is illustrated in Figure 2.

Exploring new AI algorithms for bee sound analysis
We found in reviewing the literature on acoustic methods for monitoring beehives that most studies focused on the problem of feature extraction from raw data. Although several AI algorithms, which are listed in Table 1, have been proposed to process the features from the extraction step, these were limited to standard machine learning algorithms and a few popular deep learning algorithms. Many other AI algorithms are well documented to perform very well in recognizing and classifying audio data, such as the Boltzmann machine (Salakhutdinov & Hinton, 2009), deep recurrent neural networks (Phan et al., 2017), long short-term memory (LSTM) networks (Hochreiter & Schmidhuber, 1997), and LSTM autoencoders (Marchi et al., 2017), among others. These are promising algorithms for analyzing bee sounds since they have shown very high performance in speech recognition. A problem that needs to be considered in using these algorithms is how to build a bee sound dataset large enough to train the classification model.

Automatic identification of other important beehive states
The recognition of Varroa mites is just one of several problems related to the recognition of pest infestations based on audio data. After the study of Qandour et al. (2014), we could not find any other studies related to this problem in the literature. That is to say, more research is needed in this area. One could use other AI algorithms (as discussed in the previous subsection) or one could combine sound data analyses with other types of data like humidity and temperature within a beehive to improve the performance of the monitoring process (Ferrari et al., 2008). The system to collect these data would need to have a simple design and low cost, consistent with the production conditions in Vietnam.

Conclusions
An audio-based beehive monitoring system includes many components, ranging from designing an IoT-based system to the use of feature extraction and AI algorithms. Among them, the AI algorithms contribute significantly to the performance and accuracy of the monitoring system. This paper has presented a survey on the AI techniques used in an audio-based beehive monitoring system. Several opportunities for the applications of these advanced techniques to monitoring beehives in Vietnam have been discussed. This is a promising topic worthy of being considered to contribute to improving the quality of monitoring honey bee colonies in Vietnam.