Migratory Bird classification and analysis
Aparna Pal
apal4@wisc.edu

Abstract
Using classification vectors to separate land birds from seabirds is a first step toward classifying the migratory patterns of several bird species. From the overall migratory patterns of land birds and seabirds, we can anticipate how those patterns will change as the climate changes. Furthermore, fluctuations in migratory patterns can serve as a useful index of global climate change, because migration covers such a wide span of geography.

Introduction
In recent decades, shifting climate patterns have been linked to erratic travel routes and nesting patterns across several species of migratory birds. Higher nest mortality has also been observed, because many of these species time their migration by the position of the sun, not by temperature. Many brooding pairs therefore arrive too early or too late in the season to feed their chicks properly, since insect reproduction depends on temperature, not on the time of year. Exceptions have been seen in some smaller birds with shorter migratory ranges, such as the common sparrow. It has been hypothesized that in recent years the sparrow has been able to shift its migratory pattern to be more temperature dependent than that of its larger compatriots. For the most part, however, nesting sites have been shifting gradually as climate change continues, leading birds that are not habituated to human interaction further and further into cityscapes and causing higher avian mortality.

The original motivation of this project was to predict changes in avian migration patterns from previous data. This would help offset the damage done to endangered species. A prime example of how this can be done is the whooping crane. By 2003 it was estimated that only 153 pairs of whooping cranes survived in the US, because their nesting sites exist in a decreasing number of swamplands, many of which are near human populations. Conservationists have been making efforts to hand-raise chicks and lead them to safer, more isolated nesting sites within the US. Many of these safer sites were picked by analyzing the common migratory routes of whooping cranes in past years in order to find locations that wild-bred birds could easily reach in the future. Analyzing the routes of other species would give us preliminary data with which to conserve other endangered and threatened species if needed. Furthermore, because climate change and migratory patterns have been linked, keeping an eye on erratic patterns can give us a good index of climate change on a global scale.

Methodology and Datasets
The most complete data set of avian migratory patterns exists on paper within the USGS. These records have been very difficult for individuals to acquire and analyze. As it stands, most avian analysis work has to be taken from several different research groups and stations and pieced together. Because these results are not comprehensive, large gaps exist in the migratory patterns of almost all species. Access to the USGS records would most likely require doctoral-level credentials as well as a formal request to the USGS, most likely because the data is kept by a government-funded facility and has not been stored in a way that allows easy remote access.

It therefore made more sense to change the approach from a prediction model to a classification problem. Using the collected data sets, migratory birds were classified as shorebirds or land birds based upon their location. Fortunately, some standard fields appeared in all of the data sets, giving a stable set of feature vectors. Using these feature vectors, a Java classifier was developed that compares the feature vectors as strings across training and testing sets; the results were then further analyzed through
Matlab. A weighted Euclidean distance formula was found to be sufficient for classification. (See the Appendix for the code and the data sets that were drawn from.)

Results and Discussion
Although a Euclidean distance calculation was deemed sufficient for the purposes stated above, the classification results showed some irregularities. Using 11 nearest neighbors gave the optimal classification rate: the rate jumped from 76.4% with 10 neighbors to 83.9% with 11. However, 12 neighbors yielded a rate in the high 70s (about 78%), and the rate continued to decrease back down to 76% as more neighbors were added.

The reason for these lower classification rates most likely lies in the data sets used. Since they came from multiple different sites, there may have been a pre-processing issue that was not accounted for. Furthermore, there were quite a few gaps in features between the sets, so only 5 features could be used in classification, which most likely contributed the most to the poor classification. Data from before 2009 was also found to be unreliable and much patchier, so only a few years of data could be used in classification and analysis. This most likely led to a much less stable classifier.

Conclusion
Based on the initial results of this experiment, several next steps could help both this classifier and the conservation work originally envisioned for this project. By contacting the USGS with a formal request, it may be possible to obtain at least a partial sample of the complete dataset of migratory patterns it has stored. This would be a large improvement over the cobbled-together dataset used in this experiment. Furthermore, with the improved classifier that could be developed from this much more stable and complete dataset, it could be possible to create a migration prediction model for endangered species. To do so, a larger dataset should be used, as a predictor would need more feature vectors for accuracy than a classifier would. It may also be prudent to look into which other learning models would be best suited to building a predictor. As it stands, however, the work done to create a classifier provides good preliminary data on what is left to be done.

References
1) "IBP the MAPS Program." IBP the MAPS Program. Web. 20 Mar. 2016.
2) "California Avian Data Center." California Avian Data Center. Web. 20 Mar. 2016.
3) Cotton, P. A., 2003, Avian migration phenology and global climate change. Proc. Natl Acad. Sci. USA 100, 12219-12222.
4) Richard Easterbrook, 2013, Ninigret National Wildlife Refuge Banding Summary 2008-2012.
5) U.S. Fish and Wildlife Service, 2012, Whooping Crane Eastern Partnership Annual Monitoring Report 2008-2012.
6) J. Wang and J.-D. Zucker, Solving the multiple-instance problem: a lazy learning approach, in Proceedings of the 17th International Conference on Machine Learning, Stanford, CA, 2000, pp. 1119-1125.
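
Illustrative sketch
The weighted Euclidean nearest-neighbor classification described under Methodology and Results can be sketched roughly as follows. This is not the Java or Matlab code referred to in the Appendix. It is a minimal Python illustration that assumes each record has already been reduced to five numeric features. The feature values, the weights, the label names, and all function names here are made-up placeholders, since the paper does not list them.

```python
import math
from collections import Counter


def weighted_euclidean(a, b, weights):
    """Return sqrt(sum_i w_i * (a_i - b_i)^2) for equal-length numeric vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def knn_predict(train, query, weights, k):
    """Label `query` by majority vote among its k nearest training records.

    `train` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(
        train, key=lambda rec: weighted_euclidean(rec[0], query, weights)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def classification_rate(train, test, weights, k):
    """Fraction of test records whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train, feats, weights, k) == label for feats, label in test
    )
    return hits / len(test)


def sweep_k(train, test, weights, ks):
    """Classification rate for each candidate neighbor count in `ks`."""
    return {k: classification_rate(train, test, weights, k) for k in ks}


# Made-up records: five hypothetical features, each scaled to [0, 1].
TOY_TRAIN = [
    ((0.10, 0.20, 0.10, 0.30, 0.20), "land"),
    ((0.20, 0.10, 0.20, 0.20, 0.10), "land"),
    ((0.15, 0.25, 0.10, 0.25, 0.30), "land"),
    ((0.90, 0.80, 0.90, 0.70, 0.80), "shore"),
    ((0.80, 0.90, 0.85, 0.80, 0.90), "shore"),
    ((0.85, 0.75, 0.80, 0.90, 0.70), "shore"),
]
TOY_TEST = [
    ((0.2, 0.2, 0.2, 0.2, 0.2), "land"),
    ((0.8, 0.8, 0.8, 0.8, 0.8), "shore"),
]
# Hypothetical per-feature weights; the paper does not report the ones it used.
TOY_WEIGHTS = (1.0, 1.0, 2.0, 0.5, 1.0)

if __name__ == "__main__":
    for k, rate in sweep_k(TOY_TRAIN, TOY_TEST, TOY_WEIGHTS, [1, 3, 5]).items():
        print(f"k={k}: classification rate {rate:.1%}")
```

Sweeping the neighbor count in this way mirrors the comparison reported in Results and Discussion, where the rate was compared at 10, 11, and 12 neighbors. On real data the weights and the candidate values of k would have to be chosen from the training set, and an odd k avoids tied votes between the two classes.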