Data Mining and Machine Learning in Earth Observation – An Application for Tracking Historical Algal Blooms


Alexandria Dominique Farias and Gongling Sun, International Space University, France


Earth Observation (EO) satellites now produce so much data that manual processing is often no longer a feasible option for analysis. The main challenges in studying these data are their size, their complex nature, a high barrier to entry, and the limited availability of datasets suitable for training. As a result, there is a prominent trend toward techniques that automate the analysis and toward hosting the processing on large online cloud servers. These techniques include data mining (DM) and machine learning (ML); those discussed here are clustering, regression, neural networks, and convolutional neural networks (CNNs). This paper shows how some of these techniques are currently used in the field of Earth observation and discusses some of the challenges the field currently faces. Google Earth Engine (GEE) was chosen as the tool for this study. GEE is currently able to display 40 years of historical satellite imagery, including publicly available datasets such as Landsat and the Sentinel data from Copernicus. Using EO data from Landsat with GEE as the processing tool, it is possible to classify and discover historical algal blooms over a ten-year period in the Baltic Sea surrounding the Swedish island of Gotland. The paper shows how these technical advances, including the use of a cloud platform, enable such data to be processed and analysed in minutes.
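As a rough illustration of how clustering can separate bloom pixels from open water, the sketch below runs a plain two-cluster k-means on a normalized difference index computed from red and near-infrared (NIR) reflectance. It is not the workflow used in the paper and makes no Google Earth Engine calls. The reflectance values are synthetic, the function names are hypothetical, and the premise that surface blooms show higher NIR than red reflectance is a simplifying assumption made only for this example.

```python
from statistics import mean

def normalized_difference(nir, red):
    """Normalized difference of two band reflectances, in [-1, 1]."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total

def kmeans_1d(values, k=2, iterations=20):
    """Plain k-means on scalar values (k >= 2); returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted values.
    centroids = [ordered[(len(ordered) - 1) * i // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels

# Synthetic (red, NIR) reflectance pairs, invented for illustration only.
pixels = [(0.04, 0.02), (0.05, 0.02), (0.04, 0.03),   # assumed open water: NIR < red
          (0.05, 0.12), (0.06, 0.15), (0.05, 0.11)]   # assumed surface bloom: NIR > red

index = [normalized_difference(nir, red) for red, nir in pixels]
centroids, labels = kmeans_1d(index, k=2)

# Treat the cluster with the higher mean index as the candidate bloom class.
bloom_cluster = centroids.index(max(centroids))
bloom_mask = [lab == bloom_cluster for lab in labels]
print(bloom_mask)
```

On real imagery the same idea would be applied per pixel across a whole scene, and the resulting clusters would still need to be checked against reference observations before being interpreted as blooms.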


Earth Observation, Remote Sensing, Satellite Data, Data Mining, Machine Learning, Google Earth Engine, Algal Blooms, Phytoplankton Bloom, Cyanobacteria

Volume 10, Number 2