Introduction 

Changes in global temperatures have resulted in more frequent and intense rainfall events, which can lead to flooding in areas that are not equipped to handle large amounts of water. Deforestation can also contribute to increased flooding. Trees and other vegetation play a key role in regulating the flow of water, and their removal can result in more water running off the surface and into rivers and streams.
Flood monitoring with satellite images is an effective method of detecting and tracking floods. Satellite imagery is analyzed over time to detect changes in water levels and identify flooded areas, and we can train algorithms to perform this detection based on a set of predefined criteria. Amazon SageMaker geospatial capabilities make it easier for data scientists and machine learning (ML) engineers to build, train, and deploy ML models using geospatial data. These capabilities also provide pre-trained models, including a land cover segmentation model that can be run with a simple API call and leveraged to analyze changes in the water level.

Solution Overview

Let’s further understand how SageMaker geospatial capabilities make it easy to build, train, and deploy models using geospatial data. For our flood monitoring use case, we use Sentinel-2 data from the Amazon Sustainability Data Initiative (ASDI). In this post, we first show you how to use the new SageMaker geospatial capabilities to visualize geospatial images from Sentinel-2, and then process these images to segment and classify the water coverage. This helps analyze flooding in a defined area.

Prerequisites

To get hands-on experience with all the features described in this post, complete the following prerequisites:
Ensure that you have an AWS account, secure access to log in to the account via the AWS Management Console, and AWS Identity and Access Management (IAM) permissions to use Amazon SageMaker and Amazon Simple Storage Service (Amazon S3) resources.
Onboard to a SageMaker domain and access Studio to use notebooks. For instructions, refer to Onboard to Amazon SageMaker Domain. If you’re already using Studio, upgrade to the latest version.

Solution walkthrough with Jupyter Notebook using SageMaker Geospatial API

Data access

The new geospatial capabilities in SageMaker offer easy access to geospatial data sources such as Sentinel-2 and Landsat 8. Built-in geospatial dataset access saves weeks of effort otherwise lost to collecting data from various data providers and vendors. For our analysis, we use a SageMaker Studio notebook with a SageMaker geospatial image, following the steps outlined in Getting Started with Amazon SageMaker geospatial capabilities.
The amazon-sagemaker-examples GitHub repository (https://github.com/aws/amazon-sagemaker-examples/tree/main/sagemaker-geospatial/lake-mead-drought-monitoring) contains similar notebooks that served as the basis for this post. You can easily query data using SageMaker geospatial capabilities. We first create a bounding box around the Mississippi river to represent the AreaOfInterest (AOI). To choose data from June 2018 to June 2019, we use the TimeRangeFilter. Cloud cover can often block our view of the location, so to obtain less cloudy images, we select a subset by setting the upper bound for cloud coverage to 20%.
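A query combining these three filters can be sketched as follows. The helper name `build_raster_query` and the placeholder coordinates are illustrative, not taken from the original notebook; the dictionary shape follows the SageMaker geospatial raster data collection query:

```python
def build_raster_query(coordinates, start_time, end_time, max_cloud_cover=20.0):
    """Build a raster data collection query: a polygon AOI, a time range,
    and an upper bound on cloud cover (in percent)."""
    return {
        "AreaOfInterest": {
            "AreaOfInterestGeometry": {
                "PolygonGeometry": {"Coordinates": coordinates}
            }
        },
        "TimeRangeFilter": {"StartTime": start_time, "EndTime": end_time},
        "PropertyFilters": {
            "Properties": [
                {
                    "Property": {
                        "EoCloudCover": {
                            "LowerBound": 0.0,
                            "UpperBound": max_cloud_cover,
                        }
                    }
                }
            ],
            "LogicalOperator": "AND",
        },
    }

# Illustrative placeholder bounding box (lon/lat ring), not the exact
# AOI used in this post
aoi_coordinates = [
    [
        [-90.3, 38.2],
        [-90.3, 38.8],
        [-89.8, 38.8],
        [-89.8, 38.2],
        [-90.3, 38.2],
    ]
]
query = build_raster_query(
    aoi_coordinates, "2018-06-01T00:00:00Z", "2019-06-01T23:59:59Z"
)
```

The resulting dictionary can then be passed, together with the ARN of the Sentinel-2 raster data collection, to `sg_client.search_raster_data_collection` to list matching scenes.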

Model inference

After the data has been identified, the next stage is to extract water bodies from the satellite images. Normally, to distinguish the various kinds of physical material on the earth’s surface, such as water bodies, vegetation, and snow, we would need to train a land cover segmentation model from scratch. Training a model from scratch is time and resource intensive, involving data labeling, model training, and deployment. SageMaker geospatial capabilities provide a pre-trained land cover segmentation model that can be run with a simple API call.
Rather than requiring you to download the data to a local machine for inference, SageMaker does all the heavy lifting for you. We simply specify the data setup and model configuration in an Earth Observation Job (EOJ).
SageMaker automatically downloads and prepares the satellite image data for the EOJ, then runs model inference. The EOJ can take several minutes to many hours to complete, depending on the workload (the number of images run through model inference). You can monitor the job status using the get_earth_observation_job function.
coordinates = []  # AOI bounding box coordinates (elided in the original)
daterange = {
    "2018-06-01T00:00:00Z",
    "2019-06-01T23:59:59Z",
}

# Perform land cover segmentation on images returned from the Sentinel-2 dataset
eoj_input_config = {}  # raster data collection query (elided in the original)
eoj_config = {"LandCoverSegmentationConfig": {}}

response = sg_client.start_earth_observation_job()  # job parameters elided in the original
eoj_arn = response["Arn"]

job_details = sg_client.get_earth_observation_job(Arn=eoj_arn)
{k: v for k, v in job_details.items() if k in ["Arn", "Status", "DurationInSeconds"]}
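Because the job can run from minutes to hours, a small polling helper can wait for it to finish. This is a sketch; the set of terminal status values is an assumption to verify against the SageMaker geospatial API reference:

```python
import time

# Assumed terminal states for an Earth Observation Job; verify against
# the SageMaker geospatial API reference
TERMINAL_STATES = {"COMPLETED", "FAILED", "STOPPED", "DELETED"}

def is_finished(status):
    """Return True once a job status is terminal."""
    return status in TERMINAL_STATES

def wait_for_eoj(client, eoj_arn, poll_seconds=30):
    """Poll get_earth_observation_job until the EOJ reaches a terminal state."""
    while True:
        details = client.get_earth_observation_job(Arn=eoj_arn)
        if is_finished(details["Status"]):
            return details
        time.sleep(poll_seconds)
```

In a notebook you would call `wait_for_eoj(sg_client, eoj_arn)` after starting the job, then inspect the returned details for the final status.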

Analysis  

The results of the EOJ can be exported to an Amazon Simple Storage Service (Amazon S3) bucket using the export_earth_observation_job function. We use the data in Amazon S3 for subsequent analysis to determine the water surface area. SageMaker also simplifies dataset management: instead of crawling thousands of files in the S3 bucket, we can simply share the EOJ results using the job ARN. Each EOJ becomes an asset in the data catalog, because results can be grouped by the job ARN.
sagemaker_session = sagemaker.Session()
s3_bucket_name = sagemaker_session.default_bucket()  # Replace with your own bucket if needed
s3_bucket = session.resource("s3").Bucket(s3_bucket_name)

prefix = "eoj_flood_detection"  # Replace with the S3 prefix desired
export_bucket_and_key = f"s3://{s3_bucket_name}/{prefix}/"
eoj_output_config = {"S3Data": {"S3Uri": export_bucket_and_key}}

export_response = sg_client.export_earth_observation_job()  # export parameters elided in the original
Next, we analyze changes in the water level in the Mississippi river. We download the land cover masks to our local instance to calculate the water surface area using open-source libraries. SageMaker saves the model outputs in Cloud Optimized GeoTIFF (COG) format.
import os
from glob import glob
import cv2
import numpy as np
import tifffile
import matplotlib.pyplot as plt
from urllib.parse import urlparse
from botocore import UNSIGNED
from botocore.config import Config

# Download land cover masks
os.makedirs(mask_dir, exist_ok=True)

image_paths = []
for s3_object in s3_bucket.objects.filter(Prefix=prefix).all():
    path, filename = os.path.split(s3_object.key)
    if "output" in path:
        mask_name = mask_dir + "/" + filename
        s3_bucket.download_file(s3_object.key, mask_name)
        print("Downloaded mask: " + mask_name)

# Download source images for visualization
for tci_url in tci_urls:
    url_parts = urlparse(tci_url)
    img_id = url_parts.path.split("/")[-2]
    tci_download_path = image_dir + "/" + img_id + "_TCI.tif"
    cogs_bucket = session.resource(
        "s3", config=Config(signature_version=UNSIGNED, region_name="us-west-2")
    ).Bucket(url_parts.hostname.split(".")[0])
    cogs_bucket.download_file(url_parts.path[1:], tci_download_path + ".1")
    print("Downloaded image: " + img_id)

print("Downloads complete.")
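With the masks downloaded, the water surface area can be computed by counting water-class pixels. The following is a minimal sketch; the water class label and the 10 m pixel size are assumptions to verify against the land cover segmentation model’s class map and the Sentinel-2 product resolution:

```python
import numpy as np

def water_surface_area(mask, water_class=6, pixel_size_m=10.0):
    """Compute the water surface area (in km^2) covered in a land cover mask.

    mask         -- 2-D array of per-pixel class labels read from a COG
    water_class  -- label the model assigns to water (an assumption;
                    verify against the model's class map)
    pixel_size_m -- ground resolution of one pixel in meters
                    (Sentinel-2 visual bands are 10 m)
    """
    water_pixels = np.count_nonzero(mask == water_class)
    return water_pixels * (pixel_size_m ** 2) / 1e6  # m^2 -> km^2

# Example on a synthetic 4x4 mask containing 4 water pixels:
demo_mask = np.array([
    [6, 6, 1, 1],
    [6, 6, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
])
area_km2 = water_surface_area(demo_mask)  # 4 pixels * 100 m^2 = 0.0004 km^2
```

Applying this function to each mask read with `tifffile.imread`, keyed by acquisition date, yields a time series of water surface area over the AOI.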