Deforestation has reached critical levels globally, with potentially irreversible consequences for environmental sustainability and for climate change. Large forest fires now occur year-round in regions ranging from the Amazon rainforest in Brazil to the western United States, a sign of severe ecosystem disruption. Deforestation also eliminates natural carbon sinks, one of the few remaining mechanisms that buffer the effects of anthropogenic emissions.
Sustainability solutions benefit from combining relevant datasets, whether customer-generated (e.g., building temperatures, vehicle locations) or external (e.g., weather patterns, satellite imagery). Hosting such datasets centrally on a platform like AWS gives customers access without the burden of transferring and storing the data themselves, which speeds up development. Landsat 8 imagery is one example: the catalog is petabyte-scale and requires significant pre-processing, but AWS hosts it publicly, lowering the barrier to working with Earth observation data. Centrally hosted datasets let customers focus on their core sustainability applications rather than on data wrangling.
The AWS Open Data Program is the umbrella for all of our freely available datasets, and the Amazon Sustainability Data Initiative (ASDI) manages the subset related to sustainability. ASDI also enables data-sharing partnerships aimed at sustainability challenges, such as the recently announced collaboration with Open Source-Climate, an organization that uses open data to evaluate and mitigate climate-related financial risks for investors, corporations, and communities. Through such partnerships, ASDI widens access to the data needed to build solutions to pressing sustainability problems.

Let’s run a SageMaker notebook with advanced visualizations

This sample introduces geospatial data analysis for sustainability on SageMaker Studio. We begin by exploring the Sentinel remote sensing dataset from the AWS Open Data Registry. Using Sentinel-2 data, we analyze deforestation by calculating several spectral indices. As an example, we look at the Paradise fire disaster in California; the same methods can be applied to identify areas of forest loss over time. The goal is to demonstrate the core concepts of working with Earth observation data for sustainability applications within the SageMaker environment.
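
To make the idea of a spectral index concrete, here is a minimal sketch of two indices commonly used for vegetation and burn analysis: NDVI, computed from the near-infrared and red bands, and NBR, computed from the near-infrared and shortwave-infrared bands. It is not taken from the notebook itself. The function names and the tiny reflectance arrays are made up for illustration; with Sentinel-2, the inputs would typically come from bands B08 (NIR), B04 (red), and B12 (SWIR).

```python
import numpy as np


def normalized_difference(a, b):
    """Return (a - b) / (a + b) per pixel, with NaN where a + b is zero."""
    a = np.asarray(a, dtype="float64")
    b = np.asarray(b, dtype="float64")
    total = a + b
    out = np.full(total.shape, np.nan)
    # Only divide where the denominator is non-zero; other pixels stay NaN.
    np.divide(a - b, total, out=out, where=total != 0)
    return out


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: high for healthy vegetation."""
    return normalized_difference(nir, red)


def nbr(nir, swir):
    """Normalized Burn Ratio: drops sharply over burned areas."""
    return normalized_difference(nir, swir)


# Synthetic 2x2 reflectance values (assumed, for illustration only).
# Pixel [1, 1] is a no-data pixel with zero reflectance in every band.
red_pre = np.array([[0.05, 0.05], [0.05, 0.0]])
nir_pre = np.array([[0.45, 0.45], [0.45, 0.0]])
swir_pre = np.array([[0.15, 0.15], [0.15, 0.0]])

# After the fire: pixels [0, 0] and [1, 0] burned, pixel [0, 1] unchanged.
nir_post = np.array([[0.20, 0.45], [0.20, 0.0]])
swir_post = np.array([[0.30, 0.15], [0.30, 0.0]])

ndvi_pre = ndvi(nir_pre, red_pre)
# dNBR: pre-fire NBR minus post-fire NBR; larger values suggest more severe burn.
dnbr = nbr(nir_pre, swir_pre) - nbr(nir_post, swir_post)
```

In this toy example the burned pixels end up with a dNBR of about 0.7, the unchanged pixel with 0, and the no-data pixel with NaN. Real scenes would also need cloud masking and resampling of the bands to a common resolution, which this sketch leaves out.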
Amazon S3 serves as the central, persistent store for the images. SageMaker integrates natively with S3 for training and inference workloads, abstracting away much of the heavy lifting of data transfer. This makes it convenient to use the images for ML development while keeping them in a scalable, durable storage layer.
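
As a rough illustration of pulling a single image from S3 into a notebook, the sketch below splits an `s3://` URI into its bucket and key and then downloads the object with boto3. The helper names `split_s3_uri` and `download_image` are hypothetical, and any bucket or key passed to them is a placeholder rather than the location the sample actually uses. The download step assumes boto3 is installed and that the notebook role has read access to the bucket.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into a (bucket, key) tuple."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"not a valid S3 object URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def download_image(uri, local_path):
    """Copy one object from S3 to local disk and return the local path."""
    import boto3  # imported lazily so split_s3_uri works without boto3

    bucket, key = split_s3_uri(uri)
    boto3.client("s3").download_file(bucket, key, local_path)
    return local_path
```

A call might look like `download_image("s3://example-bucket/scenes/B04.jp2", "/tmp/B04.jp2")`, where the URI is purely illustrative. For training jobs, SageMaker can handle this transfer itself when given an S3 location as an input channel, so explicit downloads like this are mostly useful for interactive exploration.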