Using AI to map a better future
How can we make thoughtful land-planning decisions if we don’t have a clear idea of how land is currently being used and how that use is changing around the world? Access to near real-time land use and land cover (LULC) maps is critical for a sustainable future, which is why Washington, D.C.–based technology startup Impact Observatory is working to transform automated LULC map making with AI-driven algorithms and datasets.
Decision makers ranging from government leaders to industry and finance executives use LULC maps to understand their impacts on the world. Under increasing pressure from environmentally conscious investors and regulators, responsible companies are planning to mitigate their climate change impacts, demonstrate how they’re contributing to biodiversity preservation, and help governments and local communities achieve sustainable livelihoods.
Traditional LULC maps require significant human input, which takes time and involves processing large amounts of data, and they’re often out of date by the time they’re published. For example, the US national map (NLCD) is based on satellite images of the Earth that are two to three years old by the time the map is released. Most countries don’t have access to maps of that kind, and current global maps produced by science organizations such as NASA and ESA are too low in resolution for many planning decisions.
As land use changes accelerate dramatically because of human causes and human-accelerated natural processes, updating land maps is critical for data-driven decision making. The US Forest Service, for example, typically catalogs forest inventory every five years, leaving fire marshals struggling to keep up with new land development that isn’t on the maps they use. This also adds time and unnecessary risk to their work. As tackling climate change becomes ever more urgent, governments, organizations, and individuals alike can’t afford to miss valuable insights.
Impact Observatory is at the forefront of the movement for change. Impact Observatory’s core team met at the National Geographic Society, where they created a first-of-its-kind AI training dataset that was composed of billions of hand-labeled pixels of Earth observation satellite imagery. “One of the biggest things that’s been missing in the field has been this type of training dataset, which is designed for environmental monitoring and analysis at a global scale,” says Steven Brumby, Chief Executive Officer of Impact Observatory.
Accurate and timely land use maps make it easier for governments, NGOs, industry and finance executives, and a wide variety of other civil society groups to make better-informed decisions and show how they can contribute to solutions.
Generating high value with high-performance computing
To be distinct from existing LULC maps, Impact Observatory’s model needed to be more accurate and to deliver results in near real time. The company also wanted to produce more granular maps at a resolution of 10 meters instead of the industry-standard 100 or 250 meters. To accomplish this goal and meet its technology objectives, Impact Observatory turned to Microsoft Azure and Microsoft AI for Earth, with which it shares a passion for combining technology and sustainability.
Impact Observatory decided to take advantage of Azure HPC + AI to extend the revolutionary Esri 2020 Land Cover map into a time series of annual maps that reveal important changes, bringing its large-scale dataset, raw pixels, and technology together to create an accessible visual resource. Beginning in January 2022, annual LULC maps for 2017 through 2020 will all be available on the Microsoft Planetary Computer. The company selected Azure to access a complete set of computing, networking, and storage resources that are optimized for high-volume processing, intelligent decision making, and overall scalability with machine learning.
“This first-of-its-kind global map couldn’t have been created in such a short time without Azure HPC + AI and cloud computing,” says Caitlin Kontgis, Head of Science at Impact Observatory. “And it will continue to improve as more novel datasets are included from different sensors, commercial satellites, and public satellites.”
Using Intel-based Azure VMs to maximize processing power and efficiency
In preparation for deploying its mapping model globally, Impact Observatory ran experiments up front to determine which Azure virtual machines (VMs) would provide the right balance of CPU, RAM, and network bandwidth while boosting efficiency. The company trained its models on NC-series VMs powered by GPUs and then processed the data through Azure Batch on Azure D4v2 VMs powered by Intel Xeon Scalable processors.
“Out of all the testing, we found that the Azure D4v2 VMs with Intel processors gave us all the resources we needed, including CPU and memory, to complete our map processing tasks efficiently and at reasonable costs,” says Zoe Statman-Weil, Data Engineer at Impact Observatory.
Wanting to seamlessly deploy its model at scale, Impact Observatory bundled its code in Docker containers and ran it across thousands of low-priority Azure Spot Virtual Machines to decrease costs and take advantage of low-demand times. “By running the model on satellite data spanning the entire year, we’re able to produce the annual map in the most reliable way possible,” says Mark Mathis, Head of Engineering at Impact Observatory. “Our testing showed that low-priority nodes allow us to use a lot of them, and using CPUs for deployment makes those more likely to be available in larger numbers when we actually run.”
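Low-priority capacity can be reclaimed while a task is running, so a workload of this kind generally needs tasks that can be retried safely. The article does not describe how Impact Observatory handled preempted work; the following is only a minimal sketch of the general pattern, with preemption simulated by a seeded random draw. The function name, parameters, and tile ids are all hypothetical, and no Azure Batch API is used.

```python
import random
from collections import deque


def run_on_preemptible_nodes(tiles, preempt_prob=0.2, max_attempts=50, seed=0):
    """Process every tile, retrying any whose (simulated) node is reclaimed.

    All names and parameters here are hypothetical; preemption is simulated
    with a seeded random draw rather than a real cloud API.
    """
    rng = random.Random(seed)
    queue = deque((tile, 1) for tile in tiles)  # (tile id, attempt number)
    attempts_used = {}

    while queue:
        tile, attempt = queue.popleft()
        if attempt > max_attempts:
            raise RuntimeError(f"tile {tile!r} exceeded {max_attempts} attempts")
        if rng.random() < preempt_prob:
            # Node was reclaimed mid-task: discard partial work and requeue.
            queue.append((tile, attempt + 1))
            continue
        # Task finished; its output would be written once, keyed by tile id,
        # so a retried task overwrites rather than duplicates results.
        attempts_used[tile] = attempt

    return attempts_used


if __name__ == "__main__":
    result = run_on_preemptible_nodes(range(1000))
    retried = sum(1 for n in result.values() if n > 1)
    print(f"{len(result)} tiles done, {retried} needed at least one retry")
```

The design point is that each unit of work is small and independent, so losing a node costs only one task's worth of progress rather than the whole run.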
On average, the company processes 21 images for every location on the planet—amounting to more than 450,000 Sentinel-2 scenes and 500 terabytes of satellite data collected in a year—with everything compressed into a final map that’s about 60 gigabytes. This end product requires 1,000 D4v2 VMs running in parallel and more than 1 million core hours, but the company now manages this entire process with ease and without disruption.
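As a rough sanity check, the figures quoted above can be combined to estimate how long such a run would take and how much the data shrinks. The sketch below assumes 8 vCPUs per D4v2 VM, which the article does not state, and treats "more than 1 million core hours" as exactly 1 million, so the result is an idealized lower bound rather than a measured runtime.

```python
# Back-of-the-envelope check of the figures quoted in the article.
# ASSUMPTION: 8 vCPUs per D4v2 VM (not stated in the article).
CORES_PER_VM = 8

VMS = 1_000             # D4v2 VMs running in parallel
CORE_HOURS = 1_000_000  # "more than 1 million core hours" (lower bound)
SCENES = 450_000        # Sentinel-2 scenes per year
INPUT_TB = 500          # satellite data processed per year
OUTPUT_GB = 60          # approximate size of the final map


def wall_clock_days(core_hours: float, vms: int, cores_per_vm: int) -> float:
    """Ideal wall-clock time in days if every core stays busy."""
    return core_hours / (vms * cores_per_vm) / 24


def summarize() -> dict:
    """Derive run time, average scene size, and input-to-output ratio."""
    input_gb = INPUT_TB * 1000  # decimal terabytes to gigabytes
    return {
        "wall_clock_days": wall_clock_days(CORE_HOURS, VMS, CORES_PER_VM),
        "gb_per_scene": input_gb / SCENES,
        "reduction_factor": input_gb / OUTPUT_GB,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the run works out to a little over five days of wall-clock time, which is in line with the article's statement that a global map can be made in less than a week.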
To support future growth and further streamline deployment, Impact Observatory selected Azure Blob Storage to serve as a highly secure and almost infinitely scalable distributed file system. “Thanks to Batch, Blob Storage, and Azure HPC + AI, we now have a scalable and efficient model that we’re confident in and have deployed all over the globe,” says Statman-Weil.
Mapping the world at unprecedented scale in less than a week
Considering all that Impact Observatory accomplished with the Esri 2020 Land Cover map and the amount of data involved, it seems almost unbelievable that it took only a few months to get everything production-ready so that a map of the Earth could be made in less than a week. The company now routinely makes maps faster than satellites can collect images, which is a testament to the processing speed and scale of what is now the highest-resolution global map ever publicly released.
The company credits much of its map’s technological prowess to Azure, including Batch and Azure HPC + AI, and to the open science data hosted on the Planetary Computer. “I think if we weren’t running on Azure HPC + AI and getting that level of cloud computing power, this just wouldn’t have been possible,” says Mathis.
The biggest value of quickly processing a wealth of data is the timely insights that the map can provide. Impact Observatory will release a full 2021 LULC map in January 2022, avoiding the multiple-year delays between data collection and production that manual input and intervention previously caused.
“Especially in rapidly changing areas where urbanization or deforestation is happening, you’re not currently capturing any of that change until it has already occurred and increased in speed,” says Kontgis. “Thanks to Azure HPC + AI, we’re now able to have those insights, which opens up a whole new world in mapping and looking at land cover.”
For some on the team, the biggest joy of the refreshed LULC map rollout was sharing the benefits with fellow scientists and geospatial industry leaders. “I followed along on Twitter for the day the 2020 map was released, and the amount of engagement we got and the excitement around it was really mind-blowing to me,” says Statman-Weil. “It made me grateful that folks were really going to use our product and incorporate it into their company or lab’s work.”
In addition to enjoying improved flexibility and cost savings, Impact Observatory team members now spend less time on manual tasks and can dedicate resources and capacity to strategic tasks, including planning for what’s next. “We’ve reached a point where a small startup in its first year is using all of this incredible technology to produce a global map in record time,” says Brumby. “There’s going to be a lot more relevant data available to decision makers in the next few years that will hopefully help guide the systemic change in thinking and planning that’s needed to help our planet.”