PlanetScope image of Iguazú National Park, Brazil. © 2024 Planet Labs PBC. All Rights Reserved.

AI-enabled Insights from Publications Leveraging Planet Data


Planet AI Symposium

Curious to learn more about how the combination of AI and Planet’s satellite data is helping to build a more sustainable and resilient world? Join us on Thursday, January 30th, for the Planet AI Symposium, where industry experts, Planet leaders, and partners explore how AI is advancing Earth observation, enabling data-driven decision-making, and delivering innovative solutions to address peace and security, sustainability and regulation, and digitalization.


By: Vinicius Perin, Seamus Lombardo, and Ash Hoover

Planet’s Education and Research (E&R) Program works to turbocharge advances in knowledge and science by providing university-affiliated students, faculty, and researchers access to Planet data. Opportunities in the Planet E&R Program range from our Basic E&R package to campus-wide data licenses. The E&R Program also directly supports researchers through opportunities such as the Taylor Geospatial Institute (TGI) Planet Fellowship program, which seeks to improve the research of TGI students through the use of Planet datasets, address grand societal challenges, and develop the next generation of scientific leaders.

Driven in large part by the E&R Program, thousands of research publications have been produced using Planet data. Since 2016, over 3,600 scientific publications using Planet data have been published, showcasing Planet data’s widespread impact across a variety of disciplines in both industry and academia. 

These publications underscore the utility and influence of Planet data and offer insights into how, when, and where Planet data is being applied. In addition, the novel research developed in these publications opens new frontiers of exploration and enables research previously out of reach. 

The constant stream of new research makes digesting these insights a significant challenge: last year alone, an average of two new manuscripts were published every day. To help synthesize the vast amount of published research, we experimented with a large language model (LLM) to explore whether cutting-edge AI technologies can assist in extracting insights on Planet data usage and applications from our growing publication database.

We focused on identifying the contexts (e.g., agriculture, forestry, environmental monitoring) in which the data is applied, the most frequently used data products, and the geographies of both the study authors and their study regions. As researchers drive new discoveries with Planet data and share them through scientific publications, we expect AI tools to accelerate insights and enhance collaboration across disciplines.

Extracting insights from publications

We tested the potential of one leading AI model to summarize and extract insights from scientific manuscripts that use Planet imagery. For this analysis, we downloaded 96 open-access manuscripts and submitted a set of specific prompts for each one through the LLM API. Every answer was manually validated against its manuscript to check its accuracy and was classified as correct, wrong, or not available. After validation, we summarized the results in the graphs below.

Example prompts:

What is the title of the manuscript?

What is the 1st author affiliation?

Where is the study region?

Classify this manuscript into one of the following categories: Agriculture, Urban Planning, Forestry, Environmental Monitoring, Climate, Conservation, Ecology, Energy, Land Use, Mining, Water. If the manuscript is a mix of categories, choose the most relevant one. If the manuscript does not fit any of the categories, choose ‘Other’.

How did Planet’s data help the authors in this manuscript?
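The workflow described above, asking the same prompts about every manuscript and storing the answers for later manual validation, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used for the analysis: `ask_model` is a hypothetical placeholder for whatever LLM API call was used (the article does not name it), and the short prompt keys are invented labels.

```python
from typing import Callable, Dict, List

# Prompts copied from the examples above; the dictionary keys are hypothetical labels.
PROMPTS: Dict[str, str] = {
    "title": "What is the title of the manuscript?",
    "affiliation": "What is the 1st author affiliation?",
    "study_region": "Where is the study region?",
    "planet_use": "How did Planet’s data help the authors in this manuscript?",
}


def extract_answers(
    manuscripts: Dict[str, str],
    ask_model: Callable[[str, str], str],
) -> List[dict]:
    """Ask every prompt about every manuscript and return one row per answer.

    `manuscripts` maps a manuscript id to its full text.
    `ask_model(text, prompt)` is a stand-in for the real LLM call (an assumption).
    """
    rows = []
    for doc_id, text in manuscripts.items():
        for key, prompt in PROMPTS.items():
            rows.append({
                "manuscript": doc_id,
                "question": key,
                "answer": ask_model(text, prompt),
                "validation": None,  # to be filled in by a human reviewer
            })
    return rows
```

Keeping one row per manuscript and question leaves an empty `validation` field for the reviewer to fill in, which suits the manual check described in this section.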

Validation confirms over 90% accuracy

The model correctly extracted answers for most of the questions and manuscripts (Fig. A), with 90% accuracy for most of the manuscripts. Shown in orange in Fig. A, the highest error rate (4.2%) occurred when identifying the type of product used (e.g., PlanetScope Imagery, Basemaps, Fusion). This challenge arises because authors may refer to Planet Basemaps simply as “Planet Imagery” or use different terms for PlanetScope Imagery, such as “Dove Imagery,” “CubeSats Imagery,” or “SuperDove Imagery.” Although all these variations are correct, the model struggled to recognize these nuances.

The gray portions of the plot show that, even when a human read the manuscripts manually, some publications did not provide the information needed to answer a given prompt. For example, nearly a third of the papers (32.3%) did not specify how they accessed Planet data.
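To make the percentages above concrete, here is a small sketch of how validation labels for a single question could be tallied into shares. The article does not publish raw counts, so the numbers in the example are assumptions: 31 "not available" answers out of 96 manuscripts happens to reproduce the 32.3% quoted above, and 4 wrong answers out of 96 would give 4.2%, but the remaining split is invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable

LABELS = ("correct", "wrong", "not available")


def validation_shares(labels: Iterable[str]) -> Dict[str, float]:
    """Return the percentage of each validation label, rounded to one decimal."""
    counts = Counter(labels)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labels to summarize")
    return {label: round(100 * counts[label] / total, 1) for label in LABELS}


# Illustrative counts only (assumed, not taken from the article's data).
example = ["not available"] * 31 + ["correct"] * 63 + ["wrong"] * 2
shares = validation_shares(example)
print(shares["not available"])  # 32.3
```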

Figure A–Validation results for each question (see the example prompts) across the manuscripts. The answers were classified as wrong, correct, or not available. Area-of-Interest (AOI*).

In what contexts was Planet imagery applied, and what were the most frequently used spectral bands? 

More than 60% of the manuscripts focused on topics related to agriculture, environmental monitoring, and forestry (Fig. B, pie chart). In agriculture, the research ranged from yield prediction (Tunca et al., 2023; Farmonov et al., 2022) and biomass estimation (Giora et al., 2022; Gargiulo et al., 2020) to monitoring crop phenology (Nieto et al., 2022) and assessing agricultural practices (Luo et al., 2023; Navarro et al., 2023). Environmental monitoring studies included estimating carbon emissions (Dadap et al., 2021) and monitoring green tide in the Yellow Sea (Shang et al., 2023). Forestry-focused manuscripts ranged from assessing tree species diversity (Njomaba et al., 2024) and canopy-scale tree mortality (Dixon et al., 2023) to evaluating land use changes following deforestation (Masolele et al., 2023).

The most commonly used spectral band combinations were RGBN (Red, Green, Blue, and NIR) and RGB (Red, Green, and Blue) (Fig. B, bar chart). RGBN was primarily used when pixel radiometric values were essential to the analysis, such as in sensor calibration tasks (Ichikawa et al., 2022) and time series analysis of spectral indices (e.g., NDVI) (Gargiulo et al., 2023; Masolele et al., 2023; Keay et al., 2023). Meanwhile, authors who relied on RGB bands primarily targeted computer vision and object detection/identification tasks (e.g., Francini et al., 2020; Oner et al., 2020; Wang et al., 2022). Interestingly, most studies using 8-band data did not leverage the additional bands (coastal blue, green-1, yellow, and red-edge) for specific purposes. Instead, these bands served as additional explanatory variables in classification, prediction, and machine learning tasks. For instance, they were used to help estimate soil salinity (Tan et al., 2023), predict soybean yield (Amankulova et al., 2023), map allergenic tree species in urban areas (Gašparović et al., 2023), and assess grapevine water status (Wei et al., 2023).
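As a reminder of why the NIR band matters for index-based time series, NDVI is defined as (NIR - Red) / (NIR + Red). The sketch below computes it for a short series of per-date reflectance values. The function names and sample values are hypothetical and are not drawn from any of the cited studies; a real workflow would operate on image arrays rather than plain lists.

```python
from typing import List, Optional


def ndvi(red: float, nir: float) -> Optional[float]:
    """NDVI = (NIR - Red) / (NIR + Red); returns None if the denominator is zero."""
    denom = nir + red
    if denom == 0:
        return None
    return (nir - red) / denom


def ndvi_series(red: List[float], nir: List[float]) -> List[Optional[float]]:
    """Compute NDVI for paired red and NIR reflectance values, one pair per date."""
    if len(red) != len(nir):
        raise ValueError("red and nir must have the same length")
    return [ndvi(r, n) for r, n in zip(red, nir)]


# Hypothetical surface reflectance values for three acquisition dates.
red_values = [0.10, 0.08, 0.06]
nir_values = [0.30, 0.40, 0.50]
series = ndvi_series(red_values, nir_values)
```

With these sample values, NDVI rises across the three dates as NIR reflectance increases and red reflectance decreases, which is the pattern such indices are typically used to track during crop growth.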

Figure B–The bar chart displays how often different spectral bands and their combinations were used in the 96 manuscripts. Note that some manuscripts employed multiple datasets and different band combinations (e.g., RGBN and Panchromatic), resulting in a total count that exceeds 96. The pie chart illustrates the distribution and percentages of the various contexts in which Planet imagery was applied. Other includes a single band, or a combination of two bands only (Green and NIR).

Where are the authors and their study regions?

Figure C highlights the countries of first authors’ affiliations and their respective study regions. The gray lines connect each author’s country to the study region; a single study may have more than one study region. Most studies have first authors affiliated with institutions in the US, which is the largest bar in Fig. C. The oval ring shape indicates that the author’s affiliation country is the same as the study region. For example, nearly half of the studies with first authors from the US have study regions within the US. Conversely, only a few E&R studies from Brazil, the Netherlands, Switzerland, and Germany focus on regions in the US.

Interestingly, studies authored by researchers in Global North countries often focus on regions in the Global South (see the gray lines for Germany, Denmark, the Netherlands, and Switzerland). In contrast, with a few exceptions, studies authored by researchers from Global South countries primarily focus on their own regions (see the gray lines from Brazil, Colombia, and Indonesia). The global map complementing Fig. C is shown in Fig. D.
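The links summarized in Fig. C boil down to counting pairs of first-author country and study-region country. The sketch below shows one way such counts could be assembled and how the share of links that stay within the author's own country (the oval rings) could be derived. The function names and sample records are invented for illustration and do not reproduce the article's data.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def country_flows(studies: Iterable[Tuple[str, List[str]]]) -> Counter:
    """Count (author_country, study_country) links; one study can add several links."""
    flows: Counter = Counter()
    for author_country, study_countries in studies:
        for region in set(study_countries):  # count each region once per study
            flows[(author_country, region)] += 1
    return flows


def domestic_share(flows: Counter, country: str) -> float:
    """Fraction of a country's links whose study region is that same country."""
    total = sum(n for (src, _), n in flows.items() if src == country)
    if total == 0:
        return 0.0
    return flows[(country, country)] / total


# Hypothetical records: (first-author country, list of study-region countries).
sample = [
    ("US", ["US"]),
    ("US", ["Brazil"]),
    ("Germany", ["Kenya", "Kenya"]),
    ("Brazil", ["Brazil"]),
]
flows = country_flows(sample)
```

Note that `domestic_share` is computed over links rather than studies, so a study with several regions contributes more than once.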

Figure C–A Sankey diagram highlighting the first author affiliation country and their study regions. The height of the bars represents the number of different studies within each specific country. The gray lines connect the author affiliations’ countries to their study regions. When the gray lines form an oval ring, it indicates that the author’s country is the same as the study region.
Figure D–The global distribution of the first author affiliation location and their study regions.

Stepping forward into the AI era

Authors used Planet data primarily because of its temporal and spatial resolution, which enables precise and timely analyses across a range of applications, including yield prediction, crop health assessment, individual tree loss detection, flood dynamics monitoring, and water resource management. Our experiment with a large language model has shown great promise in its ability to synthesize a considerable number of scientific publications. This approach has proven particularly useful in identifying key trends, such as the diverse applications of Planet data, the most commonly used products and spectral bands, and the geographic and institutional origins of the research.

The E&R Program plays a crucial role in expanding access to Planet data, enabling a broader range of academic research and fostering innovation across diverse fields. As the volume of publications continues to grow, leveraging cutting-edge AI can significantly enhance our ability to extract and interpret valuable insights from this expanding body of work, providing inspiration for future research and significant impact.