From Satellite Data to Actionable Insights: Predicting Hay Yield for North Dakota with Planetary Variables
Authors: Thomas Frederikse & Stephanie Gijsbers
The ability to predict crop yields is essential for optimizing agricultural productivity, managing food security, and streamlining supply chain operations. Traditional forecasting methods often struggle with limited and inconsistent data, making it difficult to establish reliable long-term correlations. Satellite observations, however, provide a consistent, near-daily record of the ground conditions that influence crop growth.
Planetary Variables are built on Planet’s robust, global, high-resolution satellite constellations to provide analysis-ready insight into a multitude of crucial ground conditions, including Soil Water Content (SWC), Land Surface Temperature (LST), and Vegetation Optical Depth (VOD). These near-real-time measurements can be used directly to predict crop growth and yield. SWC is a direct measurement of the amount of water available to plants, while LST provides insight into the heat stress affecting plant growth. VOD indicates the amount of water in the vegetation and reflects plant health.
To move from high-level data to actionable insights, we’ve created an example Jupyter notebook to investigate SWC, LST, and VOD as potential hay-yield predictors in North Dakota.
In the notebook, we first examined the relationship between annual hay yield and the key variables SWC, LST, and VOD, each averaged per year (see Figures 2 and 3). The results showed a strong correlation between higher SWC and increased hay yield. For instance, during the drought years of 2017 and 2021, low SWC levels clearly coincided with significantly reduced yields. However, SWC alone did not account for all of the variation: SWC levels were similar in 2017 and 2021, yet yields were notably lower in 2021. Further analysis revealed that heat stress, captured by LST, played a significant role in 2021, when elevated LST values exacerbated the drought’s impact. Additionally, VOD trends closely mirrored hay yield, underscoring its utility as an indicator of plant health and productivity.
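The per-year averaging mentioned above can be sketched in a few lines. This is a minimal illustration, not the notebook's code: it assumes a simple calendar-year mean, and the function name `annual_means` and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def annual_means(samples):
    """Average (date, value) samples into one mean value per calendar year."""
    by_year = defaultdict(list)
    for day, value in samples:
        by_year[day.year].append(value)
    return {year: mean(values) for year, values in sorted(by_year.items())}


# Hypothetical SWC samples, for illustration only (not the notebook's data).
swc_samples = [
    (date(2020, 5, 1), 0.22),
    (date(2020, 7, 1), 0.18),
    (date(2021, 5, 1), 0.15),
    (date(2021, 7, 1), 0.11),
]

for year, value in annual_means(swc_samples).items():
    print(year, round(value, 3))
```

The same grouping could be restricted to the growing season rather than the full year, depending on which months matter most for the crop.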
We now have three variables whose patterns closely track the yield: SWC, LST, and VOD. To quantify the relationship between these variables and hay yield, we calculated correlation coefficients. The stronger the correlation between two time series, the more closely their patterns match (or, for a negative correlation, mirror each other). For our analysis, we used both Pearson and Spearman correlation coefficients: Pearson measures linear association, while Spearman measures monotonic association based on ranks.
The high positive Pearson and Spearman correlations for SWC (0.853 and 0.800, respectively) indicate a strong link between soil water content and hay yield. LST shows a negative correlation with yield, suggesting that higher heat stress is detrimental to crop growth, especially during drought periods. VOD is also correlated with yield, but the relationship is weaker than for SWC.
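For readers who want to see what the two coefficients discussed above actually compute, here is a minimal sketch using only the Python standard library. It is not the notebook's code; the helper names and the sample numbers are hypothetical and chosen purely for illustration. Spearman is implemented as the Pearson correlation of the ranks, with tied values sharing their average rank.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical annual means, for illustration only (not the notebook's data).
swc = [0.21, 0.24, 0.19, 0.26, 0.15, 0.22, 0.25, 0.23, 0.14]
hay_yield = [1.9, 2.1, 1.8, 2.3, 1.4, 2.0, 2.2, 2.0, 1.1]

print(f"Pearson:  {pearson(swc, hay_yield):.3f}")
print(f"Spearman: {spearman(swc, hay_yield):.3f}")
```

With only about a decade of annual values, a single unusual year can shift either coefficient noticeably, which is one reason to report both.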
We then went a step further and built a model that combines all three Planetary Variables to see how well they can predict yield. To build the model, we first collected and cleaned the historical data for LST, SWC, VOD, and hay yield from 2013 to 2023. We averaged the data per year and made sure it was consistent for analysis. We also adjusted the variables to account for seasonal differences to help the model learn better. We started with a linear regression model as a baseline, but also tested more complex models such as random forests and gradient boosting. The model was trained on the historical data, with a portion held out for validation. After refining the model by tuning its parameters, we used it to predict the 2024 hay yield, for which the reported yield is not yet available.
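The baseline step described above can be sketched as an ordinary least squares fit with held-out validation. This is a minimal standard-library illustration under stated assumptions, not the notebook's implementation: the inputs are assumed to be standardized annual anomalies, leave-one-out validation stands in for whatever hold-out split the notebook uses, and all function names and numbers are hypothetical.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear(features, target):
    """Ordinary least squares via the normal equations.

    Returns [intercept, coef_1, ..., coef_k].
    """
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, feature_row):
    """Apply the fitted intercept and coefficients to one row of features."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], feature_row))


def leave_one_out_rmse(features, target):
    """Hold out each year in turn, fit on the rest, score the held-out year."""
    squared_errors = []
    for i in range(len(target)):
        train_x = features[:i] + features[i + 1:]
        train_y = target[:i] + target[i + 1:]
        coefs = fit_linear(train_x, train_y)
        squared_errors.append((predict(coefs, features[i]) - target[i]) ** 2)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


# Hypothetical standardized anomalies (SWC, LST, VOD) and yields, one row per
# year. Illustration only; these are not the notebook's data.
features = [
    (0.3, -0.2, 0.4), (0.9, -0.8, 0.7), (-0.4, 0.5, -0.1), (1.3, -1.1, 1.2),
    (-1.4, 0.9, -1.0), (0.5, 0.1, 0.6), (1.0, -0.6, 0.2), (-1.6, 1.7, -1.5),
]
hay_yield = [1.9, 2.1, 1.8, 2.3, 1.4, 2.0, 2.2, 1.1]

coefs = fit_linear(features, hay_yield)
print("Leave-one-out RMSE:", round(leave_one_out_rmse(features, hay_yield), 3))
print("Prediction for a new year:", round(predict(coefs, (0.2, 0.4, 0.1)), 3))
```

Leave-one-out validation is a common choice when only around ten annual samples are available, because a fixed train/validation split would leave very few years on either side. The same validation loop can wrap a more complex model in place of `fit_linear`.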
Our analysis helps us better understand how these variables can predict yields and improve yield models, and it serves as a starting point for future testing to increase the reliability of the regression. For those interested in exploring this methodology further or experimenting with the data themselves, we invite you to check out the Jupyter notebook we created, which details the full analysis. If you are interested in testing our data, please don’t hesitate to reach out!
About Planet Labs PBC
Planet is a leading provider of global, daily satellite imagery and geospatial solutions. Planet is driven by a mission to image the world every day, and make change visible, accessible and actionable. Founded in 2010 by three NASA scientists, Planet designs, builds, and operates the largest Earth observation fleet of imaging satellites. Planet provides mission-critical data, advanced insights, and software solutions to over 1,000 customers, comprising the world’s leading agriculture, forestry, intelligence, education and finance companies and government agencies, enabling users to simply and effectively derive unique value from satellite imagery. Planet is a public benefit corporation listed on the New York Stock Exchange as PL. To learn more visit www.planet.com and follow us on X (formerly Twitter).