Want to Build a Better Yield Model? Here’s Your First Step
26 April 2019

Accurately forecasting crop yields has broad implications for economic trading, food-production monitoring, and global food security. But creating predictive yield models for many country-crop pairs when reliable acreage data doesn’t exist presents a unique challenge—figuring out what crops are growing where.

Gro Intelligence has built a number of yield models for crops in countries that lack good ground-based data, including Argentina, India, and Ukraine. A critical early step for such models is developing a crop mask to delineate where crops are growing and exactly what the crops are. Crop yield models are valuable because they take a complex set of variables, including acreage, weather patterns, and vegetative health, and synthesize them into an actionable piece of information for a variety of actors across the global supply chain.

These satellite images from USGS Earth Explorer show bare soil before planting, followed by greening once a crop starts to grow. Crop masking captures similar satellite-derived data, but excludes signals from extraneous plants, such as those areas that appear green in the top image before crop planting takes place.

By compiling satellite images of a given area, it’s possible to determine where a specific crop is growing, and just as importantly, where it is not growing. To adequately assess the health of a crop using satellite-derived data, it is necessary to exclude signals that stem from extraneous plants like trees and grass to focus solely on the information gathered from the crop of interest. In areas where reliable acreage data is readily available, this is a relatively painless process. Such is the case in the US where quality products like the USDA Cropland Data Layer (CDL) are robust and easily accessible.
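To make the idea of excluding extraneous signals concrete, here is a minimal sketch that averages a vegetation index over crop pixels only. It uses NDVI, computed with the standard formula (NIR - Red) / (NIR + Red). The function names, the 2x2 scene, and the reflectance values are made up for illustration and are not Gro's code or data.

```python
import numpy as np

def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); NaN where the denominator is zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    out = np.full(total.shape, np.nan)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out

def crop_only_mean(values, crop_mask):
    """Mean of `values` over pixels flagged True in `crop_mask`, ignoring NaNs."""
    selected = np.asarray(values, dtype=float)[np.asarray(crop_mask, dtype=bool)]
    selected = selected[~np.isnan(selected)]
    return float(selected.mean()) if selected.size else float("nan")

# Hypothetical 2x2 scene; reflectances and mask are illustrative only.
nir = np.array([[0.6, 0.8], [0.3, 0.7]])
red = np.array([[0.2, 0.2], [0.3, 0.1]])
# Left column is the crop of interest; right column stands in for trees or grass.
crop_mask = np.array([[True, False], [True, False]])

scene_ndvi = ndvi(nir, red)
crop_mean = crop_only_mean(scene_ndvi, crop_mask)
scene_mean = float(np.nanmean(scene_ndvi))
print(crop_mean, scene_mean)
```

In this toy scene the unmasked average is pulled upward by the greener non-crop pixels, which is the kind of distortion a crop mask is meant to remove.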

However, identifying and differentiating crop acreage in other parts of the world is more arduous. Gro takes a two-step approach to define cropland boundaries in the absence of reliable data. The first step is identifying the vegetation that follows the typical growth cycle of the targeted crop for a given region. For example, if we know that planted wheat begins tillering in a given region over a period of weeks, we can begin to define acreage by looking at greening sections of cropland in satellite images taken for that specific region. The second step distinguishes the target crop from any other vegetation that shares a similar life cycle.

Gro applied this methodology in its recently released Ukraine winter wheat yield model. First, Gro consulted a general winter wheat crop calendar to determine when to expect wheat growth. To identify acreage, areas that began greening in the satellite imagery during the expected growth phase of winter wheat were compared with areas that were clearly bare soil during the expected planting period. By identifying where the two overlap, Gro was able to narrow the regional focus to specific parcels of cropland.
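The overlap test described above can be sketched as a pair of per-pixel conditions on an NDVI time series: bare soil during the planting window and greening during the growth window. This is only an illustration under stated assumptions. The function name, the two NDVI thresholds, and the toy time series are hypothetical, and the article does not say what thresholds or indices Gro actually used.

```python
import numpy as np

def candidate_crop_mask(ndvi_stack, planting_idx, growth_idx,
                        soil_max=0.25, green_min=0.5):
    """Flag pixels that are bare at planting and green during growth.

    ndvi_stack: array of shape (time, rows, cols).
    planting_idx / growth_idx: time indices taken from a crop calendar.
    soil_max / green_min: hypothetical NDVI thresholds, not published values.
    """
    stack = np.asarray(ndvi_stack, dtype=float)
    # Bare soil: NDVI stays below soil_max throughout the planting window.
    bare_at_planting = stack[planting_idx].max(axis=0) < soil_max
    # Greening: NDVI exceeds green_min at some point in the growth window.
    green_in_growth = stack[growth_idx].max(axis=0) > green_min
    return bare_at_planting & green_in_growth

# Toy series with 4 time steps and 3 pixels in one row (illustrative values):
#   pixel 0: bare then green (candidate crop)
#   pixel 1: green all year (e.g. forest)
#   pixel 2: bare all year
ndvi_stack = np.array([
    [[0.10, 0.70, 0.10]],
    [[0.15, 0.70, 0.10]],
    [[0.40, 0.75, 0.12]],
    [[0.70, 0.80, 0.15]],
])
mask = candidate_crop_mask(ndvi_stack, planting_idx=[0, 1], growth_idx=[2, 3])
print(mask)
```

Only the first pixel passes both conditions. Pixels like it would still need the second step, since other crops can share the same calendar.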

From there, it’s necessary to distinguish winter wheat from other vegetation with a similar growth cycle. If ground-truthed data is not available to determine what crops are in a specific location, then areas elsewhere that do have reliable data are used as an analog. Using CDL data in the US, along with published journal articles, Gro isolated specific characteristics of winter wheat in satellite imagery, drawing on visible, near-infrared, and shortwave-infrared information and, when available, radar (specifically Synthetic Aperture Radar, or SAR). This approach was applied to extract crop masks for winter wheat in Ukraine from 2001 to 2018.

Once individual masks were completed for each year, they were used to create confidence masks for the yield models. A high-confidence mask covers areas that were planted with winter wheat in every year of the archive. A separate, moderate-confidence mask covers any areas that were planted with winter wheat in at least one of those years. These confidence masks are then used in the yield models to exclude regions that do not apply when calculating vegetative health and environmental data such as normalized difference vegetation index (NDVI), land surface temperature, and rainfall.
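The two confidence masks amount to a logical "all years" and a logical "any year" across the stack of yearly crop masks. The sketch below shows that combination. The function name and the three toy yearly masks are hypothetical stand-ins for the 2001 to 2018 archive.

```python
import numpy as np

def confidence_masks(yearly_masks):
    """Combine yearly crop masks into high- and moderate-confidence masks.

    yearly_masks: array-like of shape (years, rows, cols), True where the
    target crop was detected in that year.
    """
    stack = np.asarray(yearly_masks, dtype=bool)
    high = stack.all(axis=0)      # detected in every year of the archive
    moderate = stack.any(axis=0)  # detected in at least one year
    return high, moderate

# Three hypothetical yearly masks over a 2x2 area.
yearly_masks = [
    [[1, 1], [0, 0]],
    [[1, 0], [1, 0]],
    [[1, 1], [0, 0]],
]
high, moderate = confidence_masks(yearly_masks)
print(high)
print(moderate)
```

By construction every high-confidence pixel is also in the moderate-confidence mask, so the choice between them trades coverage for certainty.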

For a more detailed discussion of crop masking, see this publication by one of Gro's scientists describing the methodology.

Gro’s yield model for Ukraine wheat is currently generating estimates for this year’s crop. Provinces that appear in deeper shades of red are expected to have the highest yields.
