About the East Africa Cereal Yield Models

Gro’s East Africa Cereal Yield Models, like our other yield models, provide in-season forecasts of crop yields. We built separate models for each of the five main cereal crops in Ethiopia and Kenya for a total of 10 crop/country pairs.

To forecast yields, we use machine learning models driven by inputs that reflect both long-term trends and in-season changes. Using domain expertise, we transform high-resolution satellite data into robust district-by-district signals adapted to each crop and region. The models run daily throughout the growing seasons.

The yield model for each crop/country pair is trained using approximately 1M to 4M data points from the Gro platform, which together represent 10-40 billion points of raw data from 9 different sources, consisting of:

  • Historical annual yields for each crop from up to 3 different sources
  • Land cover area for each crop at the district level
  • 8-day NDVI
  • Daily land surface temperature
  • Daily rainfall anomalies
  • Monthly evapotranspiration anomalies
  • Crop calendars

These yield models take into account:

  • East Africa’s unique crop cycles, with most cereals having two growing seasons and harvests per year
  • The mix of cereal crops grown (model input signals are weighted to reflect differences in crops)
  • Long-term trends and intraseasonal variation, with both timescales modeled
  • Both the amplitude of each signal and its timing within the seasonal pattern (the same input signal affects yield differently when it occurs early rather than late in the season)

During model training, we select the relevant time series, which together contain millions of data points. The selected series are then combined with each other to form input signals: for example, daily rainfall is combined with corn land cover area at the district level to form a “corn area”-weighted rainfall input signal.
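As a rough illustration of the weighting step, the sketch below computes a crop-area-weighted rainfall value for one district on one day. It assumes the district is divided into grid cells, each with a rainfall value and a corn area, and that the combination is a simple weighted mean; the function name, the cell layout, and the sample numbers are hypothetical and are not taken from the Gro models.

```python
from typing import Sequence


def area_weighted_signal(values: Sequence[float], crop_area: Sequence[float]) -> float:
    """Weighted mean of per-cell values, using the crop area in each cell as the weight.

    Assumption: a weighted mean is one plausible way to combine a weather
    series with land cover; the actual combination may differ.
    """
    if len(values) != len(crop_area):
        raise ValueError("values and crop_area must have the same length")
    total_area = sum(crop_area)
    if total_area == 0:
        raise ValueError("district has no crop area; the weighted signal is undefined")
    return sum(v * a for v, a in zip(values, crop_area)) / total_area


# Hypothetical district with three grid cells (illustrative numbers only).
rainfall_mm = [10.0, 0.0, 4.0]       # daily rainfall per cell
corn_area_ha = [100.0, 300.0, 0.0]   # corn land cover per cell

corn_weighted_rain = area_weighted_signal(rainfall_mm, corn_area_ha)
plain_mean_rain = sum(rainfall_mm) / len(rainfall_mm)
```

In this example the weighted value is lower than the plain average, because most of the corn sits in the cell that received no rain. That is the kind of crop-specific difference the weighting is meant to capture.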

Each signal is then combined with regional crop calendar information and a mathematical transformation chosen for that signal to produce a more specific feature. For example, “the cumulative rainfall for the part of the 1st growing season that has elapsed to date” is a feature, with a value for every district, on every day of every year. Another example: “the NDVI peak of the part of the 2nd growing season that has elapsed to date occurred on the 20th day” could be a feature for a given district on a given day in history.
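The two example features can be sketched as follows for a single district. This is a minimal illustration, assuming each signal is a mapping from date to value and that the crop calendar supplies a start and end date for the season; the helper names, season dates, and sample values are hypothetical.

```python
from datetime import date
from typing import Dict, List, Optional, Tuple

Series = Dict[date, float]


def elapsed_season(series: Series, season_start: date, season_end: date,
                   today: date) -> List[Tuple[date, float]]:
    """Observations that fall inside the season and on or before `today`, in date order."""
    last_day = min(today, season_end)
    return sorted((d, v) for d, v in series.items() if season_start <= d <= last_day)


def cumulative_to_date(series: Series, season_start: date, season_end: date,
                       today: date) -> float:
    """Sum of the signal over the elapsed part of the season."""
    return sum(v for _, v in elapsed_season(series, season_start, season_end, today))


def peak_day_to_date(series: Series, season_start: date, season_end: date,
                     today: date) -> Optional[int]:
    """1-based day of season on which the signal peaked so far, or None if no data yet."""
    window = elapsed_season(series, season_start, season_end, today)
    if not window:
        return None
    peak_date, _ = max(window, key=lambda pair: pair[1])  # earliest date wins ties
    return (peak_date - season_start).days + 1


# Hypothetical crop calendar entry and observations (illustrative values only).
season_start, season_end = date(2020, 3, 1), date(2020, 7, 31)
today = date(2020, 3, 25)

rainfall = {date(2020, 2, 28): 5.0, date(2020, 3, 1): 2.0,
            date(2020, 3, 10): 3.5, date(2020, 3, 30): 9.0}
ndvi = {date(2020, 3, 4): 0.30, date(2020, 3, 12): 0.42,
        date(2020, 3, 20): 0.55, date(2020, 3, 28): 0.61}

rain_to_date = cumulative_to_date(rainfall, season_start, season_end, today)
ndvi_peak_day = peak_day_to_date(ndvi, season_start, season_end, today)
```

Observations before the season start or after `today` are ignored, so recomputing the same feature on a later day can give a different value. This is why each feature has a value for every district on every day.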

The training phase for each model draws on 1M to 4M data points from Gro, depending on the country and crop. Once trained, a model makes a prediction by computing the current value of every feature it needs.

We have made these daily forecasts and commentary publicly available on our web app. For more technical information, you can download our yield model research paper here or contact us at intel@gro-intelligence.

The code for the Ethiopia model is accessible here, and the code for the Kenya model is accessible here.

You can see the result of the East Africa Yield Models in the Gro Platform here.