Because of the increasing importance of India’s agricultural sector, Gro Intelligence decided in 2017 to build a machine-learning-based yield model for the country’s winter wheat crop. The model follows the process established in our successful US corn and Argentina soybean yield models. It began generating forecasts in January 2019 for the crop to be harvested the following March and April.

A challenge in building the India wheat model was the lack of good, objective acreage data showing where wheat is grown. This information is needed to construct a crop distribution map, or mask, which lets us use satellite data to estimate the crop’s health and size. We faced a similar problem in creating our earlier Argentina soybean model and solved it by estimating the crop’s distribution from the timing of its planting and growth. With that timing known, dated satellite greenness data indicated which crop was planted where.
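The timing idea can be illustrated with a minimal sketch. This is not Gro's implementation: the peak-month window, the amplitude threshold, the function names, and the pixel values below are all assumptions chosen for illustration. A pixel is flagged as a likely winter crop when its greenness peaks inside an assumed late-winter window and rises enough over the year to look like a growing crop rather than bare ground.

```python
from typing import Dict

# Assumed values for illustration only; not taken from the Gro model.
PEAK_MONTHS = {1, 2, 3}   # months (1-12) in which a winter crop is assumed to peak
MIN_AMPLITUDE = 0.3       # assumed minimum NDVI swing between trough and peak


def is_winter_crop_pixel(monthly_ndvi: Dict[int, float]) -> bool:
    """Return True if greenness peaks in the assumed window with enough swing."""
    peak_month = max(monthly_ndvi, key=monthly_ndvi.get)
    amplitude = max(monthly_ndvi.values()) - min(monthly_ndvi.values())
    return peak_month in PEAK_MONTHS and amplitude >= MIN_AMPLITUDE


def build_mask(pixels: Dict[str, Dict[int, float]]) -> Dict[str, bool]:
    """Map each pixel id to True (likely winter crop) or False."""
    return {pid: is_winter_crop_pixel(series) for pid, series in pixels.items()}


# Hypothetical monthly NDVI series for three pixels.
pixels = {
    # Greens up over winter and peaks in February.
    "a": {11: 0.20, 12: 0.35, 1: 0.55, 2: 0.70, 3: 0.60, 4: 0.25, 8: 0.30},
    # Peaks in August, consistent with a monsoon-season crop instead.
    "b": {11: 0.25, 12: 0.20, 1: 0.20, 2: 0.22, 3: 0.20, 4: 0.25, 8: 0.75},
    # Nearly flat all year: too little swing to be a crop.
    "c": {11: 0.15, 12: 0.16, 1: 0.18, 2: 0.20, 3: 0.17, 4: 0.15, 8: 0.16},
}

mask = build_mask(pixels)
```

In this toy data only pixel "a" passes both checks. A production mask would work on full raster time series and would need thresholds calibrated against whatever ground truth is available.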

The next step was to test various environmental metrics at the district, or county, level to see which ones best “predicted” the known final crop yields when we backtested previous crop years. Normalized difference vegetation index (NDVI) and evapotranspiration (ET) anomalies provided the best insights, so those became the main predictors in the final model. The model also includes soil type data and soil moisture data from multiple depths.
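As a rough sketch of this screening step, the snippet below computes anomalies as deviations from a multi-year mean and scores a candidate metric by its Pearson correlation with known yields. Both choices are assumptions made for illustration; the article does not specify how Gro defines its anomalies or ranks predictors, and the district values are invented.

```python
from math import sqrt
from statistics import mean
from typing import List


def anomalies(values: List[float]) -> List[float]:
    """Deviation of each year's value from the multi-year mean (assumed definition)."""
    baseline = mean(values)
    return [v - baseline for v in values]


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical data for one district over five crop years.
peak_ndvi = [0.60, 0.65, 0.55, 0.70, 0.62]   # seasonal peak NDVI per year
final_yield = [2.9, 3.1, 2.7, 3.3, 3.0]      # tonnes per hectare per year

ndvi_anomaly = anomalies(peak_ndvi)
score = pearson(ndvi_anomaly, final_yield)
```

Repeating the same scoring for each candidate metric gives a simple ranking. Correlation is only one possible yardstick, and a real backtest would also hold out years to check out-of-sample skill.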

The India wheat model, which was trained against district-level wheat-yield data from the India Department of Agriculture & Cooperation (IDAC), meets our criteria for reliably forecasting yields. By the end of the harvest season, the model predicted yields in 72% of India’s districts to within 0.48 tonnes per hectare. At the country level, Gro’s predictions came within 5% of the actual wheat yield, on average, from 2001 to 2016.
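To make the two performance figures concrete, here is a small sketch of how such metrics could be computed. The district names, yield values, and function names are hypothetical, and the article does not describe how Gro aggregates district results to the country level, so this shows only the arithmetic behind "share of districts within a tolerance" and "percent error".

```python
from typing import Dict


def share_within(pred: Dict[str, float], actual: Dict[str, float], tol: float) -> float:
    """Fraction of districts whose absolute yield error is at most tol (t/ha)."""
    hits = sum(1 for d in actual if abs(pred[d] - actual[d]) <= tol)
    return hits / len(actual)


def percent_error(pred: float, actual: float) -> float:
    """Absolute error as a percentage of the actual value."""
    return abs(pred - actual) / actual * 100


# Hypothetical district yields in tonnes per hectare.
actual = {"A": 3.0, "B": 4.2, "C": 2.5, "D": 3.6}
pred = {"A": 3.2, "B": 4.9, "C": 2.4, "D": 3.5}

district_share = share_within(pred, actual, tol=0.48)   # 3 of 4 districts
country_error = percent_error(pred=3.1, actual=3.0)     # hypothetical national yields
```

Averaging `percent_error` over each backtested year would give a single multi-year figure comparable in form to the 5% quoted above.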

Backtesting showed that as the season progresses, the India wheat model’s output converges toward the actual result. This is as it should be, given the increasing amount of information available from satellite imagery. What’s more, as in our earlier yield models, Gro’s machine-learning algorithm achieves higher accuracy in the regions that are most important to wheat production.

We have made our weekly forecasts and commentary publicly available on this website during the season. Gro users can access daily forecasts as well as monitor specific inputs to the model (e.g., weekly NDVI updates, daily temperature shifts). For more technical information, you can download our yield model research paper here.


India winter wheat production concentration by district (left). India final yield versus Gro yield model estimates (right). Gro’s model performs best in India’s major winter wheat-producing districts.