Gro Intelligence Announces US Soybean Yield Model

The First Model

US corn presented itself as the best subject for Gro’s first yield model, mainly due to a large body of relevant scholarship and excellent data availability. Gro modelers considered numerous indicators for inclusion in the model and then winnowed the list down to a manageable number by calculating the relative impact and significance of each one. By the time of the corn model rollout in 2017, the surviving list of input variables contained:

NDVI - Normalized Difference Vegetation Index - an indicator of vegetative greenness determined by satellite. Gro found NDVI to be the most useful variable for forecasting deviation from trend.

LST - Land Surface Temperature - daily satellite estimation of average temperature

TRMM - Tropical Rainfall Measuring Mission - daily satellite rainfall data

Crop condition - reported weekly by USDA during the crop season - subjective estimates of crop conditions rated on a scale from “poor” to “excellent”

Crop calendars - knowing when crops are planted and harvested in different counties across the US allows Gro to weight conditions properly throughout the season.

Planted and harvested acreage - allows Gro to properly weight and aggregate yields upward to the state and national levels.

gSSURGO - Gridded Soil Survey Geographic Database - fixed appraisal of soil attributes across the US. gSSURGO helps to interpret different greenness levels and to understand the impact of different weather conditions.

CDL - Cropland Data Layer - a geospatial source of information on locations of different crops.

Further detail on these variables can be found on the Gro website here
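To illustrate how planted and harvested acreage lets county yields roll up to the state and national levels, here is a minimal sketch of an acreage-weighted average. It is not Gro's actual implementation: the function name, the field names, and the county figures are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
def aggregate_yield(counties):
    """Acreage-weighted yield: total production divided by total harvested acres.

    `counties` is a list of dicts with hypothetical keys
    "yield_bu_per_acre" and "harvested_acres".
    """
    total_acres = sum(c["harvested_acres"] for c in counties)
    if total_acres == 0:
        raise ValueError("no harvested acreage to aggregate")
    total_production = sum(
        c["yield_bu_per_acre"] * c["harvested_acres"] for c in counties
    )
    return total_production / total_acres


# Made-up example data, not real county figures.
counties = [
    {"name": "County A", "yield_bu_per_acre": 180.0, "harvested_acres": 100_000},
    {"name": "County B", "yield_bu_per_acre": 150.0, "harvested_acres": 50_000},
]

state_yield = aggregate_yield(counties)
print(state_yield)  # 170.0, pulled toward County A because it has more acres
```

A simple unweighted mean of the two county yields would give 165, which is why acreage matters: larger producing counties should count for more in the aggregate.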

The Gro model values started out quite low, sparking some derision on industry bulletin boards and on our own forum. But as the season progressed, the USDA began lowering its yield number. Then after Gro’s estimate rose and eventually exceeded the USDA’s estimate, the USDA independently started raising its number. In fact, each time the USDA lowered its yield, Gro’s forecast was already lower, and each time it raised its yield, Gro’s forecast was already higher. The trading implications are clear.

To demonstrate the model’s value to futures traders, Gro devised a very simple system: sell or buy at the closing price the day before the monthly USDA report, based on where Gro’s yield model stands relative to the trade estimates, and reverse the trade at the closing price the day of the report. The system succeeded using nothing more than Gro’s corn yield model estimate, both during the backtest period and when live. So far in 2018, traders of this system are up a net 9.5 cents per bushel, having lost 0.75 cents per bushel on 10 May and gained 10.25 cents per bushel on 12 June.
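The rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not Gro's code: the function names are hypothetical, and the direction mapping (sell when Gro's yield is above the trade estimate, buy when it is below) is our reading of "sell or buy", on the assumption that a higher-than-expected yield is bearish for prices. The prices in the usage example are invented. Only the two 2018 results come from the text.

```python
def trade_direction(gro_yield, trade_estimate):
    """Return -1 (sell), +1 (buy), or 0 (no trade).

    Assumption: a Gro yield above the trade estimate implies a bearish
    surprise in the USDA report, so the system sells; below, it buys.
    """
    if gro_yield > trade_estimate:
        return -1
    if gro_yield < trade_estimate:
        return 1
    return 0


def trade_pnl(direction, close_before_report, close_report_day):
    """P&L in cents per bushel: enter at the prior close, exit at the report-day close."""
    return direction * (close_report_day - close_before_report)


# Hypothetical report day: Gro is below the trade estimate, so the system buys.
direction = trade_direction(gro_yield=175.0, trade_estimate=177.0)
example_pnl = trade_pnl(direction, close_before_report=390.0, close_report_day=395.5)

# The two 2018 results quoted in the text, in cents per bushel.
results_2018 = [-0.75, 10.25]
net_2018 = sum(results_2018)
print(direction, example_pnl, net_2018)  # 1 5.5 9.5
```

The two live 2018 trades sum to 9.5 cents per bushel, matching the figure quoted above.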

Argentine Soybeans

Next, Gro attempted to model Argentine soybean yields. The Argentine model would roll out right as the 2018 Southern Hemisphere spring got underway.

Complications became apparent quickly. The lack of reliable county-level data meant that Gro needed to estimate county-level acreage in addition to yield in order to aggregate low-level yield numbers to the state and national levels. Fortunately, our internal modeling team has access to the data in Gro. As a result, we were able to construct a soybean crop mask, which let us build a model that inspired confidence.

The year started with Gro’s number solidly below the consensus. As the widespread damage from Argentina’s 2018 drought became more apparent and slowly appeared in popular yield estimates, prices rose. Gro remained well below consensus for the remainder of the crop season. When the model stopped running on 30 April, clients who tracked the daily number and remained bullish on soybeans had gained 42 cents per bushel on the May contract.

Gro Intelligence publicly reported both the 2017 US corn yield model and the 2018 Argentine soybean model runs and archived them here.

US Soybean Model

Buoyed by the successes of the US corn and Argentine soybean yield models, we moved back to focusing on the US for soybeans. Gro’s US soybean yield modeling process ended up looking broadly similar to the one for the US corn model described above.

We found that a 50-year linear trend gave us the best results. Once again, NDVI performed the best as a predictor of deviation from trend yield. Soil moisture variables were also significant, supplementing the greenness sensors that detect soybean health a little less definitively than corn health. In a clear repeat of the corn model experience, we saw the largest average errors in our estimates in counties on the fringes of the belt that produced fewer tons of soybeans.
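As a rough illustration of what "deviation from trend yield" means, the sketch below fits an ordinary least-squares line to 50 years of yields and measures how far one year's yield sits from that line. It is not Gro's model: the function names are hypothetical, the yield series is synthetic (placed exactly on a straight line so the result is easy to check), and the real model predicts this deviation from inputs such as NDVI rather than observing it directly.

```python
def fit_linear_trend(years, yields):
    """Ordinary least-squares fit of yield = slope * year + intercept."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(yields) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, yields))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def deviation_from_trend(year, observed_yield, slope, intercept):
    """Observed yield minus the trend-line yield for that year."""
    return observed_yield - (slope * year + intercept)


# Synthetic 50-year history: yields rise by exactly 0.5 bu/acre per year.
years = list(range(1968, 2018))
yields = [25.0 + 0.5 * (y - 1968) for y in years]

slope, intercept = fit_linear_trend(years, yields)
trend_2018 = slope * 2018 + intercept  # trend line extended one year forward

# A hypothetical 2018 yield of 48.0 bu/acre sits below the extended trend.
deviation = deviation_from_trend(2018, 48.0, slope, intercept)
print(round(slope, 3), round(trend_2018, 3), round(deviation, 3))
```

Separating the slow-moving trend from the year-specific deviation lets in-season indicators such as NDVI and soil moisture explain only the part of yield that actually varies with the season.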

Conclusion

Gro’s expanding suite of yield models can add significant value to any enterprise that needs advance knowledge of upcoming harvest sizes. They have worked historically as stand-alone indicators for trading and marketing decisions and as supplements or sanity checks for existing infrastructure. We strongly encourage all those interested in the 2018 US soybean yield to pay attention to Gro’s estimate and join our yield model forum here with any thoughts or questions.
