The Argentina soy model rests on the same fundamental concept as our corn models: we built district-level estimates and aggregated them up to the national level. Like the corn models, it is updated daily for Gro users during the growing season. We used many of the same variables, such as:

  • Normalized difference vegetation index (NDVI)
  • Land surface temperature (LST)
  • Soil properties
  • Cropland masks
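
As a rough illustration of the district-to-national aggregation described above, here is a minimal sketch. It assumes the national figure is an area-weighted average of district yields; the text does not state the actual weighting, and the class, field names, and numbers below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DistrictEstimate:
    name: str               # district label (hypothetical)
    yield_t_per_ha: float   # modeled district yield, tonnes per hectare
    area_ha: float          # soybean area used as the weight (assumed)


def national_yield(districts):
    """Area-weighted mean of district yields (the weighting is an assumption)."""
    total_area = sum(d.area_ha for d in districts)
    if total_area <= 0:
        raise ValueError("total area must be positive")
    # Yield times area is production; national yield is production over area.
    total_production = sum(d.yield_t_per_ha * d.area_ha for d in districts)
    return total_production / total_area


# Made-up numbers, for illustration only.
districts = [
    DistrictEstimate("District A", 3.0, 100_000),
    DistrictEstimate("District B", 2.0, 300_000),
]
print(national_yield(districts))  # 2.25
```

Weighting by area keeps a small, high-yielding district from pulling the national estimate as hard as a large one would.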

The main differences between the models lie in how the variables were obtained and transformed. Cropland masks are one example: we had to source them very differently for Argentina soy than for US corn. For the US, we used annual masks from the USDA's Cropland Data Layer program for all historical years; for the current year, we created a static mask based on historical masks. Because Argentina does not have an equivalent crop mask source, we created our own soybean masks for 2001-2016 by systematically classifying pixels on satellite images.
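
The idea of a static mask derived from historical masks can be sketched as follows. This is only one plausible rule, assumed here for illustration: a pixel is kept if it was classified as the crop in at least a given share of historical years. The function name, the `min_fraction` threshold, and the tiny arrays are hypothetical and are not taken from the model itself.

```python
import numpy as np


def static_mask(annual_masks, min_fraction=0.5):
    """Combine annual boolean crop masks into one static mask.

    annual_masks: array of shape (years, rows, cols), truthy where the
        pixel was classified as the crop in that year.
    min_fraction: hypothetical threshold, the share of years a pixel must
        be cropped in order to be kept.
    """
    masks = np.asarray(annual_masks, dtype=bool)
    if masks.ndim != 3:
        raise ValueError("expected shape (years, rows, cols)")
    frequency = masks.mean(axis=0)  # share of years each pixel was cropped
    return frequency >= min_fraction


# Three made-up 2x2 annual masks.
history = np.array([
    [[1, 0], [1, 0]],
    [[1, 0], [0, 0]],
    [[1, 1], [0, 0]],
])
print(static_mask(history).astype(int))
# [[1 0]
#  [0 0]]
```

Only the top-left pixel is cropped in at least half of the years, so it is the only one that survives in the static mask.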

The machine-learning models selected also differ slightly for the Argentina soy model. The US corn yield model first runs XGBoost and Cubist, then passes their outputs through an XGBoost metamodel. The Argentina soy model uses only a single XGBoost layer because it has far fewer training instances. The metamodel was also unnecessary because the backtested years did not include any severely anomalous years, so adding it did not change the results.
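
The two-layer arrangement described for the US corn model is a form of stacking, which can be sketched as below. To keep the example dependency-light, two simple stand-in learners (ordinary least squares and a 1-nearest-neighbor regressor) take the place of XGBoost and Cubist, and a linear model takes the place of the XGBoost metamodel. The data is synthetic and every class and function name is hypothetical; this shows the general technique, not the model's actual code.

```python
import numpy as np


class LinearModel:
    """Ordinary least squares with an intercept (stand-in learner)."""

    def fit(self, X, y):
        A = np.column_stack([np.ones(len(X)), X])
        self.coef_, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self

    def predict(self, X):
        return np.column_stack([np.ones(len(X)), X]) @ self.coef_


class NearestNeighbor:
    """1-nearest-neighbor regressor (stand-in learner)."""

    def fit(self, X, y):
        self.X_, self.y_ = np.asarray(X, float), np.asarray(y, float)
        return self

    def predict(self, X):
        diff = np.asarray(X, float)[:, None, :] - self.X_[None, :, :]
        return self.y_[(diff ** 2).sum(axis=2).argmin(axis=1)]


def out_of_fold_predictions(make_model, X, y, n_folds=5):
    """Predict each row with a model that never saw it during training."""
    oof = np.empty(len(y))
    for held_out in np.array_split(np.arange(len(y)), n_folds):
        train = np.setdiff1d(np.arange(len(y)), held_out)
        model = make_model().fit(X[train], y[train])
        oof[held_out] = model.predict(X[held_out])
    return oof


def fit_stack(X, y, base_factories, meta_factory, n_folds=5):
    # Layer 1: out-of-fold predictions become the metamodel's features.
    Z = np.column_stack(
        [out_of_fold_predictions(f, X, y, n_folds) for f in base_factories]
    )
    meta = meta_factory().fit(Z, y)  # Layer 2: the metamodel
    # Refit each base learner on all rows for use at prediction time.
    bases = [f().fit(X, y) for f in base_factories]
    return bases, meta


def predict_stack(bases, meta, X):
    return meta.predict(np.column_stack([b.predict(X) for b in bases]))


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=60)
bases, meta = fit_stack(X, y, [LinearModel, NearestNeighbor], LinearModel)
print(predict_stack(bases, meta, X[:3]))
```

In this framing, a single-layer model such as the one described for Argentina soy amounts to fitting one base learner and using its predictions directly, with no out-of-fold step and no metamodel.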

During the season, our weekly forecast and commentary are publicly available on this website. Gro users can access daily forecasts and monitor specific inputs to the model (e.g., weekly NDVI updates, daily temperature shifts). For more technical information, you can download our yield model research paper here.