Approaching the Challenge

A greenfield approach

We've been at this since 2014, focusing intently on climate and food security problems and on how to articulate and present the solution. In January 2021 we raised our Series B, and we are now building to solve the problem at scale: scale in terms of users, scale in terms of data volume, and scale in terms of a platform that a rapidly growing organization can expand concurrently.

Across huge infrastructure challenges

Multi-dimensional scaling

With an ever-growing platform, we need to be cognizant of the true costs of the various options available to us (e.g. buy vs build, even in a strictly public-cloud context).

Rapid expansion of data and data sources, which affects the data pipeline, indexing, and serving
Increase in the number of users and queries against our platform
We’ve more than doubled the number of employees in 2021, which means more concurrent changes to the platform

Versatile data modeling

We need to be creative about organizing a wide variety of data and data formats that don’t obviously fit together.

Static vs dynamic (e.g. computed on the fly) data
Tabular, textual (e.g. prose), and raster (image) data
Historical vs future (modeled) data
Single origin vs ensemble data
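
One way to picture how these dimensions could sit side by side is to give each its own type and combine them in a single record. The following is a minimal sketch only; every name in it (Materialization, Payload, TimeFrame, Origin, Dataset, describe) is hypothetical and chosen for illustration, not taken from Gro's actual data model.

```rust
/// Whether values are stored ahead of time or computed on the fly.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Materialization {
    Static,
    Dynamic,
}

/// The shape of the data itself: tabular, textual, or raster.
#[derive(Debug, Clone, PartialEq)]
enum Payload {
    Tabular { columns: Vec<String>, rows: Vec<Vec<f64>> },
    Text(String),
    Raster { width: usize, height: usize, pixels: Vec<f32> },
}

/// Observed history vs modeled future values.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TimeFrame {
    Historical,
    Modeled,
}

/// A single source vs an ensemble of several members.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Origin {
    Single,
    Ensemble { members: usize },
}

/// Hypothetical record that carries all four dimensions together.
#[derive(Debug, Clone, PartialEq)]
struct Dataset {
    name: String,
    materialization: Materialization,
    time_frame: TimeFrame,
    origin: Origin,
    payload: Payload,
}

impl Dataset {
    /// Short human-readable summary; the match must cover every payload kind.
    fn describe(&self) -> String {
        let kind = match &self.payload {
            Payload::Tabular { columns, rows } => {
                format!("table with {} columns and {} rows", columns.len(), rows.len())
            }
            Payload::Text(text) => format!("text of {} characters", text.chars().count()),
            Payload::Raster { width, height, .. } => format!("{}x{} raster", width, height),
        };
        format!("{}: {}", self.name, kind)
    }
}
```

Because the match in describe has to be exhaustive, adding a new payload kind in this sketch would make the compiler point at every place that needs to handle it.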

Embracing the rise of Rust

Almost all of our code was originally written in Python. Python is a great language in terms of its ease of use and expressiveness, but it’s not well-suited for high-performance computing. We won’t be moving away from Python at the edges (e.g. for some of our data scientists and rapid prototypers), but we need to shift to something more performant for the inner nodes and foundational layers.

Why Rust?

Given the nature of the technical challenges at Gro (i.e. Big Data) and the fact that we had the luxury of choosing a language after the initial end-to-end platform was built out, the choice for us was clear.

Plays nicely with Python both as the caller and the callee
Designed from the outset to be low-overhead and a viable competitor to C/C++ with modern features
Explicit data ownership concepts and lack of garbage collection yield remarkable performance
Compile-time problems are always preferable to run-time problems
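
The ownership and compile-time points above can be illustrated with a small sketch. The Series type and the into_filled and mean functions are hypothetical names made up for this example; they are not part of any Gro codebase.

```rust
/// Hypothetical series of observations; `None` marks a missing value.
struct Series {
    values: Vec<Option<f64>>,
}

/// Takes ownership of the series and fills gaps with `fallback`.
/// The input buffer is consumed, so the caller cannot keep using it.
fn into_filled(series: Series, fallback: f64) -> Vec<f64> {
    series
        .values
        .into_iter()
        .map(|value| value.unwrap_or(fallback))
        .collect()
}

/// Borrows the data read-only; returns `None` for an empty slice so the
/// caller has to handle that case explicitly.
fn mean(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        None
    } else {
        Some(values.iter().sum::<f64>() / values.len() as f64)
    }
}
```

In this sketch, using a Series value again after passing it to into_filled is rejected by the compiler as a use of a moved value, and a missing observation cannot be read as a number without first handling the None case. Both are examples of problems surfacing at compile time rather than at run time.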