Gousto ramps up real-time data usage

Meal ingredients subscription service Gousto has expanded its technology stack by integrating Databricks’ data engineering, machine learning and analytics tools into its ever-evolving platform.

The recipe box provider is looking to triple its capacity over the next two years, moving from one UK production site to four by 2022, and it is focused on increasing efficiency and speed of fulfilment.

This year, Gousto has gone from producing 2.5 million boxes a month to five million, with demand for online grocery and subscription services rising significantly as a result of the coronavirus pandemic. Sales were already on the rise before Covid-19 had significantly impacted the UK, with first-quarter revenue at the online player up by 70% year-on-year.

Gousto CTO Shaun Pearce said his team is building a “next generation data platform”, supported by Amazon Web Services cloud technology, to better serve its customers and to ensure the company’s infrastructure scales with the rise in demand for its products. The use of Databricks’ technology enables the business to act on information instantly, he added.

“We’re moving towards something that’s a bit more home grown, using code and bespoke technology my team has developed, and starting to leverage Databricks,” he explained.

Investing in Databricks takes away the heavy lifting when it comes to the business’s data strategy, he noted. “I don’t want my team of data engineers and data scientists building a platform to store, manipulate and transform data because we can buy that off the shelf.”

Gousto has developed proprietary algorithms that optimise the speed of box packing, which it said maximises pick accuracy and minimises food waste. With Databricks, it has moved from daily batch data updates to near-real-time streaming, using Databricks’ Auto Loader file ingestion tool and the Delta Lake storage layer.
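
To illustrate the shift from daily batches to near-real-time ingestion, here is a minimal Python sketch of the incremental pattern: pick up only the files that have arrived since the last check, rather than reloading everything once a day. It is a conceptual stand-in, not Gousto's code and not the Databricks Auto Loader API; the class name `IncrementalLoader`, the `poll` method and the JSON-lines file layout are all assumptions made for illustration.

```python
import json
from pathlib import Path


class IncrementalLoader:
    """Toy incremental file ingestion (hypothetical; not the Databricks API)."""

    def __init__(self, landing_dir):
        self.landing_dir = Path(landing_dir)
        # Acts as a checkpoint: names of files that were already ingested.
        self._seen = set()

    def poll(self):
        """Return records only from files that arrived since the last poll."""
        new_records = []
        for path in sorted(self.landing_dir.glob("*.json")):
            if path.name in self._seen:
                continue  # already processed on an earlier poll
            with path.open() as handle:
                # Assumed layout: one JSON object per non-empty line.
                for line in handle:
                    if line.strip():
                        new_records.append(json.loads(line))
            self._seen.add(path.name)
        return new_records
```

Calling `poll()` every few seconds would return each new file's records shortly after the file lands, whereas a daily batch job would leave them unseen until the next scheduled run. A production system would also persist the checkpoint and handle partially written files, which this sketch does not.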

“We identified Databricks as the standard in building real-time data platforms,” noted Pearce.

“The platform brings data engineering, data analytics and data science closer together, with a huge potential to make our journey to data strategy more efficient and collaborative. We have reduced the time it takes to develop new ideas from days to minutes and increased the availability and accuracy of our data.”