Interview: Ocado on open source, data and warehouse automation

Ocado's attitude to data is to collect it all and expose everything about the business to drive efficient processes.

"If your system does something, store the data," advises Matt Soane, GM of technology at Ocado. "The benefit is when you store all the data, you visualise everything. So if we want to investigate something from yesterday, we can through visualisation tools."

Ocado uses this data to optimise the business, for example by using machine learning algorithms to improve product layout in its warehouses. "Using the big data we've collected, we know how long it takes to pick something and how the orders flow through the building – the software runs on the data. So if we have a fragile product, for instance, we can use that in our data and move it to a different location so it is easier to pick."

While some retailers believe 'big data' has become a buzzword, Soane does not see it that way. "People interpret it differently," he says. "But for me, it's about the way things have changed and how we are able to [store data] and the cost of storage."

Ocado is using the Cassandra NoSQL database to store huge amounts of data. "Ten years ago, this wasn't the case, it was too expensive, so we didn't collect it," explains Soane. "If you are compromising, you're compromising on possibilities."

Now the e-tailer stores data on the 5,000 business interactions that happen every second in the warehouse, down to every subtle movement made by the hundreds of robot pickers lining 25 km of conveyor belts in its fulfilment centre.
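To put that rate in perspective, the arithmetic can be sketched in a few lines of Python. Only the 5,000-per-second figure comes from the article; the assumption of round-the-clock operation and the 200-byte record size are hypothetical values chosen purely for illustration, not numbers Ocado has published.

```python
# Rough scale estimate for the event rate quoted in the article.
# Only EVENTS_PER_SECOND comes from the article; everything else is an
# illustrative assumption.

EVENTS_PER_SECOND = 5_000        # figure quoted by Ocado
ASSUMED_BYTES_PER_EVENT = 200    # hypothetical record size, not from the source


def events_per_day(rate_per_second: int, hours_active: float = 24.0) -> int:
    """Number of events recorded in a day at a constant rate.

    hours_active defaults to 24, which assumes the warehouse never stops.
    """
    return int(rate_per_second * hours_active * 3600)


def raw_bytes_per_day(rate_per_second: int, bytes_per_event: int,
                      hours_active: float = 24.0) -> int:
    """Uncompressed, unreplicated storage needed for one day of events."""
    return events_per_day(rate_per_second, hours_active) * bytes_per_event


if __name__ == "__main__":
    daily = events_per_day(EVENTS_PER_SECOND)
    print(f"Events per day: {daily:,}")  # 432,000,000 under the 24-hour assumption
    gigabytes = raw_bytes_per_day(EVENTS_PER_SECOND, ASSUMED_BYTES_PER_EVENT) / 1e9
    print(f"Raw storage per day at the assumed record size: {gigabytes:.1f} GB")
```

Under these assumptions the warehouse would generate several hundred million records a day, which gives a sense of why write throughput and storage cost dominate the choice of database.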

Ocado then runs endless machine learning algorithms and, importantly, ensures the data is available to analysts. "To us in Ocado, big data is very real and not a cliché."

"Traditional relational databases are challenging, because they weren't designed for that type of data," explains Soane. "And you have to compromise when you use a company, like an Oracle, because it's not cost effective and you have to spend so much to build a database big enough."

He adds: "But if you store the data in the right kind of technology, you won't have any difficulty accessing it. It's about an information democracy, making it all available for analysts and computer sciences."

Soane says Ocado first started using Cassandra in 2010 and soon began a bit of a love affair with the open-source platform.

"It did what it said on the tin, consumed data without any problems. If you're writing that number of points a second, you don't want it to slow down your business operation – it's consumed in a blink of an eye," he says. "And with open source, we find that community of developers contributing very valuable. Our engineers often understand the technology better because they can look at the code, which is very powerful."

Soane says that, despite being an on-premise solution, Cassandra allows Ocado to be much more flexible than traditional data warehousing methods, as well as much more resilient.

Since the beginning of this love affair with Cassandra, Soane has been working with DataStax, which provides professional support and monitoring tools and has experience dealing with the open-source community. "If people are always contributing, DataStax can tell me the best one and they provide training sessions with data modelling."

Ocado launched its Ocado Smart Platform a couple of years ago. This proprietary solution is used by Morrisons, and the e-tailer is still looking to close a deal with an international retailer to provide the technology to run an online grocery business. The platform consists of two parts: the first has everything a retailer needs to go online (eCommerce systems, order management, supply chain and systems for vans), while the second is Ocado's automation solution, which includes its warehouse robotics.

"The idea is a retailer can be given the whole lot and very quickly be running," explains Soane. "But there are two parts to this technically, the bit that runs in the cloud and the bit that runs on premise in the warehouse."

Cloud technology looks after the platform and systems, while the automation technology is hosted on premise, which is the reason behind the choice of Cassandra.

And it is this automation available as a service for other retailers which really excites Soane. "There's a whole load of robotics there, pick smarter, quicker, it's really clever technology. Big data is very exciting."

He adds: "But to make it work, you've got to make sure all of your systems are exposing data of any interest. All sorts of wonderful things are possible, you have to put the work in to expose the data. I never stop putting effort to make sure that is happening. Because there are enormous dividends down the line."
