Garbage in, garbage out! Computer scientists like me are fond of emphasizing the importance of good data to achieving valuable outcomes. And it’s true—no matter what you are trying to compute, you won’t get far with bad data. But when it comes to online grocery shopping recommendations, the challenge is not that binary. Achieving good grocery recommendations algorithmically looks less like the black-and-white world of good versus bad data, and more like a gradient of value from “very simple and poor” to “considerably more complex, but truly exceptional.”

In fact, I propose that online grocery recommendations should be viewed as a journey starting at “dumb” and arriving at “wise.” Here’s how that works:

[Figure: Recommendation-Value Spectrum]

1. Develop dumb data into apt information

The work of putting together recommendations always begins with raw data. That might include SKU numbers, product names, ingredients, and nutritional information. Gathering it is simple, but the data you get at this stage is “dumb”: it is too raw to be of any real use. To transform this “dumb” data into “smart” (and therefore “good”) data, you need to clean, normalize, and standardize it, then tag each piece with as many accurate and useful product qualities as you can.

For example, “Is it a grain, produce, or meat?” “Is it frozen, fresh, boxed, bagged, or canned?” “Is it imported or domestic?” Correctly, completely, and measurably labeling a large body of good data is essential to a robust grocery recommendations system. You must also be mindful of which data is structured and which is unstructured. This certainly adds a bit of work, but it also moves you along the complexity-value journey from “data” to “information.”
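To make the clean-normalize-tag step concrete, here is a minimal sketch in Python. The keyword tables, the field names, and the helper functions `normalize_name` and `tag_product` are all assumptions invented for illustration; a real catalog would need a far richer vocabulary and probably a trained classifier rather than keyword lookup.

```python
import re

# Hypothetical keyword-to-tag rules, made up for this sketch.
FORM_KEYWORDS = {"frozen": "frozen", "fresh": "fresh", "boxed": "boxed",
                 "bagged": "bagged", "canned": "canned"}
CATEGORY_KEYWORDS = {"rice": "grain", "oats": "grain", "peas": "produce",
                     "spinach": "produce", "beef": "meat", "chicken": "meat"}


def normalize_name(raw_name):
    """Lowercase the name, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", raw_name.lower())
    return " ".join(cleaned.split())


def tag_product(raw):
    """Turn a raw record into a normalized record with simple trait tags."""
    name = normalize_name(raw["name"])
    words = name.split()
    tags = {
        # First matching keyword wins; None means "could not tag".
        "form": next((FORM_KEYWORDS[w] for w in words if w in FORM_KEYWORDS), None),
        "category": next((CATEGORY_KEYWORDS[w] for w in words if w in CATEGORY_KEYWORDS), None),
    }
    return {"sku": raw["sku"].strip(), "name": name, "tags": tags}


record = tag_product({"sku": " 00123 ", "name": "  FROZEN  Peas, 16-oz!! "})
print(record)
```

The point of the sketch is only that the messy raw record comes out with a consistent name and explicit, queryable tags. That is the move from “data” to “information.”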

2. Knowledge is power; give your data some!

OK, but even information alone offers quite limited value. And if you ask me, the real shame of where we are as an industry right now is that this is precisely where most online grocers stop. They think that tagging raw data with various category or trait markers is the best that they can do for their shoppers. The truth is far from that. Instead of calling it quits at isolated bits of product information, a more sophisticated approach is to link that information together in significant ways. Take cheese, for example. Each cheese SKU has simple information qualities like whether it is Gruyère or Cheddar, whether it is yellow or white, whether it is sharp or mild, soft or hard, and whether it comes in a wheel, a wedge, a tub, or packaged slices. You can develop quite a robust ontology of information about cheese alone.

But connecting and cross-referencing the classifications and qualities of cheeses relative to one another, and to the needs of customers, moves you along the complexity-value journey from “information” to something like “real knowledge.” For example, let’s say a shopper has purchased hard, yellow, sheep’s milk cheeses in the past. It is a little more complex to understand that Havarti and Camembert do not belong in the family of cheeses that this shopper prefers, and that recommending those cheeses is not going to get you very far.

But once you have trained your algorithms to recognize that the “information” qualities of products can be clustered together according to the multiple traits that each item has, you can bestow upon the algorithm something like “cheese knowledge.” It will then know to offer this shopper Manchego or Pecorino, cheeses perfectly in accordance with the shopper’s expressed tastes.
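As a rough illustration of how shared traits can drive that kind of recommendation, the sketch below scores each cheese against the shopper's preferred traits using Jaccard similarity (shared traits divided by total distinct traits). This is a simple stand-in for real clustering, not the method of any particular engine, and the trait tags, the `recommend` function, and the 0.5 threshold are assumptions chosen for the example.

```python
# Illustrative trait tags; simplified assumptions, not a vetted cheese ontology.
CHEESES = {
    "Manchego": {"hard", "yellow", "sheep"},
    "Pecorino": {"hard", "yellow", "sheep"},
    "Havarti": {"semi-soft", "yellow", "cow"},
    "Camembert": {"soft", "white", "cow"},
}


def jaccard(a, b):
    """Jaccard similarity of two trait sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(preferred_traits, catalog, min_score=0.5):
    """Return cheese names scoring at least min_score, best match first."""
    scored = [(name, jaccard(preferred_traits, traits))
              for name, traits in catalog.items()]
    keep = [(name, score) for name, score in scored if score >= min_score]
    # Sort by descending score, then alphabetically for a stable order.
    return [name for name, _ in sorted(keep, key=lambda p: (-p[1], p[0]))]


# Traits inferred from the shopper's past purchases (hypothetical profile).
shopper_traits = {"hard", "yellow", "sheep"}
picks = recommend(shopper_traits, CHEESES)
print(picks)
```

With these made-up tags, Havarti shares only one trait with the shopper's profile and Camembert shares none, so both fall below the threshold, while Manchego and Pecorino match on every trait.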

3. Feed your algorithm with a well of wisdom

But there is still further to go on our journey. What if your recommendation engine not only had a deep understanding of food qualities and how they interrelate between products, but it also understood human lifestyle preferences? What if it knew how various ingredients and food products are used together to construct popular dishes? What if it could understand how the priorities of a single college student differ from those of the primary shopper for a family of five? What if it knew about food allergies, and religious or health-oriented dietary restrictions?

Grasping these more abstract concepts is considerably harder than just labeling SKUs with simple metatags. It requires a large amount of reference information from recipes, menu items, and ingredient lists. It requires understanding cultural and regional dietary preferences. It requires identifying and prioritizing the heuristics that people use to make decisions. It requires advanced machine learning in the algorithms, so that they remain responsive to the behavior of real shoppers “in the wild.”

All of that adds up to a lot more complexity! But that complexity also boosts your recommendations way, way up along the value spectrum.

With this kind of engine driving recommendations, you would recognize not only that jam goes well with another spread on bread, but also that pumpkin butter is the spread to suggest to a shopper whose history suggests nut allergies. When you identify that a shopper is building a fruit salad, you would know to recommend fruits like berries and melons, and not those like tomatoes and zucchini. You would offer sliced turkey or smoked salmon as substitutes for ham to a kosher shopper making eggs Benedict, but only low-sodium smoked salmon to a shopper who is a hypertension-conscious pescatarian. And as for our cheese example, you would know that beef, lettuce, pickles, onions, and tomato on a bun go better with sliced Emmental than they do with cottage cheese.
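The nut-allergy case above can be sketched as a constraint filter layered on top of the product tags. Everything here is hypothetical: the `Product` and `Shopper` classes, the allergen labels, and the tiny catalog are invented for illustration, and a production engine would infer the shopper's restrictions from behavior rather than receive them as a ready-made set.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Product:
    name: str
    role: str                           # how the item is used, e.g. "spread"
    allergens: frozenset = frozenset()  # allergen labels carried by the item


@dataclass
class Shopper:
    # Allergens this shopper appears to avoid (assumed to be known already).
    avoid_allergens: set = field(default_factory=set)


# Made-up catalog entries for the sketch.
CATALOG = [
    Product("peanut butter", "spread", frozenset({"peanut"})),
    Product("almond butter", "spread", frozenset({"tree nut"})),
    Product("pumpkin butter", "spread"),
    Product("sliced emmental", "cheese", frozenset({"milk"})),
]


def suggest(role, shopper, catalog):
    """Names of products that fill the role and carry no allergen the shopper avoids."""
    return [p.name for p in catalog
            if p.role == role and not (p.allergens & shopper.avoid_allergens)]


nut_free = suggest("spread", Shopper({"peanut", "tree nut"}), CATALOG)
unrestricted = suggest("spread", Shopper(), CATALOG)
print(nut_free, unrestricted)
```

The same shape extends to the other examples: a sodium limit, a pescatarian diet, or kosher rules become additional predicates that each candidate has to pass before it is ranked.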

This goes beyond simply recognizing the behavior patterns of consumers and looking for simple correlations. At the top end of the complexity-value spectrum, your recommendations algorithm can achieve something that looks an awful lot like what we would consider “wisdom.”

In today’s world of AI-driven e-commerce, avoiding garbage in, garbage out is not even table stakes anymore. The winners in online grocery will be those who get furthest along the complexity-value journey. It turns out that tennis great (and onetime Army computer programming instructor) Arthur Ashe was right: success is a journey!