Less is More: How Small Data can be ‘Bigger’ than Big Data

In 2014, a BuzzFeed article revealed ride-hailing firm Uber’s in-house use of “God View mode”.

Staff at Uber were reportedly able to view the movements of all passengers in real time, a demonstration of the Big Data capabilities on which the business is built.

As a result, Uber has long been touted as a Big Data success story. While the company does indeed generate a wealth of information through its network of users, what enabled it to disrupt the taxi industry wasn’t the massive amount of data collected. Rather, it was the ability to use a small amount of the right data to deliver the service that people wanted.

‘Who needs a ride? Which vehicle nearby is able to fulfil that request?’ By answering these simple questions with a minimal dataset, Uber allows users to get to their intended destinations with a tap of their smartphone screen.
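
The two questions above can be sketched in a few lines of Python. This is only an illustration of how little data the matching step needs, not a description of Uber’s actual system: the field names (`lat`, `lon`, `available`), the sample coordinates and the choice of straight-line (haversine) distance are all assumptions made for the example.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def nearest_available_vehicle(rider, vehicles):
    """Return the closest vehicle that is free, or None if none is free."""
    free = [v for v in vehicles if v["available"]]
    if not free:
        return None
    return min(
        free,
        key=lambda v: haversine_km(rider["lat"], rider["lon"], v["lat"], v["lon"]),
    )

# Hypothetical data: one rider and three vehicles.
rider = {"lat": 1.2966, "lon": 103.7764}
vehicles = [
    {"id": "A", "lat": 1.3000, "lon": 103.7800, "available": True},
    {"id": "B", "lat": 1.2970, "lon": 103.7770, "available": False},  # closest, but busy
    {"id": "C", "lat": 1.3521, "lon": 103.8198, "available": True},
]

match = nearest_available_vehicle(rider, vehicles)
print(match["id"])
```

The whole decision rests on one location and one status flag per vehicle, which is the sense in which the dataset is “minimal”.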

Therein lies the value of small data. Allen Bonde, former vice-president of Innovation at Actuate (now part of OpenText), defined small data as “timely, meaningful insights” that are “accessible, understandable and actionable for everyday tasks”. 

All you need is the right data

With data being touted as the “new oil” of the digital age, the hype over Big Data is almost becoming a ‘fetishisation’ of data. Many companies embarking on digital transformation tend to assume that having a lot of data will help them scale and solve their business problems.

“But the fact is, Big Data is not a magic wand; there is no one-size-fits-all solution,” said Gu Zhan (Sam), Lecturer & Consultant, Analytics & Intelligent Systems Practice at NUS-ISS.

Moreover, Big Data that is useful to a business takes a long time to accumulate. Take, for example, large search engines such as Google and Baidu: they have the resources to create numerous free products that are not monetised but allow them to gather data that can be monetised elsewhere. Acquiring data is not as straightforward as many might think. Tech giants employ sophisticated, multi-year strategies to collect the data they need, some of which are industry-specific.

Sam warned, “But a lot of the time, we see businesses blindly gather large amounts of existing digitised data, only to realise that the data needed to answer their critical business question has not been recorded. Or in some instances, businesses over-invest in low-value data only to realise that it is not useful. This is why it is crucial to involve the Artificial Intelligence (AI) team in the early stages of data acquisition so they can help to prioritise the data to gather.”

That is not to say that Big Data is not useful. Its advantage comes from the tremendous volume of raw data that can be processed to find common patterns and correlations. Sam raised the example of how Big Data analytics applications have been used effectively to estimate the likelihood of a learner getting promoted after completing a skill-based course. This was done by comparing the learner’s study characteristics with the profiles and promotion rates of past learners.
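
The article does not say which technique was used for the promotion example. As a stand-in, the sketch below uses a simple k-nearest-neighbours comparison: it finds the past learners most similar to the new one and reports the share of them who were promoted. The feature names (`attendance`, `quiz_score`), the sample records and the value of `k` are all hypothetical.

```python
import math

# Hypothetical study characteristics, each scaled to the range 0..1.
FEATURES = ("attendance", "quiz_score")

def promotion_likelihood(learner, history, k=3):
    """Share of promoted learners among the k most similar past profiles."""
    if not history:
        raise ValueError("history must contain at least one past learner")

    def distance(a, b):
        # Euclidean distance over the chosen study characteristics.
        return math.sqrt(sum((a[f] - b[f]) ** 2 for f in FEATURES))

    nearest = sorted(history, key=lambda h: distance(learner, h))[:k]
    return sum(h["promoted"] for h in nearest) / len(nearest)

# Hypothetical historical records (promoted: 1 = yes, 0 = no).
history = [
    {"attendance": 0.95, "quiz_score": 0.90, "promoted": 1},
    {"attendance": 0.90, "quiz_score": 0.85, "promoted": 1},
    {"attendance": 0.85, "quiz_score": 0.80, "promoted": 0},
    {"attendance": 0.50, "quiz_score": 0.40, "promoted": 0},
    {"attendance": 0.40, "quiz_score": 0.55, "promoted": 0},
]
learner = {"attendance": 0.92, "quiz_score": 0.88}

estimate = promotion_likelihood(learner, history, k=3)
print(f"Estimated promotion likelihood: {estimate:.2f}")
```

With only five records the estimate is illustrative at best. The approach becomes more reliable as the number of comparable past learners grows, which is where large datasets earn their keep.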

While it doesn’t hurt to have more data, it is not a prerequisite to have many terabytes of data when creating an AI system. In fact, an effective AI system can be built with anywhere from 100 data points (Small Data) to 100,000,000 data points (Big Data).

However, it is important to remember that the statistical generalisation approach follows the 80-20 rule: it finds insights applicable to the 80 percent majority at the cost of not optimising for the remaining 20 percent (the ‘marginal outliers’). “This cruel ‘Big Data-driven discrimination’ is becoming a norm, ironically, at a time when businesses are aiming to bring individualised services to customers,” Sam explained.

Small data is people-centric

While Big Data is about algorithms and analysis, Small Data is about connecting people with timely and meaningful insights. To connect and appeal to customers, businesses need to be able to capture the emotional reactions of their users – something that is hard to record in binary. There is always a risk of misinterpreting the patterns shown by Big Data and drawing causal links where there is in fact a mere coincidence.

Small Data contains very specific attributes created by analysing larger sets of data. Despite its size, it is often informative enough to provide solutions to problems and achieve actionable results.

“We can think of Small Data as the kind of knowledge that domain experts have – concise, abstract and domain-specific,” said Sam. “It can be expressed in the form of business rules, best practices, associations, and so on. What makes this knowledge more valuable compared to statistical insights from an indifferent machine is that it embodies more ‘human sense’ and is not just cold statistics.”

Small Data focuses on efficiently representing knowledge extracted from raw data

A key advantage of Small Data is that it doesn’t require the expensive technological systems needed to analyse Big Data. Besides shortening onboarding time, it helps companies avoid overspending on technology.

Instead of spending time thinking about how to use Big Data to solve their problems, businesses can focus on working out which data is worth seeking out.

Ultimately, size is not what matters, as technology is merely a means to an end. What matters is having the right data to solve the problem at hand, and if Small Data can do that, it can be ‘bigger’ than Big Data.

For more information or to sign up for our Stackable Certificate Programme in Artificial Intelligence, click here.
