THE DATA EXPLOSION
Scientists come to grips with Big Data’s pitfalls and promise
The ability to manage large quantities of data at each stage of the discovery, development and commercialization cycle is critical to success, and analysis tools are helping researchers transform data into knowledge.
Morten Meldgaard and Kaare Buch Petersen live in a small country, but they have conquered Big Data.
Both work at Chr. Hansen, a Danish supplier of bioscience ingredients for the food and health industries. Meldgaard is the program manager tasked with improving how the 140-year-old company manages its data explosion; Petersen is an Information Technology (IT) specialist.
The two men have leapfrogged larger industry rivals by turning to cloud-based data-crunching solutions. Total costs per month for infrastructure and software as a service: about US$1,000.
Before the duo’s breakthrough, Chr. Hansen’s scientists had to manually analyze a compound, then capture their results on paper. That system worked for decades, until a flood of data arrived from new sources.
First, Chr. Hansen adopted an electronic laboratory notebook system, collecting and sharing data among its scientists. Then tools powerful enough to conduct up to 500 simultaneous analyses for a single compound became available. Implementing robots in the laboratory, plus a process that enables 500 times more trials, made data flood in.
“The researchers knew they had a challenge,” Meldgaard said. “They were producing more data and the data were more complex.”
The data explosion is a major challenge throughout chemistry and biology, and research data is just the beginning. Other functions, including development, regulatory review, manufacturing and distribution, generate their own mountains of information.
“People are now overwhelmed with lots of data,” said Alan S. Louie, research director at IDC Health Insights, in Framingham, Massachusetts (USA). Storing data is a challenge, but “the ability to process that data and formulate it into coherent theories is much harder,” Louie said. Each stage of development and commercialization depends on data from the phase that comes before it, and each generates data that needs to be fed back upstream to refine future activities.
Cloud computing can help companies break free of the clutter. The cloud model also simplifies collaboration as research companies establish partnerships with specialized research or manufacturing companies.
“The cloud is quite ideal,” said Andrew Brosnan, an analyst for UK-based research firm Ovum. “It is easily scalable and easily extendable. If you have a two-year project with a contract research organization in Switzerland, you can extend the IT environment to that collaborator. That’s the trend we’re moving toward.”
At Chr. Hansen, Meldgaard and Petersen also discovered that going to the cloud would be less expensive than dedicated infrastructure. “We could have spent US$100,000 or more,” Petersen said. “By relying on the cloud, we are spending US$1,000 a month.”
Chr. Hansen’s approach allows scientists to quickly identify patterns in their data. “They saw our solution as a magic wand that could save them a lot of time,” Petersen said.
Many companies suffer from the “dark data” syndrome, where useful data exists but can’t be searched for re-use or made broadly available to internal constituencies. But Chr. Hansen’s new tools have changed the company’s entire approach to storing and accessing data.
“We want to explore the data rather than just looking at the data in silos,” Meldgaard said. “Big Data storage seems almost perfect for us. We can store it, and afterward start playing around with it and interpreting it. Traditional databases demanded that we pre-decide how we wanted the data to look and how it would be interpreted.”
Ease of access is powerful. “The scientists can go in there, get their hands on it and try out their ideas,” Meldgaard said. “They can pull out data and play with it and test their theories or visualize the data and the patterns.”
The next frontier for Chr. Hansen is to extend the system into other parts of the company. Product developers want access to production data, and vice versa; the sales, finance and legal departments want to be connected too.
The company’s ultimate dream is to tap into what consumers of end products say on Facebook or Twitter about a particular yogurt flavor, for example, making the data available to the scientists who are trying to improve products.