Information overload

Published: 5-Aug-2002

Keeping track of the vast quantities of information generated by drug discovery is a daunting task. Dr Sarah Houlton looks at possible approaches to data management



Modern drug discovery techniques generate huge quantities of data, and discovery labs are in grave danger of being swamped by the flood of information. The range of data types is also huge, from the longer-established combinatorial libraries and high throughput screening through to the newer techniques of genetics, genomics and proteomics, not to mention intelligence on competitive research. So keeping track of it all, and exploiting it to the full, is far from straightforward. And how much useful information is lost because it never gets further than someone's brain?

At the moment, information tends to be fragmented and sit in 'silos'. It may be incomplete or outdated, and is frequently not in context. And, more importantly, the sharing of information and knowledge is not a high priority, with companies rarely having a comprehensive or coherent strategy for addressing the knowledge problem. As a result, 'associates spend more than 15% of their time searching for data and information, and this leads to bad and slow decision making,' explained Manuel Peitsch, global head of informatics and knowledge management at Novartis Pharma, at the eyeforpharma knowledge strategies for r&d conference held in London earlier this year.

It is all very well having high-tech, cutting edge drug discovery programmes, but if there is a bottleneck at the data processing stage, it negates many of the potential advantages. 'We have robots to speed up the discovery process,' said Peitsch, 'but what about the data? We create millions of data points in a short time, yet they often have to be transferred manually by floppy disk from one system to another, which adds greatly to the time taken.' He explained that the vision at Novartis is to support the drug r&d strategy by providing comprehensive and reliable data and information, that is set in context, to help turn data into knowledge using in silico science. He added that it was essential that the data were easily accessible, and he thinks that all data being merely 'three clicks' away for operators is a fair vision.

pinpointing potential

Daniel Keesman, executive vp global business at Heidelberg-based Lion Bioscience, is clear about what the most important factors are. He picks out data integration and management and the mining of content that has been created over many years as being essential, as well as integrating systems, automating processes, and supporting and developing decision-making tools through collaborative planning software. 'You need to design and implement knowledge management tools, implement project tracking tools, which include the ways to track the intellectual property you've generated, and possibly implement business intelligence packages. And to do that you need technology architecture.'

Genomics, in particular, has created a massive data problem. Gene chips can give as many as 12,000 known genes, and then there are maybe twice as many unknown genes or gene fragments. The information needed to pinpoint potential drug leads is there; the difficulty is extracting it from all the junk surrounding it. One method of identifying useful data is by looking for patterns of change that typify a subset or a whole disease. 'It's like creating a fingerprint for disease X, and subsets A, B, C and D,' said John Morrison, head of global clinical genomics at AstraZeneca. Patterns of change between healthy and disease states can then be seen by comparing them. 'This is a powerful utilisation of data,' he added. 'It's not pinpointing specific targets like proteases, kinases, ion channels or the other drug targets that we look for now. It's utilising everything we have and everything we can generate. We are looking at changes, not just in the gene, but in its partners in the "sickening" pathway, and at genes that are co-expressed in other pathways as well.'
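As a rough illustration of the 'fingerprint' idea, the sketch below compares mean expression levels between healthy and disease samples and keeps the genes that shift by more than a chosen threshold. It is not AstraZeneca's method: the function name, the gene labels, the threshold and the expression values are all hypothetical, chosen only to show the comparison step.

```python
from statistics import mean


def disease_fingerprint(healthy, disease, threshold=1.0):
    """Return {gene: change} for genes whose mean expression differs
    between the two states by more than `threshold` (hypothetical cut-off)."""
    fingerprint = {}
    # Only compare genes measured in both groups.
    for gene in healthy.keys() & disease.keys():
        change = mean(disease[gene]) - mean(healthy[gene])
        if abs(change) > threshold:
            fingerprint[gene] = change
    return fingerprint


# Toy expression values, invented purely for illustration.
healthy_samples = {
    "GENE_A": [5.0, 5.2, 4.8],
    "GENE_B": [3.0, 3.1, 2.9],
    "GENE_C": [7.0, 7.1, 6.9],
}
disease_samples = {
    "GENE_A": [8.0, 8.2, 7.8],
    "GENE_B": [3.0, 3.2, 2.8],
    "GENE_C": [4.0, 4.1, 3.9],
}

fingerprint = disease_fingerprint(healthy_samples, disease_samples)
for gene, change in sorted(fingerprint.items()):
    direction = "up" if change > 0 else "down"
    print(f"{gene}: {direction} by {abs(change):.1f}")
```

The point of the sketch is that the output is a pattern across several genes, not a single marker, which is the distinction Morrison draws.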

extracting information

To make biological sense of the data, he explained, AZ is using clustering algorithms, multi-dimensional scaling and various other mathematical tools. 'The power of this is that you're not looking at one gene, you're looking at a network of change,' he said. 'And that's the weakness of most biomarkers in clinical studies. They look at just one thing. They take the variability across, say, 1,000 patients, to give a spectrum of normality. But then you're ignoring what other, related things are going on, and I believe this is very important to do.'

On a practical level, AZ has set up a project called e-lab, an electronic laboratory, which is now on the desktop of every scientist within the company. 'This seems to be the most powerful integrated package that yet exists within the pharmaceutical industry,' he claimed. It integrates numerous different data sets and analysis tools, such as genome browsers, search analysis, data mining, and links to external databases and other sources. It also has links to the various research organisations with which AZ has collaborations. This is a very practical way of pulling together all the available data, making it easier to extract useful information.

One of the biggest knowledge management problems is getting people to share what they know. They may subscribe to the 'knowledge is power' theory, keeping as much as they can to themselves to ensure their continued importance. But, equally, they may not know that it is important, and could be useful elsewhere in the business. And with so much information sitting in separate silos, whether computer data repositories or people's brains, the difficulty is establishing what is known, or even whether something has been done before.

As GlaxoSmithKline's Elisabeth Goodman explained at the conference, effective support for global research projects in a decentralised company like GSK is essential for future success. Early last year, GSK rearranged its r&d activities into six centres of excellence for drug discovery (CEDDs), which act as small business units within the larger r&d organisation. The thinking behind this was to capture advantages of scale, while stimulating the company's scientists. However, this leaves the key issue of sharing expertise within and across the different projects to be considered. 'For teams to work effectively,' said Goodman, 'they have to share what they know.'

effective communication

GSK identified the key issues facing this sharing of expertise by working with the company's collaborative computing group and r&d project leaders and managers. They determined that the three main areas that must be addressed to get effective sharing of expertise are:

• People - effective people-to-people interactions are vitally important;

• Content - effective records of know-how, know-what and know-who;

• Technology - effective use of technology to facilitate people-to-people and people-to-information interactions.

These three areas overlap, and essential to all is the need to have well structured learning interventions. GSK defines learning interventions as a structured approach for examining successes and learnings with team members and stakeholders that will enable teams to build on these learnings during future phases or activities. 'We get the team to reflect and crystallise what they know,' explained Goodman. 'And then they can work out what to do next. We also carry out a "health check", where someone external comes in and gives a diagnosis and suggestions of what to do.'

knowledge management

Goodman added that there are several ways in which cross-team learning can be built. Teams, communities and experts should all be involved in a structured learning intervention process, in order to generate new knowledge and re-use existing knowledge. Shared 'areas' should also be used to facilitate management and access to documented knowledge. And, of course, facilitating people-people interactions, whether by using other methods of communication or collaborative technology, is essential.

Pfizer's chief learning officer, Victor Newman, set out three imperatives for knowledge management:

• How to manage what we already know. 'A lot is rubbish,' he said, 'and a lot may be "commodity" knowledge.'

• How to learn faster than the competition and integrate it into a company's processes. Where does information go, and how can it be mobilised so that it works to a company's advantage?

• How to create new forms of knowledge that will deliver new market value: in other words, what knowledge will we require for the new types of decision we are going to make in the future?

However, he is clear that knowledge management should not be regarded as a strategy in its own right, even though a lot of literature may suggest that it is. 'All projects are knowledge management projects,' he claimed. 'Project management is about the deliberate, iterative use of knowledge techniques to deliver specific outcomes. You must explore what has been learned and might have to be learned, and build it into your process.'

He proposed that it is essential to say 'no' to knowledge management projects if they are not being done as value projects. 'You must visualise the value we want to create, identify the knowledge needed to deliver it, leverage that knowledge to win, and then do it again.'

The potential returns from the better handling and sharing of data are huge. Novartis's Peitsch explained that if a company's target productivity is 25 early selected compounds a year, leading to three new chemical entities (NCEs), then every five years, 125 early selected compounds will lead to 15 NCEs. 'Considering that the implementation of knowledge management has the potential to increase research productivity by 5%, then one might expect to get one additional NCE on the market from the same number of early stage compounds in every five-year period. Considering NCEs with peak sales of US$600m p.a., we can easily state that knowledge management can contribute an additional US$600m (€600m) p.a. peak sales NCE in every five-year period.'
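Peitsch's arithmetic can be laid out as a short calculation. This is only a sketch using the figures he quotes; the variable and function names are our own, and rounding the result up to 'one additional NCE' follows his wording, since a 5% gain on 15 NCEs works out at 0.75.

```python
def extra_nces(nces_per_year, years, productivity_gain):
    """Additional NCEs expected over the period from a fractional
    productivity gain, before any rounding."""
    return nces_per_year * years * productivity_gain


# Figures quoted in the article.
early_compounds_per_year = 25
nces_per_year = 3
years = 5
productivity_gain = 0.05      # the 5% uplift Peitsch assumes
peak_sales_per_nce_musd = 600  # US$m per year at peak

baseline_compounds = early_compounds_per_year * years  # 125
baseline_nces = nces_per_year * years                   # 15
additional = extra_nces(nces_per_year, years, productivity_gain)

print(f"Baseline over {years} years: {baseline_compounds} compounds, {baseline_nces} NCEs")
print(f"Additional NCEs from a {productivity_gain:.0%} gain: {additional:.2f}")
print(f"Rounded to {round(additional)} NCE, worth about "
      f"US${round(additional) * peak_sales_per_nce_musd}m p.a. at peak")
```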

This level of added revenue is a fairly unambitious target, as the potential for identifying better lead compounds through better knowledge management is vast. It clearly proves that there is much more to increasing productivity than merely cutting costs - investing in knowledge management strategies has the potential to pay back its costs, and more.

As Bayer's director of corporate planning, knowledge and information planning, Wolfgang Simon, said: 'Knowledge management will enhance performance, reduce cycle times, beat innovation or create deeper customer relationships. It should be linked to improving the on-going business.' Whether it is in the data-hungry field of data mining, or the more prosaic task of encouraging people to interact, knowledge management has the potential to speed up research and make it more effective. And getting more drugs to the market, and faster, will have a real impact on the company's profits.
