The technology, codenamed mineLink and developed at the company's research centre in Almaden, uses heuristic techniques to identify data fields that contain related information even though they may be labelled differently. For instance, a field labelled 'Surname' in one database may be labelled 'Last Name' in another, which can cause problems when integrating the data. While that example is fairly simplistic, matching fields often requires complex analysis of their contents, especially if businesses want to drill further into the collected data.
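To make the idea of matching fields by their contents concrete, here is a minimal sketch in Python. It is not IBM's method, whose details are not described here. It assumes a simple stand-in heuristic (Jaccard overlap of the values stored in each field), and the names `jaccard`, `match_fields`, the `threshold` parameter and the sample tables are all hypothetical, chosen only for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of values (0.0 to 1.0)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def match_fields(table_a, table_b, threshold=0.5):
    """Pair each field in table_a with its most similar field in table_b.

    Tables are dicts mapping a field label to a list of values.
    Labels are ignored entirely; only the stored values are compared.
    """
    matches = {}
    for label_a, values_a in table_a.items():
        best_label, best_score = None, 0.0
        for label_b, values_b in table_b.items():
            score = jaccard(values_a, values_b)
            if score > best_score:
                best_label, best_score = label_b, score
        # Keep the pairing only if the overlap is strong enough.
        if best_score >= threshold:
            matches[label_a] = best_label
    return matches


# Hypothetical sample data: the same kinds of values under different labels.
customers = {
    "Surname": ["Smith", "Jones", "Nguyen", "Patel"],
    "City": ["Sydney", "Perth", "Hobart", "Darwin"],
}
accounts = {
    "Last Name": ["Jones", "Smith", "Patel", "Brown"],
    "Town": ["Perth", "Sydney", "Darwin", "Cairns"],
}

print(match_fields(customers, accounts))
# {'Surname': 'Last Name', 'City': 'Town'}
```

A plain value-overlap measure like this only works when the two databases share many identical values. Real-world matching would also have to cope with differing formats, misspellings and fields whose contents merely resemble each other, which is where the more complex content analysis comes in.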
A prototype of mineLink for use in the life sciences field was demonstrated by IBM researchers as long ago as 2002. That project used the existing DiscoveryLink analytic technologies in DB2, but added data mining features in order to provide a unified view of complex information.
Although Big Blue hasn't been vocal in promoting the technology, plans for integration into its flagship DB2 database are already well advanced.
"It should be in the DB2 product in the next year or two," said Steve Cousins, senior manager for the user experience research group at Almaden.
That timetable would likely see mineLink elements incorporated in the release that succeeds Stinger. Stinger, the next incarnation of DB2, is currently in beta and expected to be released before the end of the year.
The enterprise database field is now a three-horse race between Oracle, IBM and Microsoft, which together account for three-quarters of relational database revenue worldwide. With 39.8 percent of the global market, Oracle was ahead of IBM (31.3 percent), according to 2003 IDC figures. Open source products such as MySQL don't figure in those totals, since many users download them at no cost.
Angus Kidman travelled to Almaden as a guest of IBM.










