CSIRO work tackles enterprise data mountains

CSIRO's new terabyte science project is aimed at helping science and business cope with ever-growing masses of data.

The world is generating "vast amounts of data, but people don't know how to extract information from [it]", Dr John Taylor, leader of the terabyte science project at CSIRO (Commonwealth Scientific and Industrial Research Organisation), told ZDNet Australia. The organisation is now working on a project that will develop completely new mathematical approaches and processes to deal with such data mountains, he added.

Taylor said businesses will benefit from CSIRO's work on processing large sets of data by gaining new abilities to analyse information produced by data-intensive processes, such as those that surround the use of RFID tags. With the track-and-trace tags set to be embedded in "just about everything", Taylor said, more businesses will find themselves in need of such analysis tools as they track the chips and detect patterns in their movement.

Sequencing of the human genome could also be aided by CSIRO's data analysis work: since automatic genome sequencing machines were developed, data has been "flooding in", according to Taylor.

Taylor said that until now, the world has been happy working on small data sets. If successful, he added, CSIRO's work will cause a "step change in the way we do science" and could lead to the "potential for huge new science discoveries".

Taylor's team has already been considering new algorithms for the Square Kilometre Array, an international project to develop a next-generation radio telescope capable of exploring the origins of the universe. The telescope will produce "terabytes an hour of data", according to Taylor.

He said that the data analysis methods currently in use will not work on projects such as the Square Kilometre Array: although they work for small data sets, they cannot be scaled up for larger ones.

The current methods "won't be able to compute the answers in a reasonable amount of time", said Taylor, since the "computational cost of an algorithm rises as a square of the data points".
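To give a rough sense of what quadratic scaling means in practice, here is a minimal Python sketch. It assumes, purely for illustration, a method that must compare every data point with every other one; the article does not name the specific algorithms Taylor has in mind, and the function names below are hypothetical.

```python
def pairwise_operations(n_points: int) -> int:
    """Count the distinct pairs among n_points items: n * (n - 1) / 2.

    Assumption: the method examines every pair of data points once,
    which is one common source of quadratic cost.
    """
    return n_points * (n_points - 1) // 2


def growth_factor(n_small: int, n_large: int) -> float:
    """How many times more pairwise work n_large points need than n_small."""
    return pairwise_operations(n_large) / pairwise_operations(n_small)


if __name__ == "__main__":
    # Each tenfold increase in data means roughly a hundredfold increase in work.
    for n in (1_000, 10_000, 100_000):
        print(f"{n:>7} points -> {pairwise_operations(n):>13,} comparisons")
```

Under this assumption, ten times the data costs about a hundred times the computation, which is why a method that is comfortable on a small data set can become impractical on a very large one.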

To be able to deal with larger data sets, Taylor said it is necessary to consciously acknowledge the problem of scale, and find new mathematical methods to deal with it.

By working on projects such as the Square Kilometre Array, he hopes to build up a generic community of knowledge of algorithms for large data sets. He also hopes that a significant portion of what his team develops for individual projects will be applicable to a wide range of problems across science and business.
