There's gold in them thar databases

By David Braue
07 August 2003 01:10 PM

Getting data mining to the users

The increasing delineation of data mining functionality has had another beneficial effect: it's made development kits easier to bundle into discrete functional units. Vendors have also been improving the accessibility of data mining solutions: the latest toolkit revisions, many of them Java-based, allow enterprise developers to easily integrate each platform's data mining features into in-house applications.

This flexibility is a major improvement over the esoteric and complex interfaces of previous systems, a change that should ease development of analytics portals serving the needs of specific user communities within the business. Rather than existing as complex applications used by a few technical analysts, analytics integrated into a general-purpose portal can put powerful analysis tools in the hands of far more employees than ever.

Just remember: although employees will no doubt benefit from better information, they can also drown if they're getting too much of it. "The interface is easy, but we have to decide the right level of exposure to give to users," says Colin Shearer, vice president of analytics with SPSS. "They don't want to see detailed statistics about the predictive accuracy of the model; they want things they can immediately interpret within their own sphere of knowledge."

Making this happen, of course, requires close collaboration between technical and business units so that the developed applications reflect the idiosyncrasies of each company's business. It's also important to develop consistent business rules so that all is not left to chance: rules, enforceable through systems such as CA CleverPath's Business Rules Engine, ensure consistent results and provide auditability if there are ever questions about the mechanisms by which numbers are derived.

"The way in which data mining technology is being incarnated is changing quite dramatically," says Oracle's Slee. "Instead of being a specialist activity performed by a small number of users on a subset of data, mining can now be a mainstream activity performed by all users in the context of mainstream applications."

Cleaning up the warehouse
Despite its benefits, there is considerable risk in the process of implementing data mining. That risk lies not so much in the solutions themselves as in the fact that properly utilising the technology is an all-or-nothing proposition. Without all of your data in the right place--and the right order--even the most intelligent data mining algorithm is going to throw up furphies that cloud the insights it might otherwise provide.

If data entry problems mean one of your customers' surnames is spelt in different ways, for example, any analysis of your customer data is going to treat that customer as several different people, each with different buying habits. Desmond McGillevray might love to buy Pringles potato chips at your Hurstville store, but Desmond MacGillevray could buy lots of toothpaste in Kogarah while Desmond McGillevry also likes to buy Doritos in Rockdale.

Of course, you're unlikely to be writing down customer names for each grocery purchase, but that's just a practical issue. The point remains: feed this fragmented data into a data mining system, and it's going to tell you something different from what you would learn by building a long-term profile of Desmond McGillevray's overall buying habits. Multiply this sort of problem across hundreds of thousands of customers, and it's easy to see why many companies have treated data mining more as a goal--to be reached after careful due diligence and data amelioration--than as a single project in itself.

The ability to tie these purchases together is reason enough to implement a loyalty program, where each customer has a unique identifier that reduces the risk of data entry problems and that coalesces later customer support and marketing around a single historical record.

In many cases, data consistency problems are compounded when data mining tools are applied to data culled from several enterprise systems--for example, sales, loyalty program, customer care and marketing databases. Unless enough work on consistency has been done beforehand, it's likely that each of those databases will represent many customers in different ways within the data warehouse.

Feed this data into a data mining system, and you've got the makings of an informational disaster--which can become even more problematic if you mistakenly act on bad data that you believe is accurate. The solution to this problem lies in careful data checking (automated tools can help with this process) and a concerted effort to improve data entry procedures so problem data gets fixed and stays that way.
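
As a rough illustration of the kind of automated checking involved, the following Python sketch groups likely spelling variants of a customer name using the standard library's difflib.SequenceMatcher. The function names, the 0.85 similarity threshold and the extra customer "Jane Citizen" are assumptions made for illustration only; a real data-cleansing tool would use more robust matching and leave doubtful cases to human review.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_name_variants(names, threshold=0.85):
    """Greedily cluster names that closely resemble a cluster's first member.

    The threshold is an assumed value, not a recommended setting.
    """
    clusters = []
    for name in names:
        for cluster in clusters:
            if similarity(name, cluster[0]) >= threshold:
                cluster.append(name)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([name])
    return clusters


# The three spellings from the example above, plus one hypothetical customer.
names = [
    "Desmond McGillevray",
    "Desmond MacGillevray",
    "Jane Citizen",
    "Desmond McGillevry",
]

clusters = group_name_variants(names)
for cluster in clusters:
    print(cluster)
```

Here the three Desmond spellings land in one group and the unrelated name in another, so their purchases could be reviewed as a single candidate profile. The greedy comparison against each cluster's first member keeps the sketch short, but it would be too slow and too crude for hundreds of thousands of customers.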

By bringing complex analytics to large communities of users, today's data mining platforms have broken down many of the barriers that previously prevented their adoption. Given the demonstrated benefits of data mining to organisations that have already pursued it, playing the ease-of-use card has created a clear and compelling business case that, with just a little imagination, can deliver far more relevant, data-enabled applications than ever.

Data intelligence strengthens OneSteel
OneSteel, the recently divested steel manufacturing division of BHP Billiton, manages extraordinarily complex supply chains emerging from the co-ordination of raw materials, region-wide logistics, process manufacturing and marketing in a fiercely competitive global market.

Given the wealth of information its nearly 600 knowledge workers need to process, business analytics have long been an everyday part of life at the $3 billion OneSteel, which has most of Cognos' analytics applications in production in one way or another. Data mining is among the latest of these tools, riding a growing crest of recognition that data is good for far more than simply filling up transactional databases.

That recognition has come as IT and business staff work together to cull the most interesting details from the mountains of data generated every day. Every day or so, automated tools pull fresh data about manufacturing, sales and other parts of the business from OneSteel's JD Edwards, BPCS and other systems. This data is then loaded into a number of dedicated data marts running on Microsoft SQL Server (soon to be replaced when OneSteel completes its migration, currently under way, to SAP on Oracle), and made available to employees using a variety of analytical tools.

This approach ensures that data is always current, but getting to this point has taken a significant effort in ensuring data consistency, says Will Rigby-Jones, manager of OneSteel's knowledge systems, whose role involves finding new ways to utilise data analysis to help managers improve the business.

"To use data mining in the way it was intended requires a need from the business, and requires quality data captured in the right way, then delivered and presented in the right way," he says. "There's nothing more catastrophic than having one report that says the same thing [and another] that says something different."

Making the jump from reports running to hundreds of 132-column fanfold pages--which were for years the only way for managers to get business information--to onscreen analytics has required careful attention to users' needs, Rigby-Jones explains. Managers, of course, quickly warm to the ability to prepare, in just seconds, reports that might previously have taken days to compile.

Given the size of the business, however, data mining can also swamp them with data, perpetuating an age-old problem. To avoid this issue, the OneSteel team has expended considerable effort on building interfaces that give each group of users easy access to the information most important to them.

Stoplight-style indicators, built using Cognos Metrics Manager and fed with data from proactive data mining and OLAP analysis, allow managers to easily spot which metrics need attention, then drill down into more detail as necessary. Furthermore, growing utilisation of Web-based interfaces tells Rigby-Jones that increasingly senior managers are recognising the value of the data mining environment.
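
The stoplight idea can be sketched in a few lines of Python. This is not how Cognos Metrics Manager works internally; it is a hypothetical illustration in which the function name, the 10 percent amber band and the sample metrics are all assumptions.

```python
def stoplight(actual: float, target: float, amber_band: float = 0.10) -> str:
    """Classify a higher-is-better metric as green, amber or red.

    green: the target is met or beaten
    amber: within `amber_band` (a fraction of the target) below the target
    red:   further below the target than that
    """
    if actual >= target:
        return "green"
    if actual >= target * (1 - amber_band):
        return "amber"
    return "red"


# Hypothetical metrics as (actual, target) pairs; none are OneSteel figures.
metrics = {
    "on-time deliveries (%)": (97.0, 95.0),
    "orders booked": (940.0, 1000.0),
    "plant utilisation (%)": (70.0, 90.0),
}

status = {name: stoplight(actual, target) for name, (actual, target) in metrics.items()}
for name, colour in status.items():
    print(f"{name}: {colour}")
```

A manager scanning such a list only needs to drill down into the amber and red entries, which is the point of the stoplight display.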

The ability to spot important multi-factor trends and analyse business data to the nth degree, in near real-time, has sped up workers' ability to use information. At a broader scale, it's also revolutionised management philosophy by putting a visible face on the theory that even several small business changes can compound into significant business improvement. With the right tools now providing a way to weigh up the relative merits of such changes, OneSteel knows more about its business than ever.
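
A small worked example shows how several modest changes can compound. The Python sketch below uses entirely made-up figures in arbitrary units, and treats profit as revenue minus three cost buckets, which is a deliberate simplification. The function names, the baseline numbers and the percentage changes (say, a slightly higher booking rate, less overtime and cheaper freight) are assumptions for illustration, not OneSteel data.

```python
def profit(revenue, overtime, freight, other_costs):
    """Profit as revenue minus three cost buckets (a deliberate simplification)."""
    return revenue - overtime - freight - other_costs


# Hypothetical baseline figures in arbitrary units.
BASE = {"revenue": 1000.0, "overtime": 50.0, "freight": 100.0, "other_costs": 800.0}


def with_changes(base, revenue_lift=0.0, overtime_cut=0.0, freight_cut=0.0):
    """Recompute profit after applying fractional changes to the baseline."""
    return profit(
        base["revenue"] * (1 + revenue_lift),
        base["overtime"] * (1 - overtime_cut),
        base["freight"] * (1 - freight_cut),
        base["other_costs"],
    )


baseline = with_changes(BASE)
freight_only = with_changes(BASE, freight_cut=0.05)
combined = with_changes(BASE, revenue_lift=0.02, overtime_cut=0.10, freight_cut=0.05)

print("baseline profit:", round(baseline, 2))
print("freight saving only:", round(freight_only, 2))
print("all three changes:", round(combined, 2))
```

With these invented figures, the freight saving alone lifts profit from 50 to about 55, while the three changes together lift it to about 80. No single change is dramatic, but the combination is.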

"In the past, making money was all about getting the biggest price at the lowest cost," Rigby-Jones says. "But it's actually a whole lot of things. We might find that if you increase the booking rate by x percent, decrease the amount of overtime, save x percent on freight--if you add all these together it might be enough to affect the money in a large way. But if you try to do one of them, you're doomed to failure. It's all about relationships: business is the puppeteer and we provide the strings to pull."

Executive summary: dig for gold, toss the pyrite
Real-time data mining significantly improves both the user community's access to the technology and the role that data mining can play within everyday business processes. Here are a few tips to make the most of your data:

  • Think hardware. Data mining is extremely computing-intensive, particularly if it's being done continuously in the background as in a real-time environment. Clustered servers will provide the scalability you need to go real-time without affecting overall performance.
  • Clean up your data. Business analytics are impossible to use effectively if your data isn't clean and consistent, yet many companies still haven't resolved chronic discrepancies between the data held in different types of databases. Before going into data mining, get your data under control and figure out how to keep it that way. This often involves people training as much as proper systems.
  • Think outside the cube. OLAP may be great for crunching numbers, but it's inherently limiting because it excludes many types of information that may well be relevant. Use OLAP data marts for users needing to run consistent, regular reports--but when it comes to spotting new trends, point your data mining tools at your full data set.
  • Customers respect knowledgeable staff. But they'll go running if your staff don't have the right data to resolve their issues quickly. Real-time analytics can be an important tool in improving customer care by putting the right information at your customer service representatives' fingertips. That way, they can make informed decisions when it's important to--not later on, after they're off the phone.
  • It's not what's in the data that counts. It's how you use it. Just implementing data mining is only one part of the challenge; the real value of that analysis lies in the ability to turn that information into real business decisions. Make sure managers are trained to think laterally by using data mining to find new and interesting patterns in company databases.
  • Text isn't the same as data. Comments from customers may be hastily typed notes from call centre operators, but they're extremely useful in determining customers' opinions. Yet while numbers are usually contained in well-structured databases, textual information is rarely so ordered--so it's hard to analyse using conventional data mining tools. If you want to pull out trends from textual information, consider companion text mining tools that complement conventional data mining.
  • Think of data mining as a feature. It's usually been made possible by standalone products in the past, but enterprise application vendors are increasingly building it into their databases and applications. This may be a particularly effective approach as it allows data mining environments to leverage the strength of the underlying database or application--and to follow those applications' growth curve by using capabilities such as built-in clustering support.
  • Share data intelligently. It's one thing to use data mining to spot new patterns in your data, but it becomes even more effective when you can feed relevant portions of that data to your suppliers. Noticed customers tend to buy loads of Coke when it's discounted along with Doritos? Make sure your systems can automatically tell the distributors to send you extra volumes so you can keep up with expected demand.
  • Don't overbuy BI. Anecdotal evidence suggests many companies buy large numbers of licenses for analytical tools, then end up using just a few of them as power users warm to the tools and other users reject them. Start low, gauge user demand and increase your licenses from there. A Web client may be an easy way to do this without headaches, since it can be easily offered to new users as demand dictates.
  • The interface is everything. Analysts love data mining since it lets them explore complex data sets. Most users hate it because it lets them explore complex data sets. Since accessibility is the key to data mining success, either get analysts to work closely with business teams, or integrate the mining into other, more user-friendly applications so users are always viewing results in a context that's meaningful to them.
