The dilemma revolves around macros embedded in Office 2003 documents. When a document is saved as an XML file, its macros can wind up more or less anywhere in the file. This means that scanners must examine the entire contents of a file rather than just the parts where macros normally reside.
Although a simple solution has been put forward by the AV community, the software giant has yet to address the issue.
The proposed change is fairly straightforward. The AV companies want a header placed in the file that tells the scanning engine exactly where the macros are located. To ensure that viruses don't slip through the cracks, Office applications should only run macros that the header points to.
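The proposal has two halves: the scanner reads only the regions the header declares, and the application refuses to run anything the header does not declare. A minimal sketch of that idea follows. The header layout here (a list of byte offsets and lengths), along with the names `MacroRegion`, `regions_to_scan` and `may_execute`, is purely hypothetical and is not an actual Office 2003 or Microsoft format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MacroRegion:
    """Hypothetical header entry: where one macro sits in the file."""
    offset: int
    length: int


def regions_to_scan(data: bytes, header: list[MacroRegion]) -> list[bytes]:
    """Return only the byte ranges the (hypothetical) header declares as macros."""
    chunks = []
    for region in header:
        end = region.offset + region.length
        # Reject headers that point outside the document.
        if region.offset < 0 or region.length < 0 or end > len(data):
            raise ValueError("macro region lies outside the document")
        chunks.append(data[region.offset:end])
    return chunks


def may_execute(offset: int, length: int, header: list[MacroRegion]) -> bool:
    """Application-side rule: run a macro only if the header declares it."""
    return MacroRegion(offset, length) in header
```

The scheme only holds if both sides enforce it: a scanner that trusts the header is safe only when every application opening the file applies the same `may_execute`-style check.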
Jan Hruska, founder and joint CEO of anti-virus firm Sophos, said that while Microsoft has come a long way in terms of security over the years, the XML issue isn't making life easy.
"Traditionally, when Microsoft had a choice between functionality and security, it has gone for functionality every time," he told ZDNet Australia.
So whilst a more open format such as XML can be very useful, it does not make the AV companies' job any easier, Hruska said.
"The looser the format, the harder it is to parse," he added.
Because the entire file needs to be scanned, the scanning agent will require more resources and, in the case of mail gateway filtering, may even become susceptible to denial-of-service attacks if bombarded with a great number of (large) XML files.
Computer Associates manager of virus research, Jakub Kaminski, agreed with Hruska. Although he didn't want to "get into the politics of it all", he said the technical challenges the issue presents to the AV industry could be huge.
Kaminski also pointed out that once the format has been released, all future Office products will support it, so AV software will have to support it as well.
"Microsoft is certainly willing to co-operate with the antivirus industry," Kaminski said, but noted that "there's a huge argument going on right now...people you talk to...have knowledge but don't have the authority".
Kaminski said the problem stems from the header of the file not containing enough information about macros.
"You can identify by a couple of hundred bytes that it's a Word document...however, the problem is to identify that the document contains macros," he said.

I wouldn't feel comfortable using AV software that skips part of a file simply because MS Office won't execute it. What if your office suite of choice is an MS clone that executes all macros, regardless of whether or not the header says they're there? Even if no program will run the code, it will still be on your computer. This idea is dangerous corner-cutting.
How long can it take to scan a file anyway? I mean, a typical hard disk can churn through a gigabyte in under a minute (at least mine can). If scanning an XML file is so processor-intensive, then scan it once, store its MD5 hash in a tamper-resistant file, and only scan it again if it has been changed.
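The scan-once idea can be sketched roughly as follows. This is only an illustration: `file_digest` and `scan_if_changed` are made-up names, the `scan` argument stands in for whatever real scanning engine is used, and the cache is a plain in-memory dict rather than the tamper-resistant file the scheme would need. SHA-256 is swapped in for MD5 because MD5 collisions are practical to construct.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def scan_if_changed(path: Path, cache: dict, scan) -> bool:
    """Run scan(path) only if the file differs from its last clean scan.

    cache maps a path string to the digest recorded when the file last
    scanned clean. scan is a stand-in callable returning True for clean.
    """
    key = str(path)
    digest = file_digest(path)
    if cache.get(key) == digest:
        return True  # unchanged since the last clean scan, skip the engine
    clean = scan(path)
    if clean:
        cache[key] = digest
    else:
        cache.pop(key, None)  # never remember a file that failed its scan
    return clean
```

One caveat: a file that scanned clean under old virus signatures may not be clean under new ones, so the cache would have to be discarded whenever the signatures are updated.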