Microsoft Word Formats

Status
Not open for further replies.

drawoh

Mechanical
Oct 1, 2002
8,958
I am now running an up-to-date version of Microsoft Office here. I have just created a zip file with drawings and a requisition form. The form is in Microsoft Word format, and I deliberately saved it in the old 97/2000 .doc format. When I tried to add this to my zip archive, WinZip simply loaded the file and showed me the XML data.

I am pretty certain that XML is part of the new version of Word. The old format was something binary. I am concerned that the person on the other end of this may not be able to read the latest version of Word.

Has anyone else seen anything like this?

JHG
 

XML? You are correct, I also thought XML belongs to the new format and not the old one. Are you sure it is XML and not Rich Text Format? RTF would make more sense.

On the other hand, maybe BECAUSE you have asked to downgrade, MS-Word may be using XML as an intermediate format.

While XML is NOT the native format of old Word files, it is very likely that old Word does import XML.
 
gsal,

The new OpenDocument and Word formats are both zip archives containing XML files. The files appear to be XML. I did not examine them closely. I don't see why they would be RTF. There is no need to put an RTF file into an archive.
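One way to settle what a saved file really is, regardless of its extension, is to look at its first few bytes. The following is only a rough sketch: the function name `sniff_word_format` and the short labels it returns are made up for illustration, and it assumes a typical .docx keeps its main part at `word/document.xml` and that an OpenDocument archive carries a `mimetype` entry.

```python
import io
import zipfile

# Leading bytes ("magic numbers") of the containers discussed in this thread.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # binary compound file used by Word 97-2003 .doc
ZIP_MAGIC = b"PK\x03\x04"                        # zip archive (.docx, OpenDocument, plain .zip)
RTF_MAGIC = b"{\\rtf"                            # Rich Text Format


def sniff_word_format(data: bytes) -> str:
    """Guess the container format from the raw bytes of a file (hypothetical helper)."""
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    if data.startswith(RTF_MAGIC):
        return "rtf"
    if data.startswith(ZIP_MAGIC):
        # A zip archive: look inside to see which family it belongs to.
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            names = zf.namelist()
        if "word/document.xml" in names:
            return "docx"   # assumption: the usual main-part location
        if "mimetype" in names:
            return "odf"    # assumption: OpenDocument-style package
        return "zip"
    if data.lstrip().startswith(b"<?xml"):
        return "xml"        # bare XML, not wrapped in an archive
    return "unknown"
```

For a file on disk, something like `sniff_word_format(open("requisition.doc", "rb").read())` would do (the file name is just a placeholder). A result of "ole2" would mean the old binary format was written after all, whatever the archive tool chose to display.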

JHG
 
RTF was the old format, which is why Wordpad could open them.

TTFN
faq731-376
 
XML is a newer file format that is eventually aimed at replacing HTML across the internet. Newer versions of Office/Word write their data in .xdoc, etc. so that these files include formatting for use on the world wide web. Older versions of Office/Word like 97/2000 versions did not write files with .xml encoding. I don't believe that came about until release of Office 2003 or later. Those earlier versions had file formats like .doc, .xls, etc. Notice the lack of "x" in the file format nomenclature for those older versions.
 
tz101, that info is misleading. It reflects some common confusion about the history and purpose of XML. I'm no expert, but I'd hate for that misinformation to persist.

FWIW, XML is not aimed at replacing HTML because they're not substitutes for each other. You can write your HTML in XML-compliant format, and it becomes XHTML (very roughly speaking). It's XHTML that is supposed to supersede HTML, not XML. Lots of people got confused by this distinction and assumed XML was a web language. It's not; it's something far more general than that. Think of it more like a container format like a database than a language format like HTML. It can contain HTML, but it can't replace HTML.

Secondly, newer Office files have .docx extension, not .xdoc.

Thirdly, the files don't necessarily "include formatting for use on the world wide web". They just use the XML format to encapsulate the file data - the file data is still Word data or Excel data or whatever. Sure, it makes it easier to parse the document using web services, but that's by-the-by. Whatever is parsing the file still needs to understand the actual data content of the file, which could be anything. Just like someone reading a database file could identify each of the fields but not necessarily the content of those fields, someone reading an XML file could identify the elements but wouldn't necessarily know what to do with the content of each.
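To illustrate that last point, here is a small sketch using Python's standard XML parser. The sample document, its element names, and the helper `list_elements` are all invented for the example; they are not anything Word actually writes.

```python
import xml.etree.ElementTree as ET

# Made-up XML: the parser has no idea what a "requisition" or a "qty" means.
SAMPLE = """<requisition>
  <item partNumber="A-100"><qty>4</qty></item>
  <item partNumber="B-200"><qty>1</qty></item>
</requisition>"""


def list_elements(xml_text):
    """Return (tag, attributes, text) for every element, in document order."""
    root = ET.fromstring(xml_text)
    return [(el.tag, dict(el.attrib), (el.text or "").strip()) for el in root.iter()]


for tag, attrs, text in list_elements(SAMPLE):
    print(tag, attrs, text)
```

The parser can enumerate every element and attribute without any knowledge of the application. Deciding that `qty` is a quantity to order, or that a Word paragraph element should be laid out a certain way, is still up to whatever program consumes the data.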
 
LiteYear,

Thanks.

I have tested that. Word makes an absolute shambles of HTML, hand coded with NOTEPAD. Word is designed to put text on paper, rigorously formatted. HTML is designed to place text on all sorts of things, definitely including computer screens.

XML is a replacement for SGML (Standard Generalized Markup Language). HTML is an SGML document type. XHTML is an XML document type. Word's text is another XML document type.

JHG
 
Unless IT's security precautions mess things up of course.

Posting guidelines faq731-376 (probably not aimed specifically at you)
What is Engineering anyway: faq1088-1484
 