If someone sends you an XML file containing tabular data, you don't have to wade through all the text and angle-bracketed tags. You can load the document directly into Excel, tell Excel how to display it, and work with the data using maps.
In the last few years, XML (Extensible Markup Language) has become a common format for exchanging information, and it is not unusual for people and organizations to send XML files to each other. The simple structures underlying XML make exchanging information extremely simple, regardless of whether all parties are using the same software and browsers. However, until recently, although general XML utilities have become widespread, bridging the gap between XML documents and the user interface was still difficult. Microsoft Excel makes this task easier, at least for data in a table grid.
This trick uses features that are available only in Excel 2003 for Windows and later. Earlier versions of Excel do not support them, and neither do current or planned versions of Excel for Macintosh.
Let's start with the simple XML document shown in Listing 8.1.
<?xml version="1.0" encoding="UTF-8"?>
<sales>
  <sale>
    <date>2003-10-05</date>
    <ISBN>0596005385</ISBN>
    <Title>Office 2003 XML Essentials</Title>
    <PriceUS>34.95</PriceUS>
    <quantity>200</quantity>
    <customer ID="1025">Zork's Books</customer>
  </sale>
  <!-- further sale elements omitted -->
</sales>
Listing 8.1. A simple XML document for parsing in Excel
This document can be opened directly in Excel using the File → Open command. A dialog box will open (Fig. 8.1).
If you select the As an XML list radio button, you will see a warning that Excel will create its own schema for this document, which does not have a schema (Figure 8.2).
When you click OK, you will see how Excel chose to present the document's information as a spreadsheet (Fig. 8.3). Note that Excel recognizes the date format used in the date element, so dates imported as 2003-10-05 appear as 10/5/2003.
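To make the idea of turning such a document into a grid more concrete, here is a minimal Python sketch. It is not Excel's actual algorithm, only an approximation: each sale element becomes one row, child elements and attributes become columns, and the ISO date is reformatted in US style. The sample record is modeled on Listing 8.1; the names flatten_sales and us_date, and the customer_ID column naming, are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET
from datetime import date

# One record modeled on Listing 8.1.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<sales>
  <sale>
    <date>2003-10-05</date>
    <ISBN>0596005385</ISBN>
    <Title>Office 2003 XML Essentials</Title>
    <PriceUS>34.95</PriceUS>
    <quantity>200</quantity>
    <customer ID="1025">Zork's Books</customer>
  </sale>
</sales>
"""


def flatten_sales(xml_bytes):
    """Return one dict per sale element: child tags and attributes become columns."""
    root = ET.fromstring(xml_bytes)
    rows = []
    for sale in root.findall("sale"):
        row = {}
        for child in sale:
            row[child.tag] = (child.text or "").strip()
            # Attributes get their own column, e.g. customer_ID (naming is hypothetical).
            for name, value in child.attrib.items():
                row[f"{child.tag}_{name}"] = value
        rows.append(row)
    return rows


def us_date(iso_text):
    """Reformat an ISO 8601 date (YYYY-MM-DD) as month/day/year without zero padding."""
    d = date.fromisoformat(iso_text)
    return f"{d.month}/{d.day}/{d.year}"


rows = flatten_sales(SAMPLE.encode("utf-8"))
for row in rows:
    print(us_date(row["date"]), row["Title"], row["customer_ID"])
```

Everything is read as text in this sketch, which is why the ISBN keeps its leading zero here; a program that guesses types from the content may treat such a value as a number.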
Now that your document is loaded into Excel, you can process the data just like you would any other data in Excel - inserting it into formulas, creating named ranges, building charts based on the content, etc. To help you, Excel has several built-in data analysis capabilities.
Drop-down lists in the column headers allow you to choose how the data is sorted (by default, the data is displayed in the order in which it was recorded in the source document). You can also display a Total row: either use the List toolbar, or right-click anywhere in the list and select List → Total Row from the context menu. When the total row appears, you can choose the type of summary from its drop-down menu (Fig. 8.4).
Fig. 8.4. Selecting totals for an XML list in Excel
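The choices in the total row amount to simple aggregate functions applied to one column. A rough Python sketch of a few of them follows. The quantity values are made-up illustration data, and the labels in the TOTALS dictionary are this sketch's own, not the exact entries of Excel's menu.

```python
from statistics import mean

# Hypothetical quantity column; these values are illustration data only.
quantities = [200, 90, 300, 10]

# Aggregates that a summary row might offer (labels chosen for this sketch).
TOTALS = {
    "sum": sum,
    "average": mean,
    "count": len,
    "max": max,
    "min": min,
}


def total_row(values, kind):
    """Apply the chosen aggregate to a column of numbers."""
    return TOTALS[kind](values)


for kind in TOTALS:
    print(kind, total_row(quantities, kind))
```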
Data can be updated by adding information from an XML document with the same structure to the area being updated. If you have another document with this structure, you can right-click the list, select XML → Import from the context menu, and select the second document. Additionally, after editing, the data can be exported back to an XML file by right-clicking the list and selecting XML → Export from the context menu. This turns Excel into a very convenient tool for editing simple XML documents with a tabular structure.
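The import and export round trip described above can be imitated in a few lines of Python. This is only a sketch under stated assumptions: the two documents are cut-down, hypothetical samples with the same structure, and append_records is a name invented for this example, not an Excel feature.

```python
import xml.etree.ElementTree as ET

# Cut-down first document (structure follows the sales example).
FIRST = b"""<sales>
  <sale><date>2003-10-05</date><quantity>200</quantity></sale>
</sales>"""

# Hypothetical second document with the same structure.
SECOND = b"""<sales>
  <sale><date>2003-10-07</date><quantity>90</quantity></sale>
</sales>"""


def append_records(target_root, other_root, record_tag="sale"):
    """Append every record of other_root to target_root.

    Documents with a different root element are rejected, as a crude
    stand-in for "has the same structure".
    """
    if other_root.tag != target_root.tag:
        raise ValueError("documents do not share the same root element")
    for record in other_root.findall(record_tag):
        target_root.append(record)
    return target_root


merged = append_records(ET.fromstring(FIRST), ET.fromstring(SECOND))

# "Export": serialize the combined data back to XML text.
exported = ET.tostring(merged, encoding="unicode")
print(exported)
```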
If the data is simple enough, you can often trust Excel to choose how to present the contents of the file and use the default settings provided. If the data gets more complex, especially if it contains dates or text that looks like numbers, then you may want to use XML schemas to tell Excel how to read the data and what data will fit in a given map. For our document, the XML schema might look like Listing 8.2.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="sales">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="sale"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="sale">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="date"/>
        <xs:element ref="ISBN"/>
        <xs:element ref="Title"/>
        <xs:element ref="PriceUS"/>
        <xs:element ref="quantity"/>
        <xs:element ref="customer"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="date" type="xs:date"/>
  <xs:element name="ISBN" type="xs:string"/>
  <xs:element name="Title" type="xs:string"/>
  <xs:element name="PriceUS" type="xs:decimal"/>
  <xs:element name="quantity" type="xs:integer"/>
  <xs:element name="customer">
    <xs:complexType mixed="true">
      <xs:attribute name="ID" use="required" type="xs:integer"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
Listing 8.2. Schema for book sales data
Note that the date element is defined as a date, and the ISBN element is defined as a string, not an integer. If you start by opening this schema rather than the document, Excel will load the document with the leading zero in the ISBN preserved.
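Why the declared type matters can be seen in a small Python sketch. The converter table below mirrors the types named in the schema (date, string, decimal, integer); the table itself and the sample values are assumptions of this sketch, not part of Excel.

```python
from datetime import date
from decimal import Decimal

# Converters chosen to mirror the schema's declared types (an assumption of this sketch).
CONVERTERS = {
    "date": date.fromisoformat,  # xs:date
    "ISBN": str,                 # xs:string
    "Title": str,                # xs:string
    "PriceUS": Decimal,          # xs:decimal
    "quantity": int,             # xs:integer
}

# Raw text values as they appear in the sample sale record.
raw = {
    "date": "2003-10-05",
    "ISBN": "0596005385",
    "Title": "Office 2003 XML Essentials",
    "PriceUS": "34.95",
    "quantity": "200",
}

typed = {name: CONVERTERS[name](text) for name, text in raw.items()}

print(typed["ISBN"])     # kept as text, leading zero intact
print(int(raw["ISBN"]))  # read as a number, the leading zero is lost
```

Treating the ISBN as a string is the whole point: it is an identifier, not a quantity, so no arithmetic is ever done on it.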
This time, you'll create the list before loading the XML document, starting with a blank worksheet. You will need to open the XML Source task pane. If the task pane is not already open, press Ctrl+F1. Then, from the drop-down list at the top of the task pane, select XML Source and you will see something similar to Fig. 8.6.
To load the schema, click the XML Maps button. The XML Maps dialog box will open (Figure 8.7).
Click the Add button and select the schema file (Figure 8.8). If the schema does not limit documents to one starting element, a dialog box appears asking you to select a root element. Since the documents in this example begin with the sales element, select "sales".
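The question about the root element arises because every element declared at the top level of a schema may legally start a document. The Python sketch below lists those candidates for a cut-down version of the sales schema. The helper names are invented for this example, and the heuristic of picking the one element that nothing else references is this sketch's own idea, not something Excel is known to do.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Cut-down version of the sales schema, enough to show the idea.
SCHEMA = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="sales">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="sale"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="sale">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="date" type="xs:date"/>
</xs:schema>"""


def global_elements(schema_bytes):
    """Names of elements declared directly under xs:schema (possible root elements)."""
    root = ET.fromstring(schema_bytes)
    return [el.get("name") for el in root.findall(XS + "element")]


def unreferenced(schema_bytes):
    """Global elements that no other declaration points to via ref.

    Assumes refs are plain names without a namespace prefix, as in this schema.
    """
    root = ET.fromstring(schema_bytes)
    referenced = {el.get("ref") for el in root.iter(XS + "element") if el.get("ref")}
    return [name for name in global_elements(schema_bytes) if name not in referenced]


print(global_elements(SCHEMA))
print(unreferenced(SCHEMA))
```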
When you click OK, a warning will appear telling you that schemas may be difficult to interpret. XML Schema is a huge specification that supports an extremely large number of structures that do not fit the way Excel understands information, so Excel has some limitations.
In the XML Maps dialog box, Excel will indicate that the schema has been added to the spreadsheet. If you click OK, you return to the main Excel window, and a tree showing the schema structure appears in the XML Source task pane. Now that you have the structure, you can create the list. The easiest way to do this, especially with small documents like ours, is to drag the sales icon onto cell A1.
Now that you've set up a home for your data, you need to move it in. You can click the Import XML Data button on the List toolbar, or right-click the list and select XML → Import from the context menu. If you select the file that you opened earlier (Listing 8.1), you will see a result like the one in Fig. 8.3. Note that the ISBN values now keep their leading zeros, because they are text, as they should be.
You can also drag items individually if you want to rearrange them, or place different pieces of information in different places in the spreadsheet.
Excel's support for XML maps and lists means you can create spreadsheets that work with data that comes in separate files, with more flexibility than was possible with previous formats such as CSV (comma-separated values) or tab-delimited text.
Instead of connecting to a database to edit data interactively, the user can edit the XML file while on a plane and transfer it to the customer immediately upon landing. Perhaps the best thing about Excel's new XML features is their flexibility. As long as the data is organized in a structure that follows a table grid, Excel has very few rules about what kinds of XML can be sent there. With a few clicks and no programming at all, you can integrate XML data into spreadsheets.
XML Document File Format
XML was created as a format for storing text data, and an XML document can be understood not only by people but also by machines. XML is a platform-independent language designed to store various types of data. Thanks to its simplicity and ease of use, it rivals HTML in popularity and is just as common on the Internet. The fact that XML files can be edited easily in even the simplest text editors only adds to its popularity.
Technical information about XML files
An XML document is a sequence of characters, and almost any Unicode character may appear in it. The characters that make up the document are divided into markup and text content based on simple syntax rules. The format has an important advantage over HTML: XML lets you define arbitrary tags that clearly describe the data they enclose.
Additional information about the XML format
File extension | .xml |
Related programs | Microsoft Visual Studio 2013, JAPISoft EditiX, Wattle XMLwriter, MacroMates TextMate |