Conversion to XML format. Creating an XML Data File and an XML Schema File from Worksheet Data

Quite often, many users of modern computer systems and software products of various types encounter files with the .xml extension. Many people simply have no idea what kind of document this is or how to open it. Now it will be considered. At the same time, we will find out what it is and what it is needed for.

What is an XML file

Let's start, perhaps, with the fact that, from the point of view of modern computer technologies and programs used to create documents of this type, it is a text file in which the commands of the universal extensible markup language are written, which is quite reminiscent of the well-known markup tool HTML.

Typically, an XML file contains general information about an object, which is expressed descriptively (more on this later). As for the data stored in such containers, it can be databases often used for video and audio catalogs on the Internet, saved user preferences for programs and applications, as well as entire web pages.

As an example, you can take, say, an audio album of some artist. The XML file includes information about the year of release, genre, number and names of tracks, popularity, etc. However, when visiting resources on the World Wide Web, the surfer does not need to think about physically opening such an information file, since even when playing a song online in a player All data will be displayed similar to those contained in standard MP3 files in the form of ID3 tags. The information is loaded onto the track being played.

XML file type

If you look at the file, you will immediately notice that in it the properties of any object are described using tags and attributes that are set manually.

We are not talking about the basic commands of the language itself, since the average user does not need such information. The only thing that can be noted is that there is no specific number of elements used to describe an object in this format: how many are needed is specified.

How to open standard XML format

Now let's see how to open an XML file. As many have probably already understood, this is a text file, which means that the easiest way to view or edit it is to use any editor, even the most primitive one. Yes, at least the same “Notepad” from the standard Windows set.

However, everything is not so simple here. The fact is that double-clicking a file without setting the appropriate association with any program will not open. At best, the system will offer a list of the most suitable applications. You can select a program of your choice, and at the same time check the box next to the option to constantly use the selected application for all files of this type.

You can do it differently by right-clicking on the file and then selecting the “Open with...” command, after which, again, select the desired application either from the list, or specify the location of the main executable component (most often this is an EXE file).

The third way to open an XML file is to initially launch the program and then use the file open menu (in most cases this is Ctrl + O). In this case, it is absolutely not necessary to use Notepad. Please, the file opens without any problems in the same Word application and similar ones. Even Microsoft Excel is capable of opening data in this format.

However, if there is a need to edit the XML format, then it is better to use professional utilities that support language syntax, for example, Oxygen XML Editor, XML Marker or EditiX Lite Version. Naturally, these are not all utilities that can work with the file language at the highest level. Today you can find a lot of such programs.

Now a few words about why sometimes an XML file error appears when opening. Most often this is due to a violation of the integrity of the file itself, as well as incorrect introduction of descriptive attributes or tags. Additionally, Excel has a limit on how many rows can be displayed, so in this case the data may not be complete when opened.

Possible errors when opening an XML file as an email attachment

Sometimes errors may appear when you try to open a file that is an email attachment. Most often this applies to standard email clients like Outlook Express.

The fact is that the attachment is first saved as temporary data (very often with an additional .tmp added to the main extension), and it is this that is accessed.

To avoid this situation, you simply need to initially save the attachment in its original format to any convenient location on disk or removable media, and then use the standard methods described above.

Instead of a total

As you can see, there is nothing difficult in understanding the structure and methods of opening files of this format. Here, however, the issue of creating XML data was not fundamentally considered, since to fully understand the process you need to know at least the basics of the language itself. Otherwise, I think that users will not have any difficulties with files of this format.


I have been working with XML files for a long time, but only now I asked myself: How can I create and edit XML files in MS Excel using standard tools? Excel is a convenient and intuitive tool for creating, editing and processing various data. Excel is studied in school and, probably, there is not a person who knows how to work on a PC and does not know the basics of working in Excel. Therefore, in my opinion, this is the most suitable tool that can be recommended to ordinary users for creating and editing data files for various applications, including Web applications. I have MS Excel 2013 installed at home and using its example I will describe the simple procedures that need to be performed when creating an XML data file in Excel. I think this will also work in earlier versions of Excel. To create an XML data file, let's take a few simple steps. Let's look at them using the example of creating an event data file.

1. First, you need to create and fill out a data table in Excel in accordance with the data structure. Please use the menu item for this Insert/Table.

2. Let's create an XML file schema. To do this, enter the following lines in a text editor and save them in a file.

XHTML

0 0 0 0 Bright Resurrection of Christ. Easter 0 0 -7 0 -7 Entry of the Lord into Jerusalem 1

0

0

0

0

Bright Resurrection of Christ. Easter

0

0

-7

0

-7

Entry of the Lord into Jerusalem

1

3. Now open the tab Developer, press the button Source and specify the XML source file. If necessary, enable the menu item Developer in Excel Settings.

4. It remains to compare the elements of the diagram and the columns of the table by dragging the elements to the corresponding column headings.

5. And the last step: export the table data to an XML file using the context menu. Stand on any table cell, right-click, select XML/Export… and save the XML file with the desired name.

Save the Excel file for later editing and addition. Subsequently, paragraphs 1.-4. there will be no need to do this!

P.S. Why did I write this banal note? Firstly, so as not to forget, and secondly, I think that this information will be useful to users, starting with the next version 0.6 ;-)

I would like to talk in one place about a very typical task when working with XML, namely, about the creation XML based on the existing one. This operation will consist of several stages, during which additional tasks will be solved, such as:

  • Creation XML based XSD, and vice versa.
  • Serialization and deserialization of objects.
  • Creating a class for future serialization from XML or XSD.
Before starting, I plan that the reader is familiar with the words XML And XSD. Concerning XML, I think there are no problems with this, but as for XSD I may have questions about what it is. In short:
XSD is a language for describing XML. And if you look at the contents of the file containing this description, the question will arise, why describe the seemingly obvious tags of your XML? But imagine that you are integrating with some external system, and you want to use the format for data exchange XML. So, XSD allows you to describe for each element its name, possible attributes, mandatory fields or attributes, and make it clear that the contents of a tag can contain only one internal element ( ) and no more, or a node may have a sequence of elements (< a> ... ). When the external system is yours, then there may be no problem, but when it is a third party, then this description through XSD, will be a tool for resolving disputes with the format, and also, using utilities, you can check the compliance of any XML original format.
So, what are we going to do to learn how to stamp XML based on your standard? I will describe everything in order.

Available:
Some kind of XML (xmlfile.xml), saved in text format to a file. Below is the text:

< catalog > < book id = " bk101 " > < author >Gambardella, Matthew < title >XML Developer's Guide < genre >Computer < price > 44.95 < publish_date > 2000-10-01 < description >An in-depth look at creating applications with XML. < book id = " bk102 " > < author >Ralls, Kim < title >Midnight Rain < genre >Fantasy < price > 5.95 < publish_date > 2000-12-16 < description >A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.

Steps to create a new one:

  1. First using the utility XSD.exe, or using online services (search for "xml to xsd" one of them www.freeformatter.com/xsd-generator.html), we will create from the existing XML its primitive XSD description. At the output we will have a completely, automatically created output.xsd a file that is already usable but may not be complete.
  2. Next, based on output.xsd , classes will be created in C# xmlclass.cs. After which, we add it to our project. You can create a class using XSD.exe or xsd2code utility, or any online service.
  3. Now we are ready to use the file classes xmlclass.cs. Therefore, let’s create and fill instances with data xmlclass, and then, using the serialization process, we will convert objects into XML line, and then save it to a new file. Thus we get XML based on what is available.
Let's get started!

Step 1: Generate XSD from XML.

So, here are the methods that I use myself:
  1. Utility xsd.exe- creator Microsoft.
  2. Utility xsd2code.exe- a third-party application that can do everything the same as xsd.exe, but is available for download directly (currently it has become paid from Trial period).
  3. Use of any online services.
In order to receive XSD.exe, you will have to put any Microsoft SDKs, which comes with this application. Unfortunately, at the moment, none of the studios include this utility in their kit.

Link to download and install one of SDK:
Windows SDK for Windows 7 and .NET Framework 4.

Let's use the search and find the file xsd.exe, after which (to make our work easier), copy it to the folder created in the root of the disk C:\xmltoxsd. Next, we copy our xmlfile.xml and everything is ready to start.


Open the “Command Prompt” (Start -> “Run” -> “cmd”), go to our directory and call it with default parameters xsd.exe, passing our xmlfile.xml as a parameter:


And we see that a file has appeared next to it xmlfile.xsd with content.


That's all! These actions are enough to create a primitive XSD file to work with. But it is worth understanding that xsd.exe, like any other utility, does not know anything about the types of your fields (so the field type will almost always be string), as well as about all variations of attributes and parameters. That is, if some tag does not have an attribute in your XML, then it won’t be in the description. That's why XML should be the most complete and if the element may contain several child fields, it is better to add at least two, so that xsd.exe I realized that this is a collection.

Syntax XSD not very complicated, so if you are writing a fairly serious description, you may have to work with a file.

Step 2: Create a Class from XSD.

To create a class in C#, we will need to use again xsd.exe but with a different parameter /classes, passing the path to our xmlfile.xsd.


After execution, the xmlfile.cs file will appear, the contents of which are shown below in compressed form:


As you can see, for the inner element catalog a class was created catalogBook, whose fields correspond to the fields in XML, and the field ID marked as an attribute. Xsd.exe I correctly understood what was inside catalog a collection of books is stored, so the property appeared Items, to populate the child collection.
Comment: xsd.exe cannot add comments to the resulting classes and fields. In description XSD there is an element called< xs:annotation > , which most often contains< xs:documentation >, inside which is a description of the field. In theory, it should be placed in the /// description, but at the moment xsd.exe refuses to add text to the description, so you have to resort to alternatives such as xsd2code or online services.

Step 3: Serialize objects to XML.

The resulting classes are in the file xmlfile.cs. Next, add its contents or the file itself to the project Visual Studio, after which you can work with it. I will create demo book objects and then save them to a file in the format XML.

Private void Example() ( // Create the first book var book1 = new catalogBook () ( author = "King" , description = "Very interesting book" , genre = "Fantasy" , price = 22.ToString(), id = "42011" , title = "It" }; !} // Create a second book var book2 = new catalogBook () ( author = "O"Brien, Tim" , description = "Microsoft's .NET initiative is explored in detail in this deep programmer"s reference.", genre = "Computer" , price = 36.ToString(), id = "30012" , title = "Microsoft .NET: The Programming Bible" }; // Create a directory root element containing the two books above var catalog = new catalog() (Items = new(book1, book2)); // Contains the XML of the catalog object var xmlCatalog = Serialize(catalog); // Write a string to a file // TODO Made for demonstration purposes. It is advisable to call the Serialize method to transfer the Stream to the file File .WriteAllText("Output.xml" , xmlCatalog); ) private string Serialize (TType sourceObject) ( if (sourceObject == null ) ( return string .Empty; ) // Use XmlSerializer to convert it to an XML string var xmlserializer = new XmlSerializer(typeof(TType)); var stringWriter = new StringWriter(); using (var writer = XmlWriter .Create(stringWriter, new XmlWriterSettings () ( Indent = true ))) ( xmlserializer.Serialize(writer, sourceObject); return stringWriter.ToString(); ) ) After running and opening Output.xml").

Microsoft Excel is a convenient tool for organizing and structuring a wide variety of data. It allows you to process information using different methods and edit data sets.

Let's consider the possibilities of using it to generate and process web application files. Using a specific example, we will study the basics of working with XML in Excel.

How to create an XML file from Excel

XML is a file standard for transmitting data on the Web. Excel supports its export and import.

Let's look at creating an XML file using the example of a production calendar.

  1. Let's make a table from which you need to create an XML file in Excel and fill it with data.
  2. Let's create and insert an XML map with the required document structure.
  3. Export table data to XML format.

We save the file as XML.

Other ways to get XML data (schema):

  1. Download from a database, specialized business application. Schemes can be provided by commercial sites and services. Simple options are publicly available.
  2. Use ready-made samples to test XML maps. The samples contain the main elements and XML structure. Copy and paste into Notepad and save with the desired extension.


How to save an Excel file in XML format

One of the options:

  1. Click the Office button. Select “Save as” - “Other formats”.
  2. We assign a name. Select the save location and file type – XML.

More options:

  1. Download XLC to XML converter. Or find a service that allows you to export the file online.
  2. Download the XML Tools Add-in from the official Microsoft website. It is freely available.
  3. Opening a new book. Office button – “Open”.

How to open an XML file in Excel

Click OK. You can work with the resulting table as with any Excel file.

How to Convert XML File to Excel

We edit the created table and save it in Excel format.

How to collect data from XML files in Excel

The principle of collecting information from multiple XML files is the same as the principle of transformation. When we import data into Excel, the XML map is transferred at the same time. Other data can be transferred to the same schema.

Each new file will be linked to an existing map. Each element in the table structure corresponds to an element in the map. Only one data binding is allowed.

To configure linking options, open the Map Properties tool from the Developer menu.


Possibilities:

  1. Each new file will be checked by Excel for compliance with the installed card (if we check the box next to this item).
  2. Data may be updated. Or new information will be added to the existing table (makes sense if you need to collect data from similar files).

These are all manual ways to import and export files.

1. Introduction

If any of you have ever tried to learn XML on your own, you may have encountered many of the same confusing concepts that I encountered in the past. DTD, XML Schema, namespaces, XPath, XPointers, XSL, XSLT, DOM, SAX, SOAP, I give up. I will only add that most of these materials are based on implementations, the code of which may contain errors. There are probably millions of ways to implement and use XML, but they can all be quite complex. And you know, XML can be very simple. If we forget about DTD, XML Schemas, namespaces, etc.
In an effort to quickly teach you how to work with XML, I will, if possible, ignore a decent amount of information that you can already read in the relevant literature. And the first thing I'm going to ignore is namespaces and schemas. This may seem strange to you, since most books start by explaining these concepts, but try to think of XML as a means to solve a specific problem, like a hammer. Do you really need to know how to build a house to use a hammer? What if all I need is to just hammer in a nail to hang a picture on? It's the same with XML, it can be very complex, generic enough to be used in hundreds if not thousands of applications, and very simple if you don't pay attention to a few things. In this article, I will focus on solving specific problems using XML.
So what exactly is the problem? Let's assume that I want to describe a simple object, such as a glass, using XML. Why am I going to use XML for this? Well, first of all, that's exactly what XML is designed for. XML describes the data. In my example, the glass is the data. In real life, data can be Word documents, spreadsheets, images, a book, a database record, or even C++ or Visual Basic classes. Secondly, XML is extensible. XML allows me to create as many features as necessary to describe the data, and these features will be whatever I want. And finally, because XML is quickly becoming a standard. If there is life on Mars, then you can be sure that they will be able to understand my XML file there.

What basic properties allow us to describe a glass?

What would the same thing look like in XML format?

glass 6 16 ice cube straw water yes

Notice that the first line of the file () has a special look for now, just remember that it should be here. The beauty of the XML format is that anyone can understand what it says just by looking at it closely. It is also clear that this is not the only possible XML description of the glass. If I ask 10 people to develop an XML description of a glass with the same properties, they will probably all create different but correct descriptions. This is where the problem lies. Perhaps not for us humans, but when a computer reads an XML file, it would be a great idea to let it know what the file is about. This is where namespaces and schemes come into play. Simply put, schemas are used to define an adequate structure for an XML file.
Now it's time to talk about a few simple XML rules that you need to follow:

XML Rule #1: A valid XML file must exactly match its schema. But for ease of understanding of the material, none of my examples will use diagrams. So, strictly speaking, none of my examples are "adequate". But honestly, I don't care. I'm not going to build a house, I just need to hang a picture. I'll talk more about this later when we discuss the XML Document Object Model.

XML Rule #2 Note: If you program in VB, remember: XML is case sensitive. XML is case sensitive. XML is case sensitive. XML is case sensitive. Write this sentence 1000 times and you will never forget.

XML Rule #3: Tags are usually called elements and each opening tag must have a corresponding closing tag. Following this rule, you will get a correct XML file. This is very important because until the XML file is formatted correctly, it will not be parsed and loaded into the Document Object Model. Note that if the element does not contain values ​​and does not contain other (nested) elements, the closing tag may look like instead of a more bulky design . You can see this approach in the previous example ( ).

XML Rule #4: Elements can contain attributes, and attribute values ​​must be enclosed in quotation marks (single or double).

XML Rule #5: You can use attribute names multiple times, but element names must be unique throughout the file. In the previous example, the qty attribute had a different meaning depending on which element it was used in ,, or . The meaning of an attribute depends on the context of its use. Whereas the value of an element always means the same thing, regardless of where in the file the attribute is located. In the previous example, the element And always has the same meaning throughout our document. for example, always used to describe the height of a glass.

XML Rule #6: There are several special characters in XML that cannot be used directly because they are reserved in the XML syntax. Therefore, to use such characters, you will have to use a reserved construction starting with the & character and a special code, (the & character must be written as &) (the " character must be written as ") (the character< должен писаться как <) (символ >must be written as >) and (the " symbol must be written as "). Instead, you can also use the instruction, where "...." can be any sequence of characters except "]]>". This construction can appear anywhere, but it cannot be nested.

2. XML Document Object Model

The XML Document Object Model allows programmers to load the contents of an XML file into memory. Once an XML file is loaded in this way, it can be manipulated using the properties, methods, and events of the Document Object Model. This is where XML comes in handy. The Document Object Model greatly facilitates the retrieval and processing of XML file information. I will not talk here about all the capabilities of the document object model; I will only talk about some of the main features that will help achieve the goal of this article. I'll take the XML file I just created describing the glass, load it into the Document Object Model, and do a few things with it. I will save the remaining features and capabilities of the document object model for the next article, which talks about client XML. Note that although the document object model is very good and convenient for developers, it requires a fairly significant amount of system resources. Therefore, there is another method for parsing XML files, known as SAX. My article does not claim to be an exhaustive source of information on this issue, so it would be useful to also use the XML SDK.

Let's look at an example using Microsoft's XML parser version 3.0 (msxml3.dll) to understand how it all works. If you do not have an analyzer, the latest version can be downloaded from the Microsoft website.
Let's say I saved an example of a cup description in XML format in the file "http://web_server/xml/cup.xml" (local path C:\inetpub\wwwroot\xml\cup.xml) and now I want to load it into the document object model . The following code assumes that the analyzer is already loaded and running.

Visual Basic 6.0 code:(establish connection with Microsoft XML, v3.0) Dim xmlDoc as MSXML2.DOMDocument30 Set xmlDoc = New DOMDocument30 xmlDoc.async = False xmlDoc.validateOnParse = False xmlDoc.load ("c:\inetpub\wwwroot\xml\cup.xml" ) msgBox xmlDoc.xml ASP Server-Side code in Visual Basic: Dim xmlDoc Set xmlDoc = Server.CreateObject("Msxml2.DOMDocument.3.0") xmlDoc.async = False xmlDoc.validateOnParse = False xmlDoc.load "/xml/cup.xml" ASP Server-Side code in Java Script: var xmlDoc = Server.CreateObject("Msxml2.DOMDocument.3.0"); xmlDoc.async = false; xmlDoc.validateOnParse = false; xmlDoc.load("/xml/cup.xml");

Explanation of the above code - let's go through the VB6 code

Line 1: Dim xmlDoc as MSXML2.DOMDocument30

In this first line we define a reference to "Microsoft XML, v3.0". In this line I defined the xmlDoc variable as a reference to the XML document. MSXML2 is a library (use that name, don't try to write MSXML3, it won't work). DOMDocument30 defines an XML document object corresponding to version 3.0. You may also see this code: dim xmlDoc as MSXML2.DOMDocument. This construction is usually used when you do not want to specify a specific version of the XML document. In this case, the default analyzer registered in the system will be used. The only problem may be that the version of the analyzer registered by default may differ on different computers. If you want to be sure that the code you write will work with any version of the analyzer, then do not use constructs that are specific to specific versions of the analyzer. Because there is no guarantee that the user who will use your code will have the same version of the analyzer installed for which you wrote your code. Another advantage of developing code that is version independent of the analyzer is that when a newer version of the analyzer is released, it will be backward compatible with previous versions, and you will not have to recompile your code.

Line 2: Set xmlDoc = new DOMDocument30

This line initializes the xmlDoc variable as a new instance of the version 3.0 XML document object.

Line 3: xmlDoc.async = False

XML files can be loaded either synchronously or asynchronously. If xmlDoc.async = False, it means that the contents of the XML file will be loaded, and only after that control will be transferred to the calling process. If xmlDoc.async = True, it means that control will be transferred to the calling process immediately, without waiting until the contents of the XML file are completely loaded.

Line 4: xmlDoc.validateOnParse = False

This code indicates that the parser should not validate the XML file being loaded against its schema (validateOnParse = False). In order to enable schema compliance checking, you need to write validateOnParse = True.

Line 5: xmlDoc.load("C:\inetpub\wwwroot\xml\cup.xml")

This line calls a method to load the specified XML file. There are two kinds of loading method. The first one, which is written on line 5, loads the file into the document object model, and it is necessary to pass the full path to the XML file. The second loading option involves passing an xml string as a parameter. This type of loading could be called, for example, like this: xmlDoc.loadXML("valid xml string"). I'll show you how to use this method later.

Line 6: MsgBox xmlDoc.xml

This line displays the contents of the downloaded XML file. As a result, we should receive the original XML file that we created earlier.

2.2. Exploring the XML Document Object Model

Create a new project in Visual Basic and name it standard.exe. Paste the above code into your project's main window load method. Make sure you declare the reference to "Microsoft XML v3.0". To do this, click Project-->References, then scroll down the list that appears and find the desired link in it. Please note that the analyzer version 3.0 must be installed on your computer, otherwise it will not be listed. Set breakpoints on the last line of code (msgbox xmlDoc.xml). Run the application in debug mode. When the execution process reaches a stopping point, call the "Locals" window and look at the document object model. You can learn a lot by viewing what is displayed in this window. The "Locals" window should look similar to the image below. Here are some interesting properties of the Document Object Model.

The XML document object model always contains two top-level nodes:

  • Item1 is the root of the document elements branch (ignore it)
  • Item2 is actually the first element of the document (remember this)

nodeName or baseName - can be used when searching for the name of an element or attribute.
nodeType - use to get the type of the current node.
nodeValue - used to find out the value of node data.
childNodes is a collection of child nodes. They can be element nodes, text nodes, and CDATA nodes. There may be other types of nodes, which I won’t talk about now, but you can learn all about them in the XML SDK.
attributes is a collection of attribute nodes for the current element.
length - used to determine the number of nodes in the tree directly belonging to the current one.
xml - this property is present in all nodes and can be used to represent the current position in the document. The XML string starts at the current node and goes down to the end of the tree. This is a very useful feature. Experiment with it and see what happens.

2.2.2. Element nodes

An element node can contain child nodes of elements, attributes, text, or CDATA. From the figure below you can see the following information about the "SOLID" node:

nodeType - Current node type = NODE_ELEMENT - i.e. the current node is an element.
nodeName or baseName or tagName - Name of the current node (element) = SOLID.
Its parent element CONTENTS has 4 children.
This can be seen in the following figure, but SOLID has one child which is a text data type.
text - "ice cube" is a shortcut method that allows you to get the value of the current node without moving to a child text node.

2.2.3. Attribute Nodes

Attribute nodes can only consist of text or CDATA child nodes. The following figure shows what information can be obtained about node "qty":

nodeType - Type of the current node = NODE_ATTRIBUTE - the current node is an attribute.
nodeName or baseName - Name of the current node (Attributes) = qty

From the following figure it is also clear that qty has one child, which has a text data type.
text or value - "2" is a shortcut method that allows you to get the value of the current node without moving to a child text node.

2.2.4. Text and CDATA nodes

Text or CDATA nodes have no children. Text nodes contain the processed text data of their parent node. CDATA contains the raw text data of its parent node. CDATA nodes are created when data in an XML file is framed in a special way. The CDATA label tells the parser not to parse the data and to accept the characters inside the label as data. The CDATA section is especially useful when you need to insert code inside an XML file. The following figure shows what information can be obtained from the current text node:

nodeType - Type of the current node = NODE_TEXT - the current node contains text data.
nodeName - Name of the current node (text) = #text - all text nodes are called #text
data or text or value - "2" is the current node data.

2.2.5. Errors when loading a document

The parseError section of the document object model can be useful in identifying problems that arise when loading an XML document. If I remove the closing tag from OTHER in our example file and try to run the program again, I will get the following result. The first piece of useful information is that our nextSibling now contains the value Nothing. Now, if you look at childNodes, you can see that the length field is now 0. Both of these signs indicate that our XML document was not loaded. To figure out why, I open the parseError node and get all the error information.

So I've shown you how to load an XML file into the Document Object Model, but what do you do with it there? One of the main features that you can use is to perform various queries on an XML document. To do this, you can of course look through the entire document until you find the information you are looking for. But the most preferred way is to use one of two methods of the DOMDocument class. The two methods used to find the nodes in our previous example could look like xmlDoc.SelectSingleNode(patternString) to get the node we're looking for, or xmlDoc.SelectNodes(patternString) to get a list of nodes we're looking for. The patternString parameter is precisely the request. It can be formed in one of two ways. Either as an XSL request or as an XPath request. A newer and preferred way to query an XML document is XPath. The patternString format must be set in advance, before the first call to either of the two data request methods, otherwise the XSL method of forming requests will be used by default. To set the formation type of patternString, use setProperty("SelectionLanguage", "format"). To change the queries in our example to use the XPath way, I'll add the following command: setProperty("SelectionLanguage","XPath"). In my opinion, XPath is the most important XML technology to learn. I'll give you some simple XPath queries. A good place to start learning this technology is the Microsoft XML SDK. Another way to explain this would be to write a simple Visual Basic application that allows you to enter queries and output the result. You may find some free apps that do the same thing, but XPath is fairly new and may not be fully supported by those apps.

2.3.1. Using XPATH to Query the Document Object Model

Let's add some code to the end of our previous example to return the contents of our order book:

GREAT! Let's now add another query that will allow us to determine whether the glass has a lid or not. Add the following code to the end of the previous one:

Set objNode = xmlDoc.selectSingleNode("/CUP/LID") if objNode.text="yes" then MsgBox "We have a lid" else MsgBox "No lid on this cup" end if

Let's go through the code line by line:

Line 1: Dim objNode As IXMLDOMNode

This line defines the objNode variable of type XML document node. It is important to understand that an XML document node is also an object. It doesn't matter. It consists of itself, as well as its attributes and children (childNodes). This way you can cut off unnecessary tree branches, selecting only the ones you need.

Line 2: Dim objListOfNodes As IXMLDOMNodeList

This line defines the objListOfNodes variable, which has the type of list of XML document nodes (group of nodes).

Line 3: xmlDoc.setProperty "SelectionLanguage", "XPath"

This line sets how the patternString is formed as an XPath.

Line 4: MsgBox "Your cup contains the following items:"

Line 5: Set objListOfNodes = xmlDoc.selectNodes("//CONTENTS/*[@qty>0]")

This line performs an XPath query that will return a group of nodes and store them in the objListOfNodes variable. The request is divided into the following parts:

  • //CONTENTS - take all CONTENTS elements in the XML document. Note: // is a shorthand for the entire content of the XML document.
  • /* - from the list of CONTENTS elements, take all (* - used to indicate all) descendant elements. This reduces the result to four element nodes ( ). These four nodes fall directly under the CONTENTS node.
  • [@qty>0] - check each child element to ensure that its qty attribute (@ - means attribute) is greater than 0. If this condition is not met, the node is discarded. Everything inside an XPath request can take the values ​​True or False. If the result is True, then the node is saved. If the result is False, then the node is discarded. After this, our result is reduced to three nodes (

Line 6-8: For Each objNode In objListOfNodes / MsgBox objNode.Text / Next

These rows display the values ​​of each element node that match the query. ("ice cube" , "straw" , "water").

Line 9: Set objNode = xmlDoc.selectSingleNode("/CUP/LID")

This line returns all LID elements that belong to the CUP element, which in turn is derived from the root of the tree (when the query begins with /, this means that you need to start from the root). This is very similar to the path to a file or folder. In our example, this query will return an LID element that contains the value "yes". The important thing here is that I told the request to start at the root element of the XML document. Queries do not always start at the root elements; they usually start at the current node. In our example, this does not matter, since the current node (xmlDoc) is the root element of the XML document (but this is not true in all cases).

Line 10-15: if objNode.text="yes" then / MsgBox "We have a lid" /
else / MsgBox "No lid on this cup" /end if

This line displays the message "We have a lid" because the element's LID text property is "yes".

3. Convert ADO to XML

Now that you understand the basics of XML, let's create an ActiveX control that will convert an ADO dataset into XML format. The goal is to get the book titles from the Titles table of the Pubs database and return them in XML format. I will use the result in my next article. You can say ADO has its own methods to save the result in XML format, right? Yes, but if I trust ADO to do this, I will end up with an XML file in such a terrible format that it will be impossible to work with. ADO will create the XML file using the namespace, and I don't need that at all right now. Secondly, ADO will create an XML file, which will be represented in the form of attributes. In other words, every record will become an element and every field an attribute:

But I would like to receive an XML file in the form of elements, where each entry would be contained in a tag , and each field would be an element inside the tag . My XML string syntax would be:

data from table data from table data from table data from table data from table datafrom table datafromtable

By the way, what I just did was create a schema for my XML string. Now, if I need to check the structure of an XML document against a schema, all I have to do is convert the schema into the correct format. That is, in DTD or XDR syntax. Notice that I have added some attributes to each element . One reason for this is that this information can be used by the client. Prettyname can be used as data labels. The datatype attribute could be used for client-side data validation. But to be fair, the real reason these attributes exist is because they have a special purpose in the XSL file template that I often use to build the where clause of SQL queries. Maybe I'll publish an article demonstrating this approach soon. The template is actually very useful. When the XML structure is applied to the data from the Titles table, the result will look like this:

The Busy Executive's Database Guide BU1032 The Busy Executive's Database Guide business 19.99 4095 An overview of available database systems with emphasis on common business applications. Illustrated. 6/12/1991 Cooking with Computers: Surreptitious Balance Sheets BU1111 Cooking with Computers: Surreptitious Balance Sheets business 11.95 3876 Helpful hints on how to use your electronic resources to the best advantage. 6/9/1991

Now I have something to work with!

Listing 1 - CUP.XML

glass 6 16 ice cube straw water yes

Dim xmlDoc As MSXML2.DOMDocument30 Set xmlDoc = New DOMDocument30 xmlDoc.async = False xmlDoc.validateOnParse = False xmlDoc.Load ("c:\inetpub\wwwroot\xml\cup.xml") MsgBox xmlDoc.xml Dim objNode As IXMLDOMNode Dim objListOfNodes As IXMLDOMNodeList xmlDoc.setProperty "SelectionLanguage", "XPath" MsgBox "Your cup contains the following items" Set objListOfNodes = xmlDoc.selectNodes("//CONTENTS/*[@qty>0]") For Each objNode In objListOfNodes MsgBox objNode. Text Next Set objNode = xmlDoc.selectSingleNode("/CUP/LID") If objNode.Text = "yes" Then MsgBox "We have a lid" Else MsgBox "No lid on this cup" End If

Listing 3 - ActiveX control: ADO in XML (WebClass.dll)(xmlControl.cls)

Option Explicit "Declare Database variables Private m_dbConnection As New ADODB.Connection Private m_dbCommand As ADODB.Command Private m_adoRs As ADODB.Recordset Private m_adoErrors As ADODB.Errors Private m_adoErr As Error Public nCommandTimeOut As Variant Public nConnectionTimeOut As Variant Public strConnect As Variant Public strAppName As String Public strLogPath As String Public strDatabase As String Public strUser As String Public strPassword As String Public strServer As String Public strVersion As String Public lMSADO As Boolean "Private Global Variables Private gnErrNum As Variant Private gstrErrDesc As Variant Private gstrErrSrc As Variant Private gstrDB As String Private gstrADOError As String Private Const adLeonNoRecordset As Integer = 129 Private gtableName(6) As String Private gcolumnName(6) As String Private gprettyName(6) As String Private gdatatype(6) As String Private gfilter(6) As String Private Function OpenDatabase() If Len(strConnect) = 0 Then "set default values ​​If Len(strDatabase) = 0 Then strDatabase = "pubs" End If If nConnectionTimeOut = 0 Then nConnectionTimeOut = 600 End If If nCommandTimeOut = 0 Then nCommandTimeOut = 600 End If If Len (strAppName) = 0 Then strAppName = "xmlControl" End If If Len(strUser) = 0 Then strUser = "sa" End If If Len(strPassword) = 0 Then strPassword = "" End If strConnect = "Provider=SQLOLEDB.1 ; " & _ "Application Name=" & strAppName & _ "; Data Source=" & strServer & "; Initial Catalog="&strDatabase&"; " & _ " User ID=" & strUser & "; Password=" & strPassword & ";" End If "connect to SQL Server and open the database On Error GoTo SQLErr "Enable the error handler With m_dbConnection .ConnectionTimeout = nConnectionTimeOut .CommandTimeout = nCommandTimeOut .Open strConnect "open the database using the connection string End With On Error GoTo 0 "turn off the error handler OpenDatabase = True "database opened successfully Exit Function SQLErr: Call logerror("OPEN") OpenDatabase = False End Function Private Function BuildSQLwhere(tmpWhere) As String "This is for the future End Function Public Function GetTitlesXML (Optional xmlWhere As Variant) As String Dim whereClause As String Dim strSQL As String Call OpenDatabase "open the pubs database If IsMissing(xmlWhere) Then "when the query fails whereClause = "" Else whereClause = BuildSQLwhere(xmlWhere)" convert the query to a correct one sql End If "initialize an sql expression that will query book titles strSQL = "select title_id,title,type,price,ytd_sales,notes,pubdate from titles " & whereClause Call NewRecordSet "create a data set "set cursorlocation m_adoRs. CursorLocation = adUseClient "open the recordset m_adoRs.Open strSQL, m_dbConnection, adOpenForwardOnly, adLockReadOnly, adCmdText "disconnect from the data set Set m_adoRs.ActiveConnection = Nothing On Error GoTo 0 "turn off the error handler "close the database and release the connection Call CloseDatabase If m_adoRs. EOF Then GetTitlesXML = "" "the query did not return any values ​​Else If lMSADO Then GetTitlesXML = msado(m_adoRs) "convert the ado recordset to custom xml End If End If "close the data set Call CloseRecordset Exit Function SQLErr: Call logerror(strSQL) End Function Private Function ADOtoXML(tmprs As ADODB.Recordset, tmpMP As Boolean) As String Dim adoFields As ADODB.Fields "declare a collection for storing fields Dim adoField As ADODB.Field "is used to get each field from the collection Dim xmlDoc As msxml2.DOMDocument30 Dim tmpLine As String "stores an xml representation of each book Dim tmpXML As String "is used to concatenate xml strings Dim i As Integer If tmprs.EOF Then "query did not return any records ADOtoXML = "" Exit Function Else Set adoFields = tmprs.Fields "create a collection of fields End If tmpXML = " " "all books will be enclosed in a tag Do Until tmprs.EOF "loop through each row in data set i = 0 " I is the index of the ado field, which starts at 0 - the first field will be field(0) tmpLine = " " & tmprs("title") & vbCrLf For Each adoField In adoFields "loop through all fields "build xml tag and its attributes for the current field tmpLine = tmpLine & " " & adoField.Value tmpLine = tmpLine & "" & vbCrLf i = i + 1 "move to the next field Next tmpXML = tmpXML & tmpLine & "" & vbCrLf "closing tag after the last field tmprs.MoveNext "next header Loop Set adoField = Nothing "destroy the field object Set adoFields = Nothing "destroy the field collection object tmpXML= tmpXML & ""&vbCrLf" closing tag Set xmlDoc = New msxml2.DOMDocument30 "create xmlDOM xmlDoc.async = False "wait for the document to load xmlDoc.validateOnParse = False "do not check the schema xmlDoc.loadXML(tmpXML) "load the string into the document object model On Error Resume Next "if the file does not exist, then we process this error Kill("c:\temp\custom.xml") "erase the file if it exists On Error GoTo 0 "tell the error handler to interrupt when an error is detected xmlDoc.save ("c:\temp\custom. xml") "save xml to file ADOtoXML=xmlDoc.xml "returns xml string Set xmlDoc=Nothing "destroy document object model End Function Private Function msado(tmprs As ADODB.Recordset) As String Dim xmlDoc As msxml2.DOMDocument30 On Error Resume Next "if the file does not exist, we get an error Kill ("c:\temp\msado.xml") "we erase the file if it exists On Error GoTo 0 "we tell the error handler to abort when an error is detected tmprs.save "c:\temp\msado .xml", adPersistXML "save the xml to a file Set xmlDoc = New msxml2.DOMDocument30 "create the xml document object model xmlDoc.async = False "wait for the xml document to load xmlDoc.validateOnParse = False "do not check the xmlDoc.Load schema ("C: \temp\msado.xml") "load the file into the document object model msado = xmlDoc.xml "return the xml string Set xmlDoc = Nothing "destroy the document object model End Function Private SubCloseRecordset() "close the data set m_adoRs.Close Set m_adoRs =Nothing End Sub Private Sub NewRecordSet() Set m_adoRs= Nothing Set m_adoRs=New ADODB.Recordset End Sub Private Sub CloseDatabase() m_dbConnection.Close Set m_dbConnection =Nothing End Sub Private Sub logerror(errSQL As String) Dim hFile As Integer Dim expFile As String On Error GoTo 0 gnErrNum = Err.Number gstrErrDesc =Err.Description gstrErrSrc = Err.Source Set m_adoErrors = m_dbConnection.Errors For Each m_adoErr In m_adoErrors gstrADOError = m_adoErr.Description & "," & CStr(m_adoErr.NativeError) _ & ", " & CStr(m_adoErr.Number) & "," & m_adoErr.Source _ & "," & CStr(m_adoErr.SQLState) Next hFile =FreeFile If Len(strLogPath) = 0 Then strLogPath = "C:\temp\" End If expFile = strLogPath & strAppName & ".err" Open expFile For Append As #hFile Print #hFile,"************************************* *******" Print #hFile, Now() Print#hFile, "************************************** ****" Print #hFile,"Subroutine: " & tmpPro Print #hFile, "Error Number:" & gnErrNum Print#hFile, "Error Description: " & gstrErrDesc Print #hFile, "Error Source:" & gstrErrSrc Print # hFile, "Ado error String: " & gstrADOError Print #hFile, "Bad SQL: " & errSQL Close #hFile End Sub Private Sub Class_Initialize() strVersion = "xmlControl Version 1. 1" "title_id,title,type,price,ytd_sales,notes,pubdate gtableName(0) = "titles" gcolumnName(0) = "title_id" gprettyName(0) = "Title Identification Number" gdatatype(0) = "number" gfilter(0) = "" gtableName(1) = "titles" gcolumnName(1) = "title" gprettyName(1) = "Title of the Book" gdatatype(1) = "text" gfilter(1) = "" gtableName (2) = "titles" gcolumnName(2) = "type" gprettyName(2) = "Type of Book" gdatatype(2) = "text" gfilter(2) = "" gtableName(3) = "titles" gcolumnName( 3) = "price" gprettyName(3) = "Price of the Book" gdatatype(3) = "number" gfilter(3) = "" gtableName(4) = "titles" gcolumnName(4) = "ytd_sales" gprettyName( 4) = "Year to date sales" gdatatype(4) = "number" gfilter(4) = "" gtableName(5) = "titles" gcolumnName(5) = "notes" gprettyName(5) = "Notes about the book " gdatatype(5) = "memo" gfilter(5) = "" gtableName(6) = "titles" gcolumnName(6) = "pubdate" gprettyName(6) = "Date Published" gdatatype(6) = "date" gfilter (6) = "" End Sub

Listing 4 - VB test application to test WebClass

Private Sub Command1_Click() Dim objWC As xmlControl Dim xml As String Set objWC = New xmlControl objWC.strDatabase = "pubs" objWC.strServer = "ltweb" objWC.strUser = "sa" objWC.strPassword = "" objWC.lMSADO = Option2 .Value objWC.strAppName = "Article1" Text1.Text = objWC.getTitlesXML End Sub

Listing 5 - ASP for testing WebClass