We generate Microsoft Word documents in PHP

If you want all sheets and/or rows of your Excel data file to take part in document generation, click the Numbers button to the right of the Excel file data sheets or Excel file data rows field (its label will change to All).

5. Set a template for naming new Word files


Set the naming pattern for new Word files:

New word files names template is the pattern for the names of the new documents (Word files) generated by the program. Here the name pattern contains column names of the Excel file surrounded by curly braces: {A} and {B}. When creating a new document, the program replaces every {A} and {B} with the corresponding cell values from the Excel file; the result is the name of the new document (Word file).

You can set your own framing characters on the Settings programs tab.

6. Click "Generate"


Click the Generate button, and a progress indicator will appear on the screen. The program creates exactly as many documents (Word files) as there are rows of the Excel file taking part in generation.

7. Done


All documents (Word files) have been created and are located in the folder specified in Folder to save the new word files. That's all :)

Exwog: a report generator from Excel to Word using a template

Free Word file generator using a template (Word file) based on Excel file data

Works on Mac OS, Windows and Linux

Allows you to specify the names of new generated word files

Allows you to specify sheets and rows of the required data

Allows you to specify surrounding characters for Excel column names

Easy to use

Store your data in Excel format (.xls and .xlsx) and generate Word files (.doc and .docx) in a few clicks :)


How does it work?

1. Take a look at your Excel file


In this example, the Excel file contains information about clients. Each row corresponds to a specific client. First names are in column A, last names in column B and professions in column C.

2. Create a Word document (.doc or .docx)


Create a “template” (Word file) for generating new documents (Word files). Here the “template” text contains the column names of the Excel file surrounded by curly braces: {A}, {B} and {C}.

The program will generate new documents from the “template”, replacing every {A}, {B} and {C} with the corresponding cell values from the Excel file: {A} is the first name, {B} the last name, {C} the profession.

You can also set your own framing characters on the Settings programs tab.
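The substitution described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Exwog's actual code: the function name `fill_template` and the `left`/`right` parameters for the framing characters are hypothetical, and the template is treated as plain text rather than a real Word file.

```python
import re


def fill_template(text, row, left="{", right="}"):
    """Replace framed column names such as {A} with values from one row.

    `row` maps Excel column letters to cell values; `left` and `right`
    are the framing characters (hypothetical parameter names).
    """
    pattern = re.compile(re.escape(left) + r"([A-Z]+)" + re.escape(right))

    def substitute(match):
        column = match.group(1)
        # Leave the placeholder untouched when the row has no such column.
        return str(row.get(column, match.group(0)))

    return pattern.sub(substitute, text)


row = {"A": "John", "B": "Smith", "C": "engineer"}
print(fill_template("Dear {A} {B}, profession: {C}.", row))
print(fill_template("Dear [A] [B].", row, left="[", right="]"))
```

Escaping the framing characters with `re.escape` is what makes arbitrary delimiters such as `[`/`]` or `<<`/`>>` safe to use in the pattern.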

3. Select paths for files and folders


Select the paths for files and folders (buttons labeled Select). In the program you specify the following paths:

Excel file with data (*.xls, *.xlsx): the path to your Excel file with data (customer information);

Word template file (*.doc, *.docx): the path to your “template” (the Word file created in the previous step);

Folder to save the new word files: the path to the folder in which the program will save the newly generated documents.

4. Specify the sheets and rows of the required data


Specify the numbers of the sheets and rows of your Excel data file (customer information) for which you want to generate documents:

Excel file data sheets: the numbers of the sheets of your Excel file that will take part in generating new documents;

Excel file data rows: the row numbers, within the sheets specified in Excel file data sheets, that will take part in generating new documents. A separate document (Word file) is created from the data of each specified row.

Sheet and row numbering in the program starts at 1.
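The selection rule above (1-based sheet and row numbers, one document per selected row) can be illustrated with a small Python sketch. The function name `select_rows` and the in-memory list-of-lists representation are assumptions made for the example; `None` plays the role of the program's All setting.

```python
def select_rows(sheets, sheet_numbers=None, row_numbers=None):
    """Return the rows that take part in generation.

    `sheets` is a list of sheets, each a list of rows. `sheet_numbers` and
    `row_numbers` are 1-based; None stands for the "All" setting.
    """
    if sheet_numbers is None:
        sheet_numbers = range(1, len(sheets) + 1)
    selected = []
    for sheet_no in sheet_numbers:
        if not 1 <= sheet_no <= len(sheets):
            continue  # ignore sheet numbers the workbook does not have
        sheet = sheets[sheet_no - 1]  # 1-based number -> 0-based index
        numbers = row_numbers if row_numbers is not None else range(1, len(sheet) + 1)
        for row_no in numbers:
            if 1 <= row_no <= len(sheet):  # ignore rows the sheet does not have
                selected.append(sheet[row_no - 1])
    return selected


workbook = [
    [["John", "Smith"], ["Ann", "Lee"], ["Bob", "Ray"]],  # sheet 1
    [["Eve", "Fox"]],                                     # sheet 2
]
rows = select_rows(workbook, sheet_numbers=[1], row_numbers=[1, 3])
print(len(rows), "documents would be generated")
```

The number of generated documents equals the length of the returned list, which matches the rule that each selected row yields its own Word file.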

We live in a world where PHP developers have to interact with the Windows operating system from time to time. WMI (Windows Management Instrumentation) is one such example; interaction with Microsoft Office is another.

In this article, we'll look at a simple integration between Word and PHP: generating a Microsoft Word document based on input fields in an HTML form using PHP (and its Interop extension).

Preparatory steps

First of all, let's make sure that we have a basic WAMP environment configured. Since Interop is only available on Windows, our Apache server and PHP installation must be deployed on a Windows machine. For this I use EasyPHP 14.1, which is extremely easy to install and configure.

The next thing to do is install Microsoft Office. The version is not very important. I'm using Microsoft Office 2013 Pro, but any version of Office newer than 2007 should be fine.

We also need to make sure that the libraries for developing Interop applications (PIA, Primary Interop Assemblies) are installed. You can check by opening Windows Explorer and going to the \assembly directory, where we should see a set of installed assemblies:

Here you can see the Microsoft.Office.Interop.Word entry (underlined in the screenshot). This is the assembly we will use in our demo. Please pay special attention to the “Assembly name”, “Version” and “Public key token” fields. We will soon use them in our PHP script.

This directory also contains other assemblies (including the entire Office family) available for use in your programs (not only for PHP, but also for VB.net, C#, etc.).

If the list of assemblies does not include the entire Microsoft.Office.Interop package, then we need to either reinstall Office by adding PIA, or manually download the package from the Microsoft site and install it. For more detailed instructions, please refer to this MSDN page.

Note: only the PIA distribution for Microsoft Office 2010 is available for download and installation. The assemblies in this package are version 14.0.0; version 15 ships only with Office 2013.

Finally, you need to enable the php_com_dotnet.dll extension in php.ini and restart the server.

Now you can move on to programming.

HTML form

Since the bulk of this example is on the server side, we will create a simple page with a form that will look like this:

We have a text field for name, a group of radio buttons for gender, a slider for age, and a text input area for entering a message, as well as the infamous “Send” button.

Save this file as “index.html” in the virtual host directory so that it can be reached at an address like http://test/test/interop.

Server part

The server-side handler file is the main goal of our conversation. To begin with, I will provide the complete code for this file, and then I will explain it step by step.

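<?php

// Values received from the form (assuming the form is submitted via POST).
$inputs = $_POST;
// Empty placeholder: the form has no "printdate" field.
$inputs["printdate"] = "";

$assembly = "Microsoft.Office.Interop.Word, Version=15.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c";
$class = "Microsoft.Office.Interop.Word.ApplicationClass";

$w = new DOTNET($assembly, $class);
$w->visible = true;

$fn = __DIR__ . "\\template.docx";
$d = $w->Documents->Open($fn);
echo "The document is open.<br><hr>";

$flds = $d->Fields;
$count = $flds->Count;
echo "There are $count fields in the document.<br>";
echo "<ul>";

$mapping = setupfields();

foreach ($flds as $index => $f) {
    $f->Select();
    $key = $mapping[$index];
    $value = $inputs[$key];

    if ($key == "gender") {
        $value = ($value == "m") ? "Mr." : "Ms.";
    }
    if ($key == "printdate") {
        $value = date("Y-m-d H:i:s");
    }

    $w->Selection->TypeText($value);
    echo "<li>Assigning field $index: $key the value $value</li>";
}
echo "</ul>";
echo "Processing complete!<br><hr>";
echo "Printing, please wait...<br>";

$d->PrintOut();
sleep(3);
echo "Done!";

$w->Quit(false);
$w = null;

// Maps the numeric index of each Word field to a form field name.
// The indices are assumed to be consecutive, starting at 0.
function setupfields()
{
    $mapping = array();
    $mapping[0] = "gender";
    $mapping[1] = "name";
    $mapping[2] = "msg";
    $mapping[3] = "printdate";

    return $mapping;
}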

After we've filled the $inputs variable with the values received from the form and created an empty element with the printdate key (we'll discuss why later), we come to four very important lines:

$assembly = "Microsoft.Office.Interop.Word, Version=15.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c";
$class = "Microsoft.Office.Interop.Word.ApplicationClass";
$w = new DOTNET($assembly, $class);
$w->visible = true;

PHP's COM handling requires instantiating a class from within an “assembly”. In our case, we work with Word. Looking at the first screenshot, we can write down the full assembly signature for Word:

  • “Name”, “Version” and “Public Key Token” are all taken from the information shown in “c:\Windows\assembly”.
  • “Culture” is always neutral.

The class we want to reference is always named “assembly name” + “.ApplicationClass”.

By setting these two parameters we can get an object for working with Word.
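To make the composition of these two strings explicit, here is a small Python sketch that assembles them from the values read in the assembly list. The helper names `assembly_signature` and `application_class` are hypothetical; the sample values are the ones used in the article's script.

```python
def assembly_signature(name, version, public_key_token, culture="neutral"):
    """Build the full assembly signature string from its four parts."""
    return (f"{name}, Version={version}, Culture={culture}, "
            f"PublicKeyToken={public_key_token}")


def application_class(name):
    """The class to instantiate is the assembly name plus '.ApplicationClass'."""
    return name + ".ApplicationClass"


name = "Microsoft.Office.Interop.Word"
print(assembly_signature(name, "15.0.0.0", "71e9bce111e9429c"))
print(application_class(name))
```

Swapping in another assembly name, version and token from the same list should yield the corresponding strings for other Office applications, under the assumption that they follow the same naming rule.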

This object can stay in the background, or we can bring it to the foreground by setting the visible attribute to true.

The next step is to open the document that requires processing and write an instance of the “document” to the $d variable.

To create content in a document based on form data, you can take several routes.

The worst thing would be to hardcode the contents of the document in PHP and then output it to a Word document. I strongly recommend not to do this for the following reasons:

  1. You lose flexibility. Any changes to the output file will require changes to the PHP code.
  2. It breaks the separation between controller and view.
  3. Applying styles to document content (alignment, fonts, styles, etc.) in a script will greatly increase the number of lines of code. Changing styles programmatically is too cumbersome.

Another option would be to use search and replace. PHP has good built-in facilities for this. We can create a Word document in which we will place labels with special delimiters, which will later be replaced. For example, we can create a document that contains the following snippet:

and with the help of PHP we can easily replace it with the contents of the “Name” field received from the form.

It's simple and spares us all the unpleasant consequences of the first method. We just need to choose suitable delimiters; in effect we end up working with a template.

I recommend the third method, which relies on a deeper knowledge of Word. We will use fields as placeholders, and the PHP code will directly update those fields with the corresponding values.

This approach is flexible, fast, and consistent with Word best practices. It can also help you avoid full-text search in a document, which is good for performance. I note that this solution also has disadvantages.

Word has never supported named indexes for fields. Even if we specify names for the fields we create, we still have to address them by their numeric identifiers. This also explains why we need a separate function (setupfields) to map the field index to the name of the form field.
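The index-to-name mapping can be tried out without Word at all. The following Python sketch mirrors the logic of the PHP loop under stated assumptions: the field order is the one from the listing, the function name `values_in_field_order` is hypothetical, and the Word fields themselves are replaced by a plain list.

```python
from datetime import datetime


def values_in_field_order(field_order, inputs, now=None):
    """Return (index, key, value) for every field, in document order.

    `field_order` lists form field names by numeric field index;
    `inputs` holds the submitted form values.
    """
    now = now or datetime.now()
    result = []
    for index, key in enumerate(field_order):
        value = inputs.get(key, "")
        if key == "gender":
            # Same rule as the PHP script: "m" -> "Mr.", anything else -> "Ms."
            value = "Mr." if value == "m" else "Ms."
        elif key == "printdate":
            # The form has no such field; the value is generated here.
            value = now.strftime("%Y-%m-%d %H:%M:%S")
        result.append((index, key, value))
    return result


order = ["gender", "name", "msg", "printdate"]
form = {"gender": "m", "name": "John", "msg": "Hello"}
for index, key, value in values_in_field_order(order, form):
    print(f"field {index}: {key} = {value}")
```

Keeping the order in one list means that moving a field in the template only requires changing that list, not the loop.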

In this demo lesson we will use a document with 5 MERGEFIELD fields. We will place the template document in the same place as our script handler.

Please note that the printdate field does not have a corresponding field on the form. That's why we added an empty printdate element to the $inputs array. Without this, the script will still start and run, but PHP will issue a warning that the printdate index is not in the $inputs array.

After replacing the fields with new values, we will print the document using

$d->PrintOut();

The PrintOut method takes several optional parameters, and we're using the simplest form of it. This will print one copy of the document on the default printer attached to the Windows machine.

You can also call PrintPreview to preview the output before printing it. In a fully automatic environment we will of course use the PrintOut method.

You may need to wait a while before shutting down the Word application, because queuing the print job takes time. Without sleep(3), the $w->Quit method executes immediately and the job is not queued.

Finally, we call $w->Quit(false), which closes the Word application launched by our script. The only parameter passed to the method tells Word whether to save the file before exiting. We have made changes to the document, but we don't want to save them because we need a clean template for later work.

Once we're done with the code, we can load our form page, fill in some values, and submit it. The images below show the output of the script, as well as the updated Word document:

Speeding up development and a little more about PIA

PHP is a weakly typed language, and a COM object is simply of type Object. While writing a script, we have no way to get a description of an object, be it a Word application, a document, or a field. We don't know what properties the object has or what methods it supports.

This greatly slows down development. To speed it up, I would recommend writing the functions in C# first and then translating the code into PHP. I can recommend a free IDE for C# development called “#develop”. I prefer it to Visual Studio because #develop is smaller, simpler and faster.

Migrating C# code to PHP is not as scary as it seems. Let me show you a couple of lines in C#:

Word.Application w = new Word.Application();
w.Visible = true;

String path = Application.StartupPath + "\\template.docx";

Word.Document d = w.Documents.Open(path) as Word.Document;
Word.Fields flds = d.Fields;
int len = flds.Count;

foreach (Word.Field f in flds)
{
    f.Select();
    int i = f.Index;
    w.Selection.TypeText("...");
}

You'll notice that the C# code is very similar to the PHP code I showed earlier. C# is a strongly typed language, so you'll notice several cast operators in this example, and variables need to be typed.

By specifying the type of a variable, you can enjoy cleaner code and autocompletion, and development speed increases significantly.

Another way to speed up PHP development is to record a macro in Word. We perform the required sequence of actions and save it as a macro. The macro is written in Visual Basic, which is also easy to translate into PHP.

And, most importantly, Microsoft's Office PIA documentation, especially the namespace documentation for each Office application, is the most detailed reference material available. The three most used applications are:

  • Excel 2013: http://msdn.microsoft.com/en-us/library/microsoft.office.interop.excel(v=office.15).aspx
  • Word 2013: http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word(v=office.15).aspx
  • PowerPoint 2013: http://msdn.microsoft.com/en-us/library/microsoft.office.interop.powerpoint(v=office.15).aspx

Conclusion

In this article, we showed how to populate a Word document with data using PHP COM libraries and Microsoft Office interop capabilities.

Windows and Office are widely used in everyday life. Knowing what Office/Windows and PHP can do together will be useful for every PHP and Windows developer.

The PHP COM extension opens the door to using this combination.

In previous articles in the “Automating Document Filling” series, I talked about how to create the application’s user interface, organize input data validation, and get numbers in words without using VBA code. In this final article we will talk about the magic of transferring all the necessary values from an Excel workbook to a Word document. Let me show you what should happen in the end:

Description of the mechanism

To begin with, I will describe in general terms how data will be transferred to a Word document. First of all, we will need a Word document template containing all the markup, tables, and the part of the text that remains unchanged. In this template, you need to define the places where values from the Excel workbook will be substituted; this is most conveniently done using bookmarks. After this, you need to organize the Excel data so that it matches the Word template, and last but not least, write the transfer procedure itself in VBA.

So, first things first.

Create a Word Document Template

Everything here is extremely simple: we create a regular document, type and format the text, and in short bring it to the required form. In the places where values from Excel need to be substituted, create bookmarks. This is done as follows:

Thus, you will need to create all bookmarks, that is, mark all the places where data from Excel will be inserted. The resulting file must be saved as a “MS Word Template” using the menu item “File” -> “Save As...”.

Preparing Excel Data

For convenience, I decided to place all the data that needs to be transferred to the Word document on a separate worksheet called Bookmarks. This sheet has two columns: the first contains the names of the bookmarks (exactly as they are named in the Word document), and the second contains the corresponding values to be transferred.

Some of these values are obtained directly from the data entry sheet, and some are obtained from auxiliary tables located on the Support sheet. In this article I will not analyze the formulas that calculate the required values; if something is unclear, ask questions in the comments.

At this stage, it is important to correctly indicate all bookmark names - the correctness of data transfer depends on this.
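Since everything hinges on the names matching, a quick consistency check of the two-column table can be useful before running the transfer. The Python sketch below is only an illustration: the function name `check_bookmark_table`, the sample bookmark names and the in-memory representation of the sheet are all assumptions, not part of the example archive.

```python
def check_bookmark_table(table_rows, template_bookmarks):
    """Compare the (name, value) rows of the Bookmarks sheet with the
    bookmark names defined in the Word template.

    Returns (values, missing, unknown, duplicates):
      values     - dict of bookmark name -> value to transfer
      missing    - template bookmarks that have no row in the table
      unknown    - table names that do not exist in the template
      duplicates - names that occur more than once in the table
    """
    values = {}
    duplicates = []
    for name, value in table_rows:
        if name in values:
            duplicates.append(name)
        values[name] = value
    missing = sorted(set(template_bookmarks) - set(values))
    unknown = sorted(set(values) - set(template_bookmarks))
    return values, missing, unknown, duplicates


# Hypothetical bookmark names, for illustration only.
table = [("ClientName", "John Smith"), ("Amount", "1500"), ("Ammount", "1500")]
template = ["ClientName", "Amount", "Date"]
values, missing, unknown, duplicates = check_bookmark_table(table, template)
print("missing:", missing, "unknown:", unknown, "duplicates:", duplicates)
```

A misspelled name shows up in the unknown list, while a bookmark that would stay empty shows up in the missing list.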

Transfer procedure

Now for the most interesting part. There are two options for executing the data transfer code:

  • The code runs in an Excel workbook, the data is transferred to Word one value at a time and immediately placed in the document.
  • The code is executed in a separate Word document, all data is transferred from Excel in one batch.

From the point of view of execution speed, especially with a large number of bookmarks, the second option looks much more attractive, but requires more complex actions. That's exactly what I used.

Here's what you need to do:

  • Create a Word document template with macro support. This template will contain executable VBA code.
  • In the created template you need to place a program written in VBA. To do this, when editing the template, press the key combination Alt+F11 and enter the program code in the Visual Basic editor window that opens.
  • In an Excel workbook, write code that calls the filling procedure from the newly created Word template.

I will not provide the text of the procedure in the article - it can be easily viewed in the file FillDocument.dotm, located in the Template folder in the archive with the example.

How can you use all this to solve your particular problem?

I understand that in words this all looks very simple, but what happens in practice? I suggest you simply use a ready-made option. Download the archive with the example, in the Excel workbook, press the key combination Alt+F11 to open the Visual Basic editor and read all my comments on the program. In order to change the program to suit your needs, you only need to change the value of several constants; they are located at the very beginning of the program. You can freely copy the entire text of the program into your project.

Archive structure

The archive attached to this article contains several files.

The main file is an Excel workbook called "Creating Confirmations". This workbook has 4 worksheets, of which only two are displayed: “Input” - a data entry sheet and “Database” - an archive of all entered documents.

The Templates folder contains Word document templates. One of them is a template containing a program for filling out bookmarks, and the second is a form to fill out. You can use the template with the program without modifications, but the form to fill out, naturally, will have to be redone in accordance with your needs.

How to rework the example “for yourself”?

  1. Prepare a Word document template to be filled out. Create all the necessary bookmarks in it and save it as a “MS Word template”.
  2. Copy the FillDocument.dotm file from the archive attached to this article to the folder with the prepared template. This file is responsible for filling out the template bookmarks, and nothing needs to be changed in it.
  3. Prepare an Excel workbook for data entry. It's up to you to decide whether it will have any "advanced" user interface and perform various tricky calculations. The main thing is that it contains a worksheet with a table of correspondence between the name of the bookmark in the Word template and the value that needs to be substituted.
  4. Insert the VBA program code from the example file into the prepared workbook. Replace all constants according to your project.
  5. Test for correct operation.
  6. Actively use it!

We continue the topic of working with forms in Word that we started earlier. In previous articles we looked at forms only from the point of view of an “advanced user”, i.e. we created documents that were easy to fill out manually. Today I want to expand this task and try to use the content controls mechanism to generate documents.

Before we get down to our immediate task, I want to say a few words about how data for content controls is stored in Word documents (I will deliberately omit how they are tied to the contents of the document for now, but I hope to return to this sometime in the next articles).

A logical question: what are itemProps1.xml and similar components? These components store descriptions of data sources. Most likely, the developers intended that sources other than the XML files built into the document could also be used, but so far only this method has been implemented.

Why are the itemPropsX.xml components useful to us? Because they list the XML schemas (their targetNamespace) used in the parent itemX.xml. This means that if more than one custom XML is included in the document, then to find the one we need, we go through the itemPropsX.xml components, find the right schema, and with it the right itemX.xml.

Now one more thing. We will not manually analyze the connections between components and search for the ones we need, using only the basic Packaging API! Instead, we'll use the Open XML SDK (its builds are available via NuGet). Of course, we haven’t said a word about this API before, but for our task the minimum is required from it and all the code will be quite transparent.

Well, the basic introduction is done, we can start with the example.

According to established tradition, we will take the same “Meeting Report” that we designed in an earlier article. Let me remind you that this is what the document template looked like:

And this is the XML to which the document fields were bound:

<meetingNotes xmlns="urn:MeetingNotes" subject="" date="" secretary="">
  <participants>
    <participant name="" />
  </participants>
  <decisions>
    <decision problem="" solution="" responsible="" controlDate="" />
  </decisions>
</meetingNotes>

Step 1: Create a Data Model

Actually, our task is not just to generate a document, but to create (at least in a draft version) a convenient tool for use by both the developer and the user.

Therefore, we will declare the model in the form of a C# class structure:

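using System;
using System.Collections.Generic;
using System.Xml.Serialization;

// Serialization attributes map the C# names to the XML names shown above.
[XmlRoot("meetingNotes", Namespace = "urn:MeetingNotes")]
public class MeetingNotes
{
    public MeetingNotes()
    {
        Participants = new List<Participant>();
        Decisions = new List<Decision>();
    }

    [XmlAttribute("subject")] public string Subject { get; set; }
    [XmlAttribute("date")] public DateTime Date { get; set; }
    [XmlAttribute("secretary")] public string Secretary { get; set; }

    [XmlArray("participants"), XmlArrayItem("participant")]
    public List<Participant> Participants { get; set; }

    [XmlArray("decisions"), XmlArrayItem("decision")]
    public List<Decision> Decisions { get; set; }
}

public class Decision
{
    [XmlAttribute("problem")] public string Problem { get; set; }
    [XmlAttribute("solution")] public string Solution { get; set; }
    [XmlAttribute("responsible")] public string Responsible { get; set; }
    [XmlAttribute("controlDate")] public DateTime ControlDate { get; set; }
}

public class Participant
{
    [XmlAttribute("name")] public string Name { get; set; }
}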

By and large, nothing special, except that attributes have been added to control XML serialization (since the names in the model and the required XML are slightly different).

Step 2: Serialize the above model to XML

The task is, in principle, trivial: as the saying goes, “take our favorite XmlSerializer and go”. If not for one “but”.

Unfortunately, there appears to be a bug in the current version of Office: if the custom XML declares some other namespace before the main namespace (the one from which Word should take elements to display), repeating content controls are displayed incorrectly (only as many elements are shown as were in the template itself, i.e. the repeating section does not work).

That is, this XML works:

<test xmlns="urn:Test" attr1="1" attr2="2">
  <repeatedTag attr="1" />
  <repeatedTag attr="2" />
  <repeatedTag attr="3" />
</test>

and this one too:

<test xmlns="urn:Test" attr1="1" attr2="2" xmlns:t="urn:TTT">
  <repeatedTag attr="1" />
  <repeatedTag attr="2" />
  <repeatedTag attr="3" />
</test>

but this one does not:

<test xmlns:t="urn:TTT" xmlns="urn:Test" attr1="1" attr2="2">
  <repeatedTag attr="1" />
  <repeatedTag attr="2" />
  <repeatedTag attr="3" />
</test>

I tried to submit a bug to Microsoft support on Connect, but for some reason I have no access to submit Office bugs. And the discussion on the MSDN forum didn't help either.
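Because the problem depends purely on the order of the namespace declarations in the root element, it is easy to check generated XML for it before putting it into a document. The following Python sketch does this with a regular expression; the function name `default_namespace_comes_first` is hypothetical, and the check is deliberately simplistic (it assumes the root start tag contains no ">" inside attribute values).

```python
import re


def default_namespace_comes_first(xml_text):
    """Return True if the root element declares its default namespace
    (xmlns="...") before any prefixed one (xmlns:prefix="...").

    Returns False when the root element has no namespace declarations.
    """
    # First start tag that begins with a name, skipping <?xml ...?> and comments.
    root_tag = re.search(r"<[A-Za-z_][^<>]*>", xml_text)
    if root_tag is None:
        return False
    # findall yields "" for xmlns= and ":prefix" for xmlns:prefix=
    declarations = re.findall(r"\bxmlns(:[\w.-]+)?\s*=", root_tag.group(0))
    return bool(declarations) and declarations[0] == ""


good = '<test xmlns="urn:Test" attr1="1" xmlns:t="urn:TTT"><repeatedTag attr="1" /></test>'
bad = '<test xmlns:t="urn:TTT" xmlns="urn:Test" attr1="1"><repeatedTag attr="1" /></test>'
print(default_namespace_comes_first(good), default_namespace_comes_first(bad))
```') is True
assert default_namespace_comes_first('<test xmlns="urn:Test" attr1="1" attr2="2" xmlns:t="urn:TTT"><repeatedTag attr="1" /></test>') is True
assert default_namespace_comes_first('<test xmlns:t="urn:TTT" xmlns="urn:Test" attr1="1" attr2="2"><repeatedTag attr="1" /></test>') is False
assert default_namespace_comes_first('<?xml version="1.0"?><test xmlns="urn:Test"/>') is True
assert default_namespace_comes_first('<test attr="1"/>') is False
</test>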

So a workaround is needed. If we generated the XML by hand, there would be no problem, since we would control everything ourselves. In this case, however, I really want to use the standard XmlSerializer, which by default adds several of its own namespaces to the output XML even if they are not used.

We will completely suppress the output of the serializer's own namespaces. Admittedly, this approach only works if the serializer really doesn’t need them (otherwise they will still be added, and precisely BEFORE ours).

Actually, the entire code (provided that the variable meetingNotes contains a previously populated object of type MeetingNotes):

var serializer = new XmlSerializer(typeof (MeetingNotes));
var serializedDataStream = new MemoryStream();

var namespaces = new XmlSerializerNamespaces();
namespaces.Add("", "");

serializer.Serialize(serializedDataStream, meetingNotes, namespaces);
serializedDataStream.Seek(0, SeekOrigin.Begin);

Step 3: Insert the resulting XML into the Word document

Here we proceed as follows:

  • copy the template and open the copy
  • find the required custom xml in it (search by namespace “urn:MeetingNotes”)
  • replace the contents of the component with our XML

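File.Copy(templateName, resultDocumentName, true);

using (var document = WordprocessingDocument.Open(resultDocumentName, true))
{
    // Find the custom XML part whose schema references include our namespace.
    var xmlpart = document.MainDocumentPart.CustomXmlParts
        .Single(xmlPart =>
            xmlPart.CustomXmlPropertiesPart.DataStoreItem.SchemaReferences
                .OfType<SchemaReference>()
                .Any(sr => sr.Uri.Value == "urn:MeetingNotes"));

    // Replace the contents of the part with the serialized XML.
    xmlpart.FeedData(serializedDataStream);
}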