Merge multiple Word documents into one with Open XML
Using only the Open XML SDK, you can use the AltChunk element to merge multiple documents into one.
Two articles, the-easy-way-to-assemble-multiple-word-documents and How to Use altChunk for Document Assembly, provide some samples.
EDIT 1
Based on the code in your updated question (update #1) that uses altChunk, here is VB.NET code that I have tested and that works for me:
Using myDoc = DocumentFormat.OpenXml.Packaging.WordprocessingDocument.Open("D:\Test.docx", True)
    ' A tick-based ID is enough for a single chunk; it must be unique within the main part.
    Dim altChunkId = "AltChunkId" + DateTime.Now.Ticks.ToString().Substring(0, 2)
    Dim mainPart = myDoc.MainDocumentPart
    ' Create the part that holds the imported document.
    Dim chunk = mainPart.AddAlternativeFormatImportPart(
        DocumentFormat.OpenXml.Packaging.AlternativeFormatImportPartType.WordprocessingML, altChunkId)
    Using fileStream As IO.FileStream = IO.File.Open("D:\Test1.docx", IO.FileMode.Open)
        chunk.FeedData(fileStream)
    End Using
    ' Reference the imported part from the body, after the last paragraph.
    Dim altChunk = New DocumentFormat.OpenXml.Wordprocessing.AltChunk()
    altChunk.Id = altChunkId
    mainPart.Document.Body.InsertAfter(altChunk, mainPart.Document.Body.Elements(Of DocumentFormat.OpenXml.Wordprocessing.Paragraph).Last())
    mainPart.Document.Save()
End Using
EDIT 2
The second issue (update #2), "This code is appending the Test2 data twice, in place of Test1 data as well.", is related to the altChunkId.
For each document you want to merge into the main document, you need to:
- add an AlternativeFormatImportPart to the MainDocumentPart with an Id, which must be unique. This part contains the inserted data.
- add an AltChunk element to the body and set its Id to reference that AlternativeFormatImportPart.
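In your code, you are using the same Id for all the AltChunk elements, which is why the same text appears several times.
I do not think the altChunkId will be unique with your code, because Substring(0, 2) keeps only the two leading digits of the tick count, and those are the same on every call: string altChunkId = "AltChunkId" + DateTime.Now.Ticks.ToString().Substring(0, 2);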
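To see why that expression collides, here is a minimal sketch that uses only the .NET base library and no Open XML types. The class and method names (AltChunkIdDemo, TickBasedId) are made up for illustration; only the ID expression itself comes from the question.

```csharp
using System;
using System.Collections.Generic;

public static class AltChunkIdDemo
{
    // Hypothetical helper: reproduces the ID expression quoted from the question.
    public static string TickBasedId() =>
        "AltChunkId" + DateTime.Now.Ticks.ToString().Substring(0, 2);

    public static void Main()
    {
        var ids = new HashSet<string>();
        for (int i = 0; i < 1000; i++)
        {
            ids.Add(TickBasedId());
        }

        // Substring(0, 2) keeps the two most significant digits of the tick count.
        // Those digits take decades to change, so repeated calls yield the same ID.
        Console.WriteLine($"Distinct IDs out of 1000 calls: {ids.Count}");
    }
}
```

Taking the trailing digits instead would vary more, but two calls in quick succession could still return the same value, so it would not guarantee uniqueness either.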
If you don't need a specific value, I recommend not setting the AltChunkId explicitly when you add the AlternativeFormatImportPart. Instead, let the SDK generate one, like this:
VB.Net
Dim chunk As AlternativeFormatImportPart = mainPart.AddAlternativeFormatImportPart(DocumentFormat.OpenXml.Packaging.AlternativeFormatImportPartType.WordprocessingML)
Dim altchunkid As String = mainPart.GetIdOfPart(chunk)
C#
AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(DocumentFormat.OpenXml.Packaging.AlternativeFormatImportPartType.WordprocessingML);
string altchunkid = mainPart.GetIdOfPart(chunk);
There is a nice wrapper API (Document Builder 2.2) around Open XML, designed specifically for merging documents, with the flexibility to choose which paragraphs to merge, etc. You can download it from here (update: moved to GitHub).
The documentation and screen casts on how to use it are here.
Update: Code Sample
var sources = new List<Source>();
// Document streams (file streams) of the documents to be merged.
foreach (var stream in documentstreams)
{
    var tempms = new MemoryStream();
    stream.CopyTo(tempms);
    sources.Add(new Source(new WmlDocument(stream.Length.ToString(), tempms), true));
}
var mergedDoc = DocumentBuilder.BuildDocument(sources);
mergedDoc.SaveAs(@"C:\TargetFilePath");
The types Source and WmlDocument are from the Document Builder API.
You can even add the file paths directly if you prefer:
sources.Add(new Source(new WmlDocument(@"C:\FileToBeMerged1.docx"), true));
sources.Add(new Source(new WmlDocument(@"C:\FileToBeMerged2.docx"), true));
I also found this nice comparison between the AltChunk and Document Builder approaches to merging documents, which is helpful for choosing based on one's requirements.
You can also use the DocX library to merge documents, but I prefer Document Builder for this.
Hope this helps.
The only thing missing in these answers is the for loop.
For those who just want to copy / paste it:
using System.Collections.Generic;
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

void MergeInNewFile(string resultFile, IList<string> filenames)
{
    using (WordprocessingDocument document = WordprocessingDocument.Create(resultFile, WordprocessingDocumentType.Document))
    {
        MainDocumentPart mainPart = document.AddMainDocumentPart();
        mainPart.Document = new Document(new Body());
        foreach (string filename in filenames)
        {
            // Let the SDK generate a unique relationship ID for each imported part.
            AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML);
            string altChunkId = mainPart.GetIdOfPart(chunk);
            using (FileStream fileStream = File.Open(filename, FileMode.Open))
            {
                chunk.FeedData(fileStream);
            }
            // Append an AltChunk that references the imported part.
            AltChunk altChunk = new AltChunk { Id = altChunkId };
            mainPart.Document.Body.AppendChild(altChunk);
        }
        mainPart.Document.Save();
    }
}
All credits go to Chris and yonexbat