There are many libraries for converting text or docx files to PDF. However, I recently ran into a problem with Thai documents, which usually use "Justify" alignment. Most of the libraries I found, even the paid ones, have problems with Thai word segmentation. As a result, they cannot justify the text properly, and the output looks almost the same as left alignment.
To resolve this issue, I searched for possible ways to do it and found these:
- Use Microsoft Office Interop
- Use Word Automation Services in SharePoint 2013
The first solution is quite easy to implement. For example, follow the link below:
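As a rough illustration of the Interop approach, here is a minimal sketch. It assumes Microsoft Word is installed on the machine and that the project references the Microsoft.Office.Interop.Word assembly. The class name `InteropPdfExport` and the method name `ConvertDocxToPdf` are made up for this example, and it is only meant for a desktop or console scenario.

```csharp
using System;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper, not taken from any referenced article.
static class InteropPdfExport
{
    // Both paths should be absolute, because Word resolves relative
    // paths against its own working directory.
    public static void ConvertDocxToPdf(string docxPath, string pdfPath)
    {
        Word.Application app = null;
        Word.Document doc = null;
        try
        {
            // Starts a hidden Word process for this conversion.
            app = new Word.Application();
            app.Visible = false;

            doc = app.Documents.Open(docxPath, ReadOnly: true);

            // Let Word itself lay out the text and export it as PDF.
            doc.ExportAsFixedFormat(pdfPath, Word.WdExportFormat.wdExportFormatPDF);
        }
        finally
        {
            // Always close the document and quit Word, otherwise
            // WINWORD.EXE processes are left running in the background.
            if (doc != null)
            {
                doc.Close(SaveChanges: false);
                Marshal.ReleaseComObject(doc);
            }
            if (app != null)
            {
                app.Quit();
                Marshal.ReleaseComObject(app);
            }
        }
    }
}
```

It could be called as `InteropPdfExport.ConvertDocxToPdf(@"C:\docs\input.docx", @"C:\docs\output.pdf");`, where both paths are placeholders.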
However, Interop does not work well in a client-server architecture such as a web server. It is similar to opening a Microsoft Word window for each request, which consumes a lot of memory, as described here:
I tried implementing the first solution on a web server once, but it failed because of an unknown formatting problem during conversion (the issue did not occur when I ran the same code as a console application). The spacing in the PDF file differed from the master docx file for an unknown reason.
Later on, I found that if you have SharePoint 2013, you can use Word Automation Services. The service has been available since SharePoint 2010, but only as conversion jobs; on-demand conversion is available in SharePoint 2013. It is well suited to a web server solution, and I found no issues when converting Thai docx files to PDF.
By the way, to run Word Automation Services, it is very important that the machine running the service is not the same machine that has Active Directory installed; otherwise the service does not work (sorry, I can't remember the error message, but it gets stuck for a while before throwing an error). The service can convert from several input types, e.g. a stream, a file location, or a byte array.
To use Word Automation Services in C#, see the following references:
- Setting up Word Automation Service for SharePoint 2013
- Converting a Library’s Word Documents to PDF using Word Automation Services for SharePoint 2013
- SyncConverter.Convert method
Below is an example snippet taken from the reference above. Replace "WORD_AUTOMATION_SERVICE" with your registered service name.

    // Requires: using System; using System.IO; using Microsoft.SharePoint;
    //           using Microsoft.Office.Word.Server.Conversions;
    // "li" (the source SPListItem) and "library" (the target document library)
    // come from the surrounding code in the referenced article.
    using (MemoryStream destinationStream = new MemoryStream())
    {
        // Call the SyncConverter class, passing in the name of the Word Automation Service for your farm.
        SyncConverter sc = new SyncConverter(WORD_AUTOMATION_SERVICE);

        // Pass in the user token or credentials under which this conversion job is executed.
        sc.UserToken = SPContext.Current.Site.UserToken;
        sc.Settings.UpdateFields = true;

        // Save format
        sc.Settings.OutputFormat = SaveFormat.PDF;

        // Convert to PDF by opening the file stream and converting into the destination memory stream.
        ConversionItemInfo info = sc.Convert(li.File.OpenBinaryStream(), destinationStream);
        var filename = Path.GetFileNameWithoutExtension(li.File.Name) + ".pdf";

        if (info.Succeeded)
        {
            // File conversion succeeded, so add the memory stream to the SharePoint library.
            SPFile newfile = library.RootFolder.Files.Add(filename, destinationStream, true);
        }
        else if (info.Failed)
        {
            throw new Exception(info.ErrorMessage);
        }
    }
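The snippet above uses the stream overload. For the byte array case mentioned earlier, a minimal sketch could look like the following. It assumes the byte array overload has the shape `Convert(byte[] input, out byte[] output)`, so please check it against the SyncConverter.Convert reference before relying on it. The class `ByteArrayConversion`, the method `ConvertDocxBytesToPdf`, and its parameters are hypothetical names for this example.

```csharp
using System;
using Microsoft.Office.Word.Server.Conversions;
using Microsoft.SharePoint;

// Hypothetical helper, not taken from the referenced articles.
static class ByteArrayConversion
{
    // serviceName is the registered Word Automation Service name of the farm.
    public static byte[] ConvertDocxBytesToPdf(
        byte[] docxBytes, string serviceName, SPUserToken userToken)
    {
        SyncConverter sc = new SyncConverter(serviceName);
        sc.UserToken = userToken;
        sc.Settings.UpdateFields = true;
        sc.Settings.OutputFormat = SaveFormat.PDF;

        // Assumed overload: the converted document comes back in an out parameter.
        byte[] pdfBytes;
        ConversionItemInfo info = sc.Convert(docxBytes, out pdfBytes);

        if (!info.Succeeded)
        {
            throw new InvalidOperationException(info.ErrorMessage);
        }
        return pdfBytes;
    }
}
```

This form is handy when the docx does not live in a SharePoint library, for example when it was uploaded in a web request and the PDF is returned directly in the response.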
In the end, I wanted to share my research on this issue because it took quite some time before I found the solution. Hope it helps ^^.