15 June, 2012 by Jack Vamvas
Question: I have a list of Microsoft Word documents that need to be converted to HTML. How can I do this with PowerShell? I've already checked ConvertTo-Html, but it serves a different purpose.
Answer: Converting a list of Microsoft Word documents by hand is repetitive and time-consuming. PowerShell can create a Word COM object and use it to save each document in filtered HTML format.
The script first loads the Word interop assembly, which provides the [Microsoft.Office.Interop.Word.WdSaveFormat] enumeration.
This script reads all Word documents in the $docSrc path and writes the HTML files to the $htmlOutputPath folder.
# Load the Word interop assembly by its assembly name (WdSaveFormat is a type inside it, not an assembly)
[void][System.Reflection.Assembly]::LoadWithPartialName('Microsoft.Office.Interop.Word')
$docSrc="C:\word\"
$htmlOutputPath="C:\word\"
$srcFiles = Get-ChildItem $docSrc -filter "*.doc"
$saveFormat = [Enum]::Parse([Microsoft.Office.Interop.Word.WdSaveFormat], "wdFormatFilteredHTML");
$wordApp = new-object -comobject word.application
$wordApp.Visible = $False
function saveashtml($doc)
{
# Build the target path from the base name, e.g. report.doc becomes report.html in $htmlOutputPath
$htmlFile = Join-Path $htmlOutputPath ($doc.BaseName + ".html")
$openDoc = $wordApp.documents.open($doc.FullName);
$openDoc.saveas([ref]$htmlFile, [ref]$saveFormat);
$openDoc.close();
}
ForEach ($doc in $srcFiles)
{
Write-Host "Converting to html :" $doc.FullName
saveashtml $doc
}
$wordApp.quit();
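One detail worth checking on its own is how each output file name is derived from the source document, since that can be tested without starting Word. The following is a minimal sketch, assuming a hypothetical helper named Get-HtmlTargetPath (it is not part of the original script); it keeps the document's base name and swaps the extension for .html.

```powershell
# Hypothetical helper (not in the original script): derive the .html target
# path for a source document, keeping the base name and changing the extension.
function Get-HtmlTargetPath {
    param(
        [Parameter(Mandatory = $true)] [string] $SourcePath,
        [Parameter(Mandatory = $true)] [string] $OutputFolder
    )
    # GetFileNameWithoutExtension drops both the directory and the extension.
    $baseName = [System.IO.Path]::GetFileNameWithoutExtension($SourcePath)
    # Path.Combine adds a separator only if $OutputFolder does not already end with one.
    [System.IO.Path]::Combine($OutputFolder, $baseName + '.html')
}

# Example call; on Windows this yields C:\word\report.html
Get-HtmlTargetPath -SourcePath 'C:\word\report.doc' -OutputFolder 'C:\word\'
```

Building the path from the base name avoids embedding the full source path or the old .doc extension in the output file name, and it copes with an output folder that already ends in a backslash.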