PowerShell script to convert RTF docs to plain text using MS-Word

On our current project, we had a bunch of RTF files containing text that we wanted to yank out and store in a database. Instead of laboriously opening each file and resaving it as plain text, I decided to write what I hoped would be a simple PowerShell script to do that for me. What follows is my best try at that script.

I am very open to any questions, criticisms, and improvements to the script, as I’m still very much learning the language. My fundamental question is really this: was PowerShell the right tool for this job?

Enjoy!

— bab

=============== Reformat.ps1 ================

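# Opens one RTF file in Word and saves a plain-text copy next to it.
# Relies on $word and $saveFormat, which are set further down before this is called.
function translate_from_rtf_to_text([System.IO.FileInfo] $source_file)
{
    $source_file_name = $source_file.FullName
    $dest_file_name = create_destination_file_name $source_file

    write-host "Copying from $source_file_name to $dest_file_name"

    $rtf = $word.Documents.Open($source_file_name)
    $rtf.SaveAs([ref]$dest_file_name, [ref]$saveFormat)

    # Close the document so Word does not keep every converted file open
    $rtf.Close()
}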

# Builds the output path: same directory and base name, with a .txt extension.
function create_destination_file_name([System.IO.FileInfo] $source_file)
{
    $dest_file_name = $source_file.BaseName + ".txt"
    $dest_directory = $source_file.DirectoryName
    $dest_file = join-path $dest_directory $dest_file_name

    return $dest_file
}

if ($args.Length -ne 1)
{
   write-host "Usage: Reformat.ps1 <path to root directory of rtf files>"
   exit 1
}

$path = $args[0]

write-host "Converting all rtf files under directory $path..."

$word = new-object -com Word.Application

$saveFormat = [Enum]::Parse([Microsoft.Office.Interop.Word.WdSaveFormat], "wdFormatTextLineBreaks")

get-childitem -path $path -include *.rtf -recurse | foreach-object -process { translate_from_rtf_to_text $_ }

$word.Quit([ref]$true)
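
One possible hardening, offered as a sketch rather than a tested fix: if the script is interrupted partway through, the hidden Word process can be left running. Wrapping the conversion loop in try/finally is one way to make sure Word is always asked to quit. The block below is meant as a stand-in for the last four statements of the script, and it assumes that $path and translate_from_rtf_to_text are defined exactly as above.

```powershell
# Sketch only: assumes $path and translate_from_rtf_to_text
# are defined exactly as in Reformat.ps1 above.
$word = New-Object -ComObject Word.Application
$saveFormat = [Enum]::Parse([Microsoft.Office.Interop.Word.WdSaveFormat], "wdFormatTextLineBreaks")

try
{
    Get-ChildItem -Path $path -Include *.rtf -Recurse |
        ForEach-Object { translate_from_rtf_to_text $_ }
}
finally
{
    # Runs even if the loop is interrupted, so no hidden WINWORD.EXE is left behind
    $word.Quit()
}
```

The finally block also runs when the script is stopped with Ctrl+C, which is the case most likely to leave a stray Word instance behind.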
