Getting Word file content and cleaning the text of non-ASCII characters, unwanted line breaks and bell characters

  • 20 January 2020

The code below extracts the content of any Word document, removes non-ASCII characters and line breaks, replaces bell characters with "|", outputs the result to Kryon, and also writes it to a text file.

Simply insert the code inside the Advanced command "Run Script" in Kryon.

 

'Function to remove the non-ASCII characters

Private Function GetStrippedText(DocContent)

Dim regEx

Set regEx = CreateObject("vbscript.regexp")

'Replace every match, not only the first one

regEx.Global = True

regEx.Pattern = "[^\u0000-\u007F]"

GetStrippedText = regEx.Replace(DocContent, "")

End Function

 

'Open the Word file with Word hidden and read its content

Dim Word

Dim WordDoc

Set Word = CreateObject("Word.Application")

Word.Visible = false

'Open the Document

Set WordDoc = Word.Documents.open("WordDocumentPath",,True)

WScript.Sleep 500

DocContent = WordDoc.Content.Text

 

'Close Word

Word.Quit

'Release the object variables

Set WordDoc = Nothing

Set Word = Nothing

 

'Call the function to strip non-ASCII characters, repeating until the text stops changing

DocContent1 = DocContent

DocContent = GetStrippedText(DocContent)

 

while DocContent1<>DocContent

DocContent1 = DocContent

DocContent = GetStrippedText(DocContent)

wend

 

'Clean the text of bell characters (Chr(7), which Word uses as table cell markers) and line breaks

DocClean = Replace(DocContent, Chr(7), "|")

DocClean = Replace(DocClean, vbCr, "")

DocClean = Replace(DocClean, vbLf, "")

 

'Output the text as a variable in Kryon

WScript.StdOut.Write(DocClean)

 

'Write the text to a text file

Set objFileToWrite = CreateObject("Scripting.FileSystemObject").OpenTextFile("TextFilePath",2,true)

objFileToWrite.WriteLine(DocClean)

objFileToWrite.Close

Set objFileToWrite = Nothing
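
To check the cleaning steps without opening Word, the same logic can be tried on a sample string. This is only a minimal sketch, not part of the original script: the helper name CleanText and the sample text are made up for illustration, and it assumes the file is run with cscript so that WScript.StdOut is available.

```vbscript
Option Explicit

' Hypothetical helper: applies the same cleaning steps as the script above
Function CleanText(raw)
    Dim regEx, result
    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Global = True                    ' replace every match, not only the first
    regEx.Pattern = "[^\u0000-\u007F]"     ' any character outside the ASCII range
    result = regEx.Replace(raw, "")
    result = Replace(result, Chr(7), "|")  ' bell character becomes a pipe
    result = Replace(result, vbCr, "")     ' drop carriage returns
    result = Replace(result, vbLf, "")     ' drop line feeds
    CleanText = result
End Function

' Sample input built with Chr/ChrW so the script file itself stays plain ASCII
Dim sample
sample = "Caf" & ChrW(233) & vbCr & Chr(7) & "Total" & vbCrLf

WScript.StdOut.Write CleanText(sample)   ' prints Caf|Total
```

Building the sample with ChrW(233) and Chr(7) avoids pasting invisible or accented characters into the script, which can be lost when code is copied from a web page.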

 

