I have an email library. When an email is received, SP reads the email body and stores the text in a column. The whole of the body text is pulled in, but I only need the first few sentences. How do I set up a regex to extract text from the email body only up to a certain word?
When you say a certain word, do you mean a given word, such as "received"?
Then this should do it
If you mean 5 words, then this would do it
but it doesn't take into account any punctuation characters
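The patterns originally posted with this answer are not preserved here, so the following is only a rough sketch of the two ideas in Python's `re` module, not the original regexes. The sample sentence and the variable names are made up for illustration, and the workflow tool's regex engine may behave differently.

```python
import re

# Hypothetical sample text, used only for illustration.
body = "Your message was received by the server. More text follows here."

# Idea 1: everything before the first occurrence of a given word ("received").
# The non-greedy (.*?) stops at the first match of the word.
up_to_word = re.search(r"^(.*?)\breceived\b", body, flags=re.DOTALL)
before = up_to_word.group(1) if up_to_word else body

# Idea 2: the first five whitespace-separated words.
# Punctuation stays attached to the words; it is not treated specially.
first_five = re.match(r"\s*(?:\S+\s+){0,4}\S+", body)
words = first_five.group(0) if first_five else ""

print(repr(before))
print(repr(words))
```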
This interactive site is brilliant for testing out regexs https://regexr.com/
Note: don't use IE; Chrome or Firefox work just fine
Let me add some details to my query. The emails received into the library are notifications that a message was undeliverable to a list of people. I want to pull this list of people (their email addresses) into a column.
Example: The blue boxes are the email addresses.
There is lots of text after this part of the email body which I do not need.
It looks like you want to capture anything that looks like an email address from the message body
This regex tester site https://www.regextester.com/19 has a pre-filled regex that appears to do exactly what you want - it matches emails and ignores any other text
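As a rough illustration of "match emails and ignore any other text", here is a minimal Python sketch. The pattern is a deliberately simple one written for this example, not the pre-filled regex from the tester site, and it will not cover every valid address form. `EMAIL_RE` and `find_emails` are hypothetical names.

```python
import re

# Simplified email pattern (an assumption for this sketch, not the site's regex):
# local part, "@", domain labels, then a dot and a 2+ letter suffix.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def find_emails(text):
    """Return every substring of text that looks like an email address."""
    return EMAIL_RE.findall(text)

sample = "Delivery failed to alice@example.com and bob.smith@example.org today"
print(find_emails(sample))
```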
This is what I need. The only problem is that below the text shown above there is a "Diagnostic information for administrators:" section, which contains all the email addresses the email was sent to. I only need the failed ones in the top part of the email. If there were a way to pull only the top part of the text, this would work.
I don't have a screenshot to share
assume BodyStr holds the email body
the first regex task would take BodyStr and extract everything up to Diagnostic into TempStr
a second regex task would then process TempStr to extract the emails and place the results into a collection
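The first of these two tasks can be sketched in Python as below. This is only an approximation of the workflow step: `body_str` and `temp_str` mirror BodyStr and TempStr, the sample body is invented, and Python's `re` is assumed rather than the tool's own engine. It also shows why a body that spans several lines needs care: by default `.` does not match line breaks.

```python
import re

# Invented sample body; a real bounce message will look different.
body_str = (
    "Delivery has failed to these recipients:\n"
    "alice@example.com\n"
    "bob@example.org\n"
    "Diagnostic information for administrators:\n"
    "carol@example.net\n"
)

# Without DOTALL, "." stops at line breaks, so matching from the start fails.
single_line = re.match(r"(.*?)Diagnostic", body_str)

# With DOTALL, "." also matches line breaks and the capture spans several lines.
multi_line = re.match(r"(.*?)Diagnostic", body_str, flags=re.DOTALL)
temp_str = multi_line.group(1) if multi_line else body_str

print(single_line)
print(repr(temp_str))
```

The second task would then run an email-matching pattern over `temp_str` only.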
I had a quick play but it only seems to get the first email address. I also amended the regex slightly so that it does not match the text up to the end of the string (i.e. I removed the trailing $)
The issue might be that it cannot do a multiline extract, in which case there should be a previous step that replaces line breaks with spaces
OK, I've had a little more time to look at it this morning
The problem was that it cannot cope with multiline strings, so here goes
1 select everything before Diagnostic into TempStr
2 replace all the line breaks with a space (find [\r\n]+)
3 capture all the remaining email addresses into a collection using the Extract option, leaving out the start-of-string and end-of-string anchors (^ and $):
I used the log history to output the collection and got all of the emails I expected
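For anyone who wants to try the three steps outside the workflow tool, here is a minimal Python sketch of the same sequence. It is an approximation under stated assumptions: the email pattern is a simplified one made up for this example, `failed_recipients` is a hypothetical helper name, and `[\s\S]` is used in step 1 as a way to match across line breaks in engines that have no "dot matches newline" option.

```python
import re

# Simplified email pattern (an assumption), with no ^ or $ anchors so that
# every address in the text can be found rather than just one.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def failed_recipients(body_str, marker="Diagnostic"):
    """Return the email addresses that appear before the marker word."""
    # Step 1: select everything before the marker into temp_str.
    # [\s\S] matches any character, including line breaks.
    head = re.match(r"([\s\S]*?)" + re.escape(marker), body_str)
    temp_str = head.group(1) if head else body_str

    # Step 2: replace each run of line breaks with a single space.
    temp_str = re.sub(r"[\r\n]+", " ", temp_str)

    # Step 3: collect all remaining email addresses into a list.
    return EMAIL_RE.findall(temp_str)

# Invented sample body with Windows-style line endings.
body = (
    "Delivery has failed to these recipients:\r\n"
    "alice@example.com\r\n"
    "bob@example.org\r\n\r\n"
    "Diagnostic information for administrators:\r\n"
    "carol@example.net\r\n"
)
print(failed_recipients(body))
```

If the marker is missing, this sketch falls back to scanning the whole body, which may or may not be what you want.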
I hope this helps