Solved

Regex Extract Line by Line from Multiline Text Field

  • 5 July 2023
  • 3 replies
  • 264 views


Hello!

   I'm wondering whether this is feasible: I’m trying to extract the contents of a multiline text field (set to enhanced rich text) line by line using Regex Extract.

Example:

All the world's a stage,
And all the men and women merely players;
They have their exits and their entrances,
And one man in his time plays many parts,
His acts being seven ages. At first the infant,
Mewling and puking in the nurse's arms.


Goal: Extract each line and save in a collection variable for further processing.

 

Regex:

^\w.*$
Different variant: (?<=^).*(?=[\r\n])

When I try it on regex101.com, or even when I paste the text directly, it seems to work and I can see each line. However, when I test it in Nintex Workflow and reference the actual multiline field, it fails. Am I missing something here? Could it be because the list item’s multiline text field is set to enhanced rich text?

 

Thank you for any insights! 


Best answer by Garrett 6 July 2023, 05:13


Check out this blog post. I believe it gets you where you want to go.

Extract line by Line in Nintex workflow - SharePoint Diary


Hi @vities 

Bamaeric’s suggestion works great when your multiline text field is plain text. 

For a rich text field, add another Textbox-Long control, “Text2”, to your form. Then add a Rule that copies the value into “Text2” with the HTML tags stripped out. This works for the New Responsive form. 

 

The solution below uses the New Responsive form and an NAC workflow.

 

Image1 - New Responsive Form with 2 Textbox-Long controls. Text1 is Rich Text and Text2 is Plain Text

Form Rules - Text2 value equals [Form].[Text1]


Update the Form Rules to replace <br> tags with Newline
replace( [Form].[Text1], "<br>", "\n")

 

Image2 - Replace <br> tags with newline

 

Update the Form Rules to strip HTML tags
replace( replace( [Form].[Text1], "<br>", "\n"), "<[^>]*>", "")

 

Image3 - Strip HTML tags. Final result is plain text
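
For readers who want to try the same two-step cleanup outside Nintex, here is a minimal Python sketch of the nested replace above. The function name `rich_text_to_plain` and the sample HTML are made up for illustration, and the sketch assumes the rich text field stores its line breaks as `<br>` tags, as in the form rule.

```python
import re

def rich_text_to_plain(html: str) -> str:
    """Mimic the nested form rule: <br> becomes a newline, then other tags are dropped."""
    # Step 1: turn line-break tags into real newlines.
    # This also accepts <br/> and <br />, which the literal "<br>" would not match.
    with_newlines = re.sub(r"<br\s*/?>", "\n", html, flags=re.IGNORECASE)
    # Step 2: strip every remaining tag, using the same pattern as the form rule.
    return re.sub(r"<[^>]*>", "", with_newlines)

# Hypothetical sample value of a rich text field.
sample = "<p>All the world's a stage,<br>And all the men and women merely players;</p>"
plain = rich_text_to_plain(sample)
print(plain)
```

The order matters: if the tags were stripped first, the `<br>` markers would be gone before they could be turned into newlines. Note that HTML entities such as `&amp;` are not decoded by either step.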

 

Once the Text2 control value is plain text, you should be able to apply a regex to split it by newline.
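
As a rough illustration of that split, here is a Python sketch rather than the Nintex action itself. The sample text and variable names are assumptions. It also shows that the pattern from the question, `^\w.*$`, finds the same lines once the input is plain text and the regex runs in multiline mode.

```python
import re

# Hypothetical plain-text value, with Windows line endings and one blank line.
plain = (
    "All the world's a stage,\r\n"
    "And all the men and women merely players;\r\n"
    "\r\n"
    "They have their exits and their entrances,"
)

# Split on Windows or Unix line endings, then drop empty entries.
lines = [line for line in re.split(r"\r?\n", plain) if line.strip()]

# The pattern from the question, run in multiline mode so ^ and $ match per line.
# "." also matches "\r", so a trailing carriage return is trimmed afterwards.
matched = [m.rstrip("\r") for m in re.findall(r"^\w.*$", plain, flags=re.MULTILINE)]

for line in lines:
    print(line)
```

Both approaches end up with one entry per non-empty line, which is the shape a collection variable would need for further processing.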

 

Result from Log

 


Thank you both: @bamaeric for the link, and @Garrett for a solution that handles the enhanced rich text format. In the end, the form was a classic form and I had permission to convert the textbox into a plain text field, so I followed the blog post linked above.


Here is what I did, in case someone needs a similar solution for SP 2013 to extract each word/element: 

General gist:

  1. Regex - replace all spaces (\040) with a placeholder character (+).
  2. Regex - split the text on \n (\n = line break); this creates a collection (Collection A).
  3. For each - cycle through Collection A, tracking the position with an index (Index 1):
    • Set the current line into a variable.
    • Regex - split the line on “+”; this creates another collection (Collection B).
    • For each - cycle through Collection B:
      • Use a second index (Index 2, a variable defaulted to 0).
      • Using a collection operation, perform “Get” to obtain the selected word and store it in a variable.
      • Manipulate the word as you like.
      • Increase Index 2 by 1.
    • Increase Index 1 by 1 and reset Index 2 to 0.
  4. Update the item with the word/element if needed.
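
The steps above can be sketched in Python to show the shape of the nested loop. This is only an illustration, not the SP 2013 workflow: the function name `split_into_words` is hypothetical, and the explicit Index 1 / Index 2 counters are not needed because Python's `for` loop iterates over the collections directly.

```python
import re

def split_into_words(text: str) -> list:
    """Return one list of words per line, following steps 1 to 3 above."""
    # Step 1: replace every space with the "+" placeholder
    # (\040 in the workflow regex is the octal code for a space).
    marked = text.replace(" ", "+")
    # Step 2: split on line breaks to build Collection A.
    collection_a = re.split(r"\r?\n", marked)
    result = []
    # Step 3: outer loop over the lines.
    for line in collection_a:
        # Split the line on "+" to build Collection B, skipping empty pieces
        # left behind by repeated spaces.
        collection_b = [word for word in line.split("+") if word]
        # Inner loop: each word could be manipulated here before it is stored.
        words = [word for word in collection_b]
        result.append(words)
    return result

print(split_into_words("All the world's a stage,\nAnd all the men"))
```

One caveat of the placeholder approach: if the original text already contains a `+`, it is treated as a word boundary too, so a character that never appears in the data is a safer choice.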

 
