I have a very large list, over 19000 items, and each item has an attachment. I wrote a site workflow that uses the for each loop to find items that meet a query criteria and then move them to an "archive" list so they don't clutter up the other list.
When I run the workflow for the first time on this list with 19000 items, it is possible that 18000 items will meet the query criteria.
My questions are: will the query return all 18000 matching items, and can the workflow actually process that many in a single run?
If I'm not mistaken it will retrieve all 18000 items; I think the list view throttling limit does not apply here. But the workflow may time out (which you might resolve by specifying the "Expected workflow duration" in the workflow settings) and the SharePoint queue may fill up. So you definitely need to build in a Pause For action after every x processed items (on the few occasions I had to process 500+ items, I needed to pause every 200 items).
If real life reveals that the query is also limited regarding the number of returned items, then you can set the "Specify item limit" option in the Output section of the Query List action. The for-each loop then needs to be placed inside a Loop that runs until the number of items in the Query List result collection is 0.
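The loop-until-empty pattern described above can be sketched in Python. This is only an illustration of the control flow, not Nintex syntax; `query_list` and `archive_item` are hypothetical stand-ins for the Query List action and the item-move logic.

```python
import time

BATCH_LIMIT = 200    # corresponds to "Specify item limit" on the Query List action
PAUSE_SECONDS = 300  # pause between batches so the timer queue can drain

def archive_matching_items(query_list, archive_item):
    """Repeatedly query for matching items and archive them in batches.

    query_list(limit)  -> list of matching items, at most `limit` (hypothetical)
    archive_item(item) -> moves one item to the archive list (hypothetical)
    """
    while True:
        batch = query_list(limit=BATCH_LIMIT)
        if not batch:
            # Result collection is empty: the outer Loop exits
            break
        for item in batch:
            archive_item(item)
        # Pause For action between batches
        time.sleep(PAUSE_SECONDS)
```

The key point is that each iteration re-runs the query, so even if the query only ever returns a capped number of items, the outer loop keeps going until nothing matches any more.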
As someone who has attempted this, I urge you not to do so. Especially not during the day. I tried running mine against a returned set of about 3,000 items and brought the entire server to a halt. It caused the timer service to eat up every available bit of memory on the server. Nobody else was able to approve tasks or anything like that.
The fun part was that the workflow would error, so you would think no harm, no foul, right? For some reason the timer service never recovered from this. The REAL fun part: if you restarted the timer service, the same workflow would run again. Restarting the server itself caused the workflow to rerun as well. Until I manually canceled the workflow, everything we did to fix the timer service caused the workflow to rerun.
So, we're taking a different route. I exported about 16000 of the 19000 records to Excel and uploaded them into the "archive" list. That way, on average the workflow will only have to archive maybe 200 records per run. Hopefully that won't crash the timer job or the workflow. We shall see.
200 records is most likely still going to cause an issue when the workflow tries to persist to the database. When large amounts of data are persisted to SQL, it's possible for SQL to abort the transaction, and with workflows this generally occurs in situations very similar to what you're describing. To get around this you need to batch the foreach loop:
You'll have to play around with the batch size a bit to find out what size works consistently in your environment for this workflow. I generally recommend starting at 25 and then increasing or decreasing from there.
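The batched foreach can be sketched like this, again as plain Python rather than Nintex actions. The `items` list and `process` callable are hypothetical placeholders for the query result collection and the per-item archive step; the batch size is the tunable number mentioned above.

```python
import time

def process_in_batches(items, process, batch_size=25, pause_seconds=300):
    """Process items one at a time, pausing after every `batch_size` items
    so the workflow can commit/persist between batches (a sketch of the
    batching pattern, not actual Nintex syntax)."""
    for i, item in enumerate(items, start=1):
        process(item)
        # Pause after each full batch, but not after the final item
        if i % batch_size == 0 and i < len(items):
            time.sleep(pause_seconds)
```

With 100 items and a batch size of 25, this pauses three times: after items 25, 50, and 75, but not after the last batch.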
Okay. So, it will most likely loop through a list of 3000-3500 items to find the 150 to 200 that match the criteria for the collection. If I'm understanding you correctly, I need to pause the workflow after every 25 items that get written to the "archive" list. So, if there are 150 items in the list, then I'll pause it for 5 minutes 6 times before the for-each actually ends. Is this correct?
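For what it's worth, the arithmetic in the question checks out either way; whether you see 5 or 6 pauses just depends on whether the pause action also fires after the final batch:

```python
items = 150
batch = 25

full_batches = items // batch      # 150 / 25 = 6 batches of 25
pauses_after_every_batch = full_batches       # 6 pauses if the last batch also pauses
pauses_between_batches = full_batches - 1     # 5 pauses if the last batch skips it

print(full_batches, pauses_after_every_batch, pauses_between_batches)  # 6 6 5
```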