Recently I was working on a project where a workflow ran through a list of reviewers and assigned them tasks. Because multiple tasks could be assigned to multiple users at the same time, it simply wasn't possible to use the "Wait for task completion" configuration option.
I decided to use a "pause" action instead, so that after the tasks are assigned the workflow just loops and checks whether all or any of the created tasks are completed.
From the architecture point of view everything seemed OK. However, I realized that my workflow was actually not... pausing! I tried recreating the pause action and moving it to a different location in the workflow, but nothing seemed to help.
Finally I started to analyze how the workflow works, trying to understand why it doesn't pause. In other workflows this action works fine, so it could not be a general Nintex or SharePoint issue. What I found surprised me a little.
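The loop-and-check idea can be sketched in a few lines of Python. This is only an illustration of the logic, not Nintex code: the function names, the "Completed" status string and the 300-second pause are my assumptions.

```python
import time


def review_finished(task_statuses, require_all=True):
    """Check whether all (or any) of the created tasks are completed."""
    # "Completed" is an assumed status label, used only for illustration.
    completed = [status == "Completed" for status in task_statuses]
    return all(completed) if require_all else any(completed)


def wait_for_review(fetch_statuses, pause_seconds=300, pause=time.sleep, max_checks=100):
    """Loop: check the tasks, pause, check again, up to max_checks times."""
    for _ in range(max_checks):
        if review_finished(fetch_statuses()):
            return True
        pause(pause_seconds)  # stands in for the workflow "pause" action
    return False
```

The whole design depends on the pause inside the loop actually pausing. If it doesn't, the loop spins as fast as it can.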
Workflow configuration
Workflow Manager (the platform that runs and handles workflows in Office 365) has a couple of configuration options that are used to keep executions under control (source). In Office 365, however, we are not allowed to change them (source), as we can in on-premises SharePoint. These are: Throttle, Batch size, Timeout, Workflow Timer Interval and Workflow Timer Instances. In this case the most important one is "Batch size".
Batch size
This is the number of SPWorkItems that the workflow timer job will attempt to complete in one run. The default value is 100. "One run" means that, once started, the workflow will attempt to execute all actions that were put into that batch. A batch should execute in no more than 5 minutes (the default value of the Timeout setting); however, actions can be spread over several batches.
To be sure that the actions in a single batch are completed, the on-premises versions of Nintex had the "Commit pending changes" action, which forces the workflow to wait for all actions from a batch to complete before moving on. However, we have nothing similar in Office 365.
Why is the workflow not pausing?
In my opinion, based on my findings, the workflow is not pausing because it simply misses the time when it should un-pause. Imagine you have a lot of actions running in several loops. After all the loops you want your workflow to pause before the next run. But it doesn't. This is because the workflow executes all actions from its batches with reference to the time when the batches started. You can see it in the workflow history: the time elapsed between the workflow start and the current time is, e.g., 6 minutes, yet the history log entries are all dated with the time when the workflow started.
In my example the workflow started at 14:44 and all actions were "executed" at that time, while the "real" time kept passing. Unfortunately, the "pause" action is handled by a service that checks the current time. And this is what I think happens: when the pause action occurs, it is also stamped with the time when the batch was registered, in my case 14:44. It tells the workflow to pause for 5 minutes, that is until 14:49. However, when it occurred the real time was already 14:52, so 3 minutes after it should have un-paused. Seeing that, the workflow timer job just ignores the action and moves on. And because it ignores it, it keeps executing actions using the same time. In my case it pushed itself into an infinite loop (not really infinite, because once it reaches 5k items on the workflow history list, it gets suspended).
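The arithmetic behind this hypothesis is easy to check. The snippet below is only a model of what I think happens, not the actual timer job logic; the function name is made up and the date is arbitrary.

```python
from datetime import datetime, timedelta


def pause_is_honoured(batch_registered_at, pause_length, wall_clock_now):
    """Assumed rule: the pause only takes effect if its resume time is still in the future."""
    resume_at = batch_registered_at + pause_length
    return resume_at > wall_clock_now


batch_time = datetime(2020, 1, 1, 14, 44)  # time stamped on every action in the batch
real_time = datetime(2020, 1, 1, 14, 52)   # wall-clock time when the pause is evaluated
pause = timedelta(minutes=5)

resume_at = batch_time + pause
overdue = real_time - resume_at

print(resume_at.strftime("%H:%M"))                       # 14:49
print(overdue)                                           # 0:03:00
print(pause_is_honoured(batch_time, pause, real_time))   # False
```

Under this assumption the pause is already 3 minutes overdue at the moment it is evaluated, so there is nothing left to wait for.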
Solution?
Remember, there is no "Commit pending changes" action in Nintex for Office 365. The only solution I found is to insert short (1 or 2 minutes long) pause actions inside the execution. In my case I simply added a 2-minute pause after each loop. This pause allowed my workflow to synchronize with the current time. Thanks to that, once the really important pause action was executed, the workflow really stopped. Finally!
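To illustrate why the short pauses help, here is a toy simulation of my hypothesis: actions are stamped with a stale time, and a pause that is honoured brings the stamped time back in line with the real clock. Everything in it (the function name, the 90-second loops, the start time) is an assumption made for the example, not measured behaviour.

```python
from datetime import datetime, timedelta


def final_pause_honoured(loop_durations, final_pause, sync_pause=None):
    """Return True if the final pause would take effect under the assumed model."""
    real = stamped = datetime(2020, 1, 1, 14, 44)  # arbitrary start time
    for duration in loop_durations:
        real += duration  # real time passes while the loop runs; the stamp stays stale
        if sync_pause is not None and stamped + sync_pause > real:
            # The short pause is honoured: the workflow sleeps until it ends
            # and then continues with a stamp equal to the real time.
            real = stamped + sync_pause
            stamped = real
    return stamped + final_pause > real


loops = [timedelta(seconds=90)] * 6  # six loops of 90 seconds each
five_minutes = timedelta(minutes=5)

print(final_pause_honoured(loops, five_minutes))                                   # False
print(final_pause_honoured(loops, five_minutes, sync_pause=timedelta(minutes=2)))  # True
```

Note that in this toy model a short pause only helps if the loop before it finishes within the pause length. Otherwise the short pause is missed in the same way as the long one.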