Inserting a Log to History action between every action and reviewing the workflow log revealed the following:
1) A loop iterated more times than expected.
2) "Query XML" is a slow-executing service.
3) When "Start a Task Process" sends out the email, delivery was held by our Mimecast.com relay for 15 minutes. Whitelisting all 12 subnets from Mandrillapp.com solved that issue.
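If the log gets long, the gaps between consecutive Log to History entries can be ranked with a small script instead of read by eye. This is only a rough sketch: it assumes the history has been copied out as plain text with one `timestamp | message` entry per line, which is a made-up layout rather than anything the product exports, and the function name `slowest_gaps` and the sample timestamps are invented for illustration.

```python
from datetime import datetime

# Assumed (hypothetical) line layout: "YYYY-MM-DD HH:MM:SS | message"
STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"

def slowest_gaps(log_lines, top=3):
    """Return the largest time gaps between consecutive log entries.

    Each result is (seconds, message_before, message_after),
    sorted with the longest gap first.
    """
    entries = []
    for line in log_lines:
        stamp, _, message = line.partition(" | ")
        when = datetime.strptime(stamp.strip(), STAMP_FORMAT)
        entries.append((when, message.strip()))

    gaps = []
    # Pair each entry with the one that follows it.
    for (t0, m0), (t1, m1) in zip(entries, entries[1:]):
        gaps.append(((t1 - t0).total_seconds(), m0, m1))

    gaps.sort(key=lambda gap: gap[0], reverse=True)
    return gaps[:top]

# Invented sample entries, for illustration only.
sample = [
    "2020-01-01 09:00:00 | Before Query XML",
    "2020-01-01 09:00:45 | After Query XML",
    "2020-01-01 09:00:47 | Before Start a Task Process",
    "2020-01-01 09:15:47 | After Start a Task Process",
]

for seconds, before, after in slowest_gaps(sample):
    print(f"{seconds:>7.0f}s  {before} -> {after}")
```

Whichever pair of messages tops the list brackets the action worth looking at first.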
As @v-tmasenko has already mentioned: does your workflow use Query XML often? This caused such poor performance in some of our workflows that we ended up redesigning forms to not have repeating sections or unbound controls.
Also, O365 actions are often very slow to execute, e.g. O365 Set Item Permissions. So if you have many of these, you could see the runtime blow out.
Hello, apologies for the late reply. I seem to have turned notifications off. Yes, the workflow does use Query XML, and you're right, it does take ages, but it seems to get past that point and then hang somewhere random. We are using some Set Item Permissions actions too. I've tried inserting logging and there's nothing suspicious happening. I'm wondering if perhaps SharePoint itself is running slowly on our tenant and that's causing the problem. At least I know I'm not alone! Thank you.
Update: I think all these replies apply to us. I've just picked the most fitting one as the answer. However, I think the main issue is that there is something wrong with our SharePoint tenant, or possibly with how we connect to it. If I find out what it is, I'll post another update. Thank you everyone for your help.
Actually, I have the same issue: the workflow seems to pause before starting up, and it takes minutes for the list item update to come through. I hate this latency.