My farm is up to 170 Nintex content databases and I have been remiss in cleaning up old workflow data and history. Part of the reason for this is that we have a legal requirement to preserve some workflow data, and try as I might I was never able to get a decisive answer to "Exactly what data needs to be preserved?" Performance has been off lately, so I decided to at least trim down the data from the maintenance workflows that we run. These are workflows that load a list from AD or change the title of a list item. Nothing business critical and nothing that will land me in jail if I delete it (kidding... I think).
You can tell how large your Nintex tables are by opening SQL Server Management Studio, expanding a Nintex content DB, and highlighting 'Tables'. I'm on SQL Server 2014, but this works in earlier versions as well. Now select 'Object Explorer Details' from the View menu, or simply hit the F7 key. Take a look at your WorkflowProgress and WorkflowInstance tables. Some of our progress tables had 5 or 6 million rows. Across all 170 DBs we were over 100 million rows. That's not a huge number to SQL Server, but given that our performance was starting to take a nosedive I saw no reason to keep data in there that wasn't critical to our business. Besides, if I didn't clean it up, it would eventually grow to a very large number of rows anyway. The WorkflowProgress table is especially prone to bloat since it stores a row for every action in every workflow instance of every workflow in every site on your farm.
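If you'd rather not click through 170 databases by hand, you can pull the row counts with a quick bit of PowerShell. This is just a minimal sketch, assuming the SqlServer (or SQLPS) module is available and that you have read access to the instance; 'SQLPROD01' is a placeholder instance name, not something from my environment.

```powershell
# Rough row counts for the Nintex tables across every user database on an instance.
# Assumes the SqlServer/SQLPS module; 'SQLPROD01' is a placeholder instance name.
Import-Module SqlServer -ErrorAction SilentlyContinue

$instance  = 'SQLPROD01'
$databases = Invoke-Sqlcmd -ServerInstance $instance -Query "SELECT name FROM sys.databases WHERE database_id > 4"

foreach ($db in $databases) {
    # Only Nintex content databases will have a WorkflowProgress table, so others return nothing.
    $query = @"
IF OBJECT_ID('dbo.WorkflowProgress') IS NOT NULL
    SELECT DB_NAME() AS DatabaseName,
           (SELECT COUNT_BIG(*) FROM dbo.WorkflowProgress) AS ProgressRows,
           (SELECT COUNT_BIG(*) FROM dbo.WorkflowInstance) AS InstanceRows
"@
    Invoke-Sqlcmd -ServerInstance $instance -Database $db.name -Query $query
}
```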
There are a couple of NWAdmin commands that clean this data up for you, and these are what do the work in the script: PurgeHistoryListData and PurgeWorkflowData. I learned while testing the script that you have to purge the workflow history list data first and then purge the workflow data. If you do it the other way around you will end up with orphaned history list data, as there will be no workflow instance information to work with when it comes time to kill the history. It sounds painfully obvious as I write this, but at the time I had no idea. No harm done, though. It was Dev.
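For a single site, the sequence looks something like the sketch below. The NWAdmin path (SharePoint 2013's 15 hive), the site URL, and the workflow name are all placeholders, and I'm only showing the switches I care about here; check NWAdmin's help for the full parameter list on your version.

```powershell
# Path to NWAdmin.exe - assumes the SharePoint 2013 (15) hive; adjust for your version.
$nwadmin = "$env:CommonProgramFiles\Microsoft Shared\Web Server Extensions\15\BIN\NWAdmin.exe"

# Purge the workflow history list data FIRST, while the instance data still exists...
& $nwadmin -o PurgeHistoryListData -siteUrl "http://sharepoint/sites/teamsite" -workflowName "Load Users From AD"

# ...THEN purge the workflow instance data. Reversing these leaves orphaned history rows.
& $nwadmin -o PurgeWorkflowData -url "http://sharepoint/sites/teamsite" -workflowName "Load Users From AD"
```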
The script will go through every site in your web application and attempt to purge the workflows in the array you pass to it; a rough sketch of that loop is below. The script could also be modified to take an array of sites if you wanted to work with a subset.
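I won't claim this matches my script line for line, but the overall shape is something like this sketch: walk every site in the web application and run both purge operations, in order, for each workflow name in the array. It assumes it runs on a SharePoint server with NWAdmin.exe installed; the web application URL and workflow names are placeholders.

```powershell
# Sketch of the purge loop: every site in the web application, every workflow in the array.
# Assumes a SharePoint server with Nintex's NWAdmin.exe; URLs and names below are placeholders.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$nwadmin      = "$env:CommonProgramFiles\Microsoft Shared\Web Server Extensions\15\BIN\NWAdmin.exe"
$webAppUrl    = "http://sharepoint"                            # placeholder web application URL
$workflowList = @("Load Users From AD", "Update Item Title")   # maintenance workflows to purge

$webApp = Get-SPWebApplication $webAppUrl
foreach ($site in $webApp.Sites) {
    foreach ($workflow in $workflowList) {
        # History list data first, then the workflow instance data (see the ordering note above).
        & $nwadmin -o PurgeHistoryListData -siteUrl $site.Url -workflowName $workflow
        & $nwadmin -o PurgeWorkflowData    -url     $site.Url -workflowName $workflow
    }
    $site.Dispose()   # release the SPSite object to keep memory use down on a long run
}
```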
It is very efficient in its use of CPU and RAM; I didn't notice a significant jump when I fired it up. It can take a while to complete if you have never cleaned out your old data. Mine took 18 hours to finish in my production environment and a few hours longer on my QA farm, but I have a lot of databases.
Enjoy!
P.S. After running this on the farm that had a combined row count of over 100 million in the WorkflowProgress tables, we ended up with just over 54 million rows across all 170 databases.