Should I create a collection for each field?

  • 25 February 2017
  • 6 replies
  • 2 views

Badge +9

A while ago I found a looping example on this site (I can no longer find it) that had me create a collection for each field. Each collection's value was then stored in a variable (text, integer, etc.) via a Collection Operation. Once the Collection Operations were complete, I did a Build String along with other processing using the various variables.

Last week Cassy Freeman opened my eyes to using only one collection variable. See Site Workflow - Document Review Date Approaching Reminders.

Per Cassy:

  1. Your collection variable will have zero or more IDs in it.
  2. You then do a For Each over the IDs in the collection variable.
  3. Inside your For Each, you take the current ID in the loop, re-query the list where ID = current ID, and pull back the information you need (e.g. assigned to, task name, etc.) into variables.
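
The three steps above can be sketched in Python as a rough analogy. This is not Nintex code: the in-memory `TASKS` list stands in for a SharePoint list, and the field names and the helpers `query_ids` and `query_item` are hypothetical names made up for illustration.

```python
# Hypothetical in-memory stand-in for a list; field names are illustrative only.
TASKS = [
    {"ID": 1, "AssignedTo": "Ann", "TaskName": "Review policy", "Due": True},
    {"ID": 2, "AssignedTo": "Bob", "TaskName": "Update manual", "Due": False},
    {"ID": 3, "AssignedTo": "Cy", "TaskName": "Archive report", "Due": True},
]


def query_ids(items, predicate):
    """Stand-in for the first query: return only the IDs of matching items."""
    return [item["ID"] for item in items if predicate(item)]


def query_item(items, item_id):
    """Stand-in for the query inside the loop: fetch one item by its ID."""
    return next(item for item in items if item["ID"] == item_id)


def build_reminders(items):
    reminders = []
    id_collection = query_ids(items, lambda item: item["Due"])  # step 1
    for current_id in id_collection:                            # step 2
        item = query_item(items, current_id)                    # step 3
        reminders.append(f"{item['AssignedTo']}: {item['TaskName']}")
    return reminders


print(build_reminders(TASKS))  # ['Ann: Review policy', 'Cy: Archive report']
```

Only one collection (the IDs) is ever stored; everything else is fetched fresh on each pass of the loop.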

Thus, my question is: when do you set a collection for each field vs. using the "Cassy method" (set a collection for one field, the ID, and then inside the For Each re-query the list filtered on the current ID)? It seems like the non-Cassy method creates a lot of overhead.


6 replies

Userlevel 6
Badge +15

This is an interesting question, because whenever I've done this, it has always been to gather up IDs, typically to cycle through them and run a workflow on items from other lists. I think what you are calling the "Cassy Method" is the method most folks use!

I am trying to think of when one might use a collection for other variables, and that approach just seems messy to me. Maybe someone else will have a good use case!

Also, I think this ought to be a discussion rather than a question, but we will wait and see what sorts of responses we get here.

‌  

Badge +16

Haha, the "Cassy method". Love that!

  

If you read the comments on this, you will see the "Jesse method" discussed, which is the alternative you describe above. Jesse went above and beyond (as always) to compare the performance of both approaches. I hope that helps.

Userlevel 5
Badge +12

Hi David,

Neither approach is right or wrong; it is how you apply the method you choose that counts. The question becomes: does the workflow architecture work efficiently and effectively for my goals? I might write up a blog post in far greater detail in the future, but consider this simple explanation:

The Query List method that stores IDs in a single collection will probably be a simpler workflow (less time-consuming) to build, because it needs fewer actions. However, it will be much more "chatty": each query inside the loop produces a call to the server asking it to return some information.

Conversely, a single call that returns items into multiple collections will be far more complex (and can easily bloat) in terms of the number of actions needed to cycle through the information, but far less "chatty", as it doesn't have to ping the server on each pass; the information is already stored in its collection variables.
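
To make the "chatty" trade-off concrete, here is a minimal Python sketch that counts server calls for both approaches. It is only an analogy: `CountingList` is a hypothetical stand-in for a list on the server, and the function names are made up for illustration.

```python
class CountingList:
    """Hypothetical stand-in for a server-side list that counts each query."""

    def __init__(self, rows):
        self.rows = rows
        self.calls = 0

    def query(self, fields, where=lambda row: True):
        self.calls += 1  # every query is one round trip to the "server"
        return [tuple(row[f] for f in fields) for row in self.rows if where(row)]


def ids_then_requery(server):
    """One collection of IDs, then one query per ID inside the loop."""
    names = []
    ids = [row[0] for row in server.query(["ID"])]
    for current_id in ids:
        matches = server.query(["Name"], where=lambda row: row["ID"] == current_id)
        names.append(matches[0][0])
    return names


def one_query_many_collections(server):
    """One query up front, split into one collection per field."""
    rows = server.query(["ID", "Name"])
    id_collection = [row[0] for row in rows]
    name_collection = [row[1] for row in rows]
    names = []
    for index in range(len(id_collection)):
        # Each field needs its own lookup by index; more fields, more actions.
        names.append(name_collection[index])
    return names


rows = [{"ID": i, "Name": f"Item {i}"} for i in range(1, 6)]
chatty = CountingList(rows)
quiet = CountingList(rows)

assert ids_then_requery(chatty) == one_query_many_collections(quiet)
print(chatty.calls, quiet.calls)  # 6 1
```

With five items, the first approach makes six calls (one for the IDs plus one per item) and the second makes one. Whether that difference matters in a real workflow depends on the factors listed next.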

There are lots of other factors involved: the amount of data we are talking about (rows), the number of columns (collections), the number of cycles the loop will go through, and whether you are on-prem or in the cloud, just to name a few. But I will leave it at that for now.

Thanks,
Mike

Userlevel 6
Badge +12

HA..."Jesse Method"

As Mike Matsako mentioned, it really depends on your need for the specific process you are building. Both approaches will give you the same result, just different ways of going about it.

I prefer to do one query and get all the information I need upfront. This saves me from having to make multiple queries later on in order to get the details that I need from the items. Also, while this will probably NEVER happen, if you are doing multiple passes to get information, it may change in between cycles.
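
The point about data changing between cycles can be illustrated with a small Python sketch. It is an analogy only: the dictionary stands in for the list on the server, and the mid-loop edit is simulated by hand.

```python
def requery_each_pass(store, ids):
    """Read the current value on every pass, as a per-ID query would."""
    seen = []
    for position, item_id in enumerate(ids):
        if position == 1:
            store[3] = "Changed"  # simulate someone editing item 3 mid-loop
        seen.append(store[item_id])
    return seen


def snapshot_upfront(store, ids):
    """Copy all values into a collection once, then loop over the copy."""
    status_collection = [store[item_id] for item_id in ids]
    seen = []
    for position, status in enumerate(status_collection):
        if position == 1:
            store[3] = "Changed"  # same edit, but the copy is unaffected
        seen.append(status)
    return seen


ids = [1, 2, 3]
live = requery_each_pass({1: "Open", 2: "Open", 3: "Open"}, ids)
frozen = snapshot_upfront({1: "Open", 2: "Open", 3: "Open"}, ids)
print(live)    # ['Open', 'Open', 'Changed']
print(frozen)  # ['Open', 'Open', 'Open']
```

Neither result is inherently better: the upfront query gives a consistent snapshot that may be slightly stale, while re-querying sees the latest values but can mix data from different moments.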

Cassy Freeman and I discussed this last year, and as she mentioned, I did do some performance testing. My results were that both approaches were close enough that neither was a clear winner. I would guess that if you were querying thousands of items, you would see a larger impact when using a design that queried every one multiple times rather than just once.

Again, like Mike said, there are a lot of factors that could affect it. Do what makes sense to you and your process; if you feel it creates more overhead than required, then don't do it. As long as you understand your process and how you built it (and can explain it to someone if needed), you are good to go!

Badge +9

Thanks, that makes sense. I realized that when I did the "Jesse" method I did not have a grasp of the why. I just did it. Now I understand both approaches (the Cassy and Jesse methods). This will help me going forward when determining my approach. Apologies also, as I realize I should have logged this as a discussion.

Userlevel 6
Badge +12

While it is a discussion, I would venture to say that someone has a similar question and is looking for some insight. This gives them something to apply to their own situation and make the best decision.
