
Introduction
Most of my professional career with Salesforce has been spent as a system integrator. For the past two years, I've been building ISV and OEM products, and that work has exposed me to some very interesting problems. In this article, I'll describe a technique I've come up with to overcome some of those problems. Note that this technique is also applicable outside of product development.
Recently we've encountered the CPU governor limit despite doing [almost] everything properly in our triggers. While troubleshooting, we've observed local code, workflows, and Process Builder flows executing in the same transaction as our triggers. Most of the time we're okay, but now and then the local environment is already so close to the CPU limit that even a small amount of extra processing puts the transaction over it. This effect multiplies when workflows or Process Builder flows perform same-object updates and cause the entire trigger stack to fire again for each workflow or process. We've also seen situations where the local code has an existing recursion loop in a trigger but happens to stay safely beneath the CPU/SOQL limits before the platform detects and terminates the loop. Adding even a simple trigger to that existing loop tends to put the transaction over a limit. Remember that local code and configuration share the CPU limit with any managed code that's installed. Luckily, SOQL and most other limits are not shared, so we're less likely to hit those.
The Basics
The first thing we all do as good tenants is minimize the number of queries and DML. The next thing we do is reduce the number of loops. Some of you may already be adding recursion protection and caching to triggers. In the next section, I’ll touch on the pitfalls with recursion protection. I think it’s a good thing, but it can also be dangerous. If you’re already adding caching, then I’d love to get some feedback on the techniques outlined here. I won’t be spending much time on ‘the basics’ – assuming that if you’re reading this, you’re already pretty familiar with how to write efficient and responsible code on the platform. What I don’t see very often in code I review is caching of the setup information. It’s just not something that most of us feel is necessary beyond writing efficient and responsible code.
Recursion Flags
Recursion flags have been a pretty standard way of dealing with this scenario so far – or at least the scenario of avoiding SOQL/DML/CPU limits due to recursion. However, there’s a point at which recursion flags become a minefield and something best left alone. I’ll describe a good place to use them (and one or two bad places).
The good place: say you're performing a bunch of work on the Account object, you have to check field values from the new and old maps, and you depend on managed triggers to do work with the fields you're checking. In this scenario, most of us would agree that the best (or only) place to put the logic is in the 'after update' event. Now let's say that, depending on those differences, you must make another update to the Accounts coming in on the array. You have to use an update statement, which will fire the trigger stack again. This is bad. A super simple (and in my opinion great) way to deal with this is a static recursion flag on your trigger class. Example below:
public with sharing class TriggerHandler {
    public static Boolean run = true;

    public static void doWork(Map<Id, sObject> oldMap, List<sObject> newList) {
        if (!run) {
            System.debug('Someone told me to not run!');
            return;
        }
        // Do setup
        // Do work
        // Do updates
        run = false;
        update recordsIWasForcedToUpdate;
        run = true;
    }
}
This way the trigger knows it’s doing something suboptimal and knows to shut itself off. But if there’s outside recursion going on, the trigger will still fire on every update and properly do the work it needs to do. In other words, it will evaluate the before/after diffs every time something else changes the fields, even within the same transaction.
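For completeness, here's roughly how a handler like that gets wired up. This is a minimal sketch of an after-update trigger on Account; the trigger name is mine, and your framework may route events differently:

trigger AccountTrigger on Account (after update) {
    // Trigger.oldMap and Trigger.new are already typed as Map<Id, sObject> and List<sObject>,
    // so they can be handed straight to the handler
    TriggerHandler.doWork(Trigger.oldMap, Trigger.new);
}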
Now for the bad way of protecting against recursion. That isn't to say there are no good ways other than the one I've described above, but there are pitfalls I'd like to point out (a minimal sketch of the pattern follows the list below). One of the most common patterns I see shared in the wild is a spin on the Domain pattern where the classes have built-in recursion protection using a count. The development staff can use a custom setting to tune how much recursion is allowed, and I usually see it set to a value of 1 or 2. With a setting of 2, the trigger is allowed to run a total of three times before it no longer runs. The thought is that we should allow one workflow update, and perhaps one unsafe recursive update from another trigger, but we should go to sleep after that.
Issues:
- Your code suddenly stops evaluating data after it runs a few times. What if the key update was from the 3rd recursive update and now users are complaining about missing data?
- Updates on more than 200 records can trick the count. This one is bonkers, but it happens, and a lot of people don't realize it. If you update 1,000 records in the Execute Anonymous console, which is a single transaction, your trigger on that object will fire 5 times. So what happens if your counter isn't smart enough to tell the difference between recursion and simply being run 5 times due to 5 chunks of 200 in one transaction? This one nets us bugs where only the first 600 records are evaluated by the trigger when you expected 1,000. I've also seen this happen when using the 'good' pattern above when the developer forgets to turn the trigger back on; then you only see the evaluation happen on the first 200 records.
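To make that second pitfall concrete, here's a minimal sketch of the counter-based pattern I'm describing. The class and variable names are mine, and a real implementation would read the limit from a custom setting rather than hard-coding it:

public with sharing class CountingTriggerHandler {
    // How much recursion is allowed; normally tuned via a custom setting
    private static Integer maxRuns = 2;
    private static Integer runCount = 0;

    public static void doWork(Map<Id, sObject> oldMap, List<sObject> newList) {
        // Pitfall: this check can't tell recursion apart from the platform chunking
        // one large DML into batches of 200. Five chunks of 200 in one transaction
        // exhaust the counter after the third chunk, so the last 400 records are
        // never evaluated.
        if (runCount > maxRuns) {
            return;
        }
        runCount++;
        // Do setup
        // Do work
        // Do updates
    }
}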
So… my argument is to use recursion flags for certain things, but not for everything. Instead, write super tight code and use caching techniques! Then the trigger can fire an unlimited number of times without chewing up too many resources each time.
Writing the Cache
Let’s use my earlier example, but add the setup phase.
public with sharing class TriggerHandler {
    public static Boolean run = true;

    public static void doWork(Map<Id, sObject> oldMap, List<sObject> newList) {
        if (!run) {
            System.debug('Someone told me to not run!');
            return;
        }

        // SETUP PHASE
        Map<String, List<Rule__c>> rulesMap = new Map<String, List<Rule__c>>();
        for (Rule__c rule : [SELECT Id, Value__c, Field__c, Operator__c FROM Rule__c]) {
            // if this is the first time we've seen this field, create a new inner list
            if (!rulesMap.containsKey(rule.Field__c)) {
                rulesMap.put(rule.Field__c, new List<Rule__c>());
            }
            // now that the inner list is good, add the rule every time
            rulesMap.get(rule.Field__c).add(rule);
        }

        // Do work
        // Do updates
        run = false;
        update recordsIWasForcedToUpdate;
        run = true;
    }
}
So, we have a new SOQL, some looping, and some key checking to build the map we’ll use in the working phase to check field values and figure out if we need to do work. While my example is not expensive, it does illustrate the point. My rules are solid, meaning they won’t change shape a lot once they’re created by the application admin. If my workflow field updates cause my trigger to fire twice, why would I query the Rules table twice?
If I’m in a batch and I’m updating 200,000 records, 10,000 records per chunk, do I really need my trigger to make 50 of the same query for every batch chunk? Wouldn’t it make more sense to do the following instead?
public with sharing class TriggerHandler {
    public static Boolean run = true;
    private static Map<String, List<Rule__c>> rulesMap;

    public static void doWork(Map<Id, sObject> oldMap, List<sObject> newList) {
        if (!run) {
            System.debug('Someone told me to not run!');
            return;
        }

        // SETUP PHASE
        if (rulesMap == null) {
            rulesMap = new Map<String, List<Rule__c>>();
            for (Rule__c rule : [SELECT Id, Value__c, Field__c, Operator__c FROM Rule__c]) {
                // if this is the first time we've seen this field, create a new inner list
                if (!rulesMap.containsKey(rule.Field__c)) {
                    rulesMap.put(rule.Field__c, new List<Rule__c>());
                }
                // now that the inner list is good, add the rule every time
                rulesMap.get(rule.Field__c).add(rule);
            }
        }

        // Do work
        // Do updates
        run = false;
        update recordsIWasForcedToUpdate;
        run = true;
    }
}
Now I'm only performing the query once, and I'm only running through the setup loop once. This adds up during big operations like the batch scenario above and yields significant time savings. It also improves the user experience; when users click on things, every millisecond counts. And it makes you far less likely to have customers complaining about managed packages hitting CPU or SOQL limits.
There is a tradeoff, though…
What’s the tradeoff?
There's no such thing as a free lunch. However, in my experience, memory (heap) is a lot cheaper than SOQL or CPU time. By storing the map in a static variable, it's held on the heap between trigger invocations for the rest of the transaction. Your mileage may vary, and you may decide the patterns I've described so far aren't worth it for you. Watch your CPU, watch your SOQL, and watch your heap.
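If you want to keep an eye on all three, the Limits methods make it cheap to log where a transaction stands. A minimal sketch you could drop into a handler while profiling:

// Log current consumption against the transaction limits for CPU, SOQL, and heap
System.debug('CPU:  ' + Limits.getCpuTime() + ' / ' + Limits.getLimitCpuTime() + ' ms');
System.debug('SOQL: ' + Limits.getQueries() + ' / ' + Limits.getLimitQueries() + ' queries');
System.debug('Heap: ' + Limits.getHeapSize() + ' / ' + Limits.getLimitHeapSize() + ' bytes');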
What about Platform Cache?
If you’re lucky enough to have access to an allocation of platform cache, or your customers do, then the same pattern can be used here. There are a few more considerations, but here’s the basic idea.
public with sharing class TriggerHandler {
    public static Boolean run = true;
    private static Map<String, List<Rule__c>> rulesMap;

    public static void doWork(Map<Id, sObject> oldMap, List<sObject> newList) {
        if (!run) {
            System.debug('Someone told me to not run!');
            return;
        }

        // SETUP PHASE
        // try for cache first
        if (CacheUtil.getIsCacheEnabled()) {
            if (rulesMap == null) {
                rulesMap = CacheUtil.getRulesMapFromCache();
            }
        }

        // if no cache, do it the old way
        if (rulesMap == null) {
            rulesMap = new Map<String, List<Rule__c>>();
            for (Rule__c rule : [SELECT Id, Value__c, Field__c, Operator__c FROM Rule__c]) {
                // if this is the first time we've seen this field, create a new inner list
                if (!rulesMap.containsKey(rule.Field__c)) {
                    rulesMap.put(rule.Field__c, new List<Rule__c>());
                }
                // now that the inner list is good, add the rule every time
                rulesMap.get(rule.Field__c).add(rule);
            }
            if (CacheUtil.getIsCacheEnabled()) {
                // We did the work to build it, let's cache it
                CacheUtil.putRulesMapToCache(rulesMap);
            }
        }

        // Do work
        // Do updates
        run = false;
        update recordsIWasForcedToUpdate;
        run = true;
    }
}
Here we just added the step of trying the cache first. If the cache can't give me my setup map, I build it the old way. The code assumes a utility class I haven't covered here that makes failing gracefully, storing to the cache, and retrieving from the cache easy. Emphasis on failing gracefully!
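I'm not covering that utility class in this article, but to give you an idea, here's a rough sketch of the three methods used above. Everything in it is an assumption on my part: the partition name is made up, and a production version would want tighter exception handling and cache-key management:

public with sharing class CacheUtil {
    // Hypothetical org cache partition and key; substitute your own allocation
    private static final String PARTITION_NAME = 'local.TriggerSetup';
    private static final String RULES_KEY = 'rulesMap';

    public static Boolean getIsCacheEnabled() {
        // Fail gracefully if the partition doesn't exist or has no capacity
        try {
            return Cache.Org.getPartition(PARTITION_NAME) != null;
        } catch (Exception e) {
            return false;
        }
    }

    public static Map<String, List<Rule__c>> getRulesMapFromCache() {
        try {
            return (Map<String, List<Rule__c>>) Cache.Org.getPartition(PARTITION_NAME).get(RULES_KEY);
        } catch (Exception e) {
            // Treat any cache failure as a miss and let the caller rebuild the map
            return null;
        }
    }

    public static void putRulesMapToCache(Map<String, List<Rule__c>> rulesMap) {
        try {
            Cache.Org.getPartition(PARTITION_NAME).put(RULES_KEY, rulesMap);
        } catch (Exception e) {
            // Caching is an optimization; never let a cache failure break the trigger
        }
    }
}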
Platform Cache has the added bonus that once the maps are in the cache, you need zero setup queries or loops, even the first time your code is hit during a transaction. Platform Cache is super cool in every way. Stay tuned for an article that summarizes my experience implementing Platform Cache in a managed package.
Conclusion
So there it is. It's such a simple thing, yet one I've almost never seen in the wild. It's also something I never considered necessary until these last few years, when I've experienced a much larger variety of problems. I'm sure there are others out there doing it already, and I'd love to see some other patterns.