I have done a lot of document processing and merging work for our HR team recently. The same approach applies to any team with standard documents that need dynamic content: invoices, estimates, part lists, etc.
In many cases, you don't control the documents, or the business wants some ability to manage them on their own. This inevitably leads to typos and other errors, so you wind up troubleshooting individual documents much more than you might like. That led me to build out a larger solution for managing documents as part of some processes, parts of which I'll share below.
Tagging Dynamic Fields
When I tag dynamic items, I generally put those values inside double square brackets:
- [[myDynamicValue]]
I then run Substitute() on my document(s) to replace those tags w/ the dynamic values. Pretty simple. You can choose whatever delimiter you'd like, just make sure it is something that isn't normally used in whatever document type you're working with.
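To make the idea concrete outside of Power Apps, here is a minimal Python sketch of a single tag replacement. The template text, the tag names, and the fill_tag helper are all made up for illustration; str.replace stands in for Substitute().

```python
# Hypothetical template and values, for illustration only.
template = "Dear [[staffFirstName]], your start date is [[startDate]]."


def fill_tag(document: str, tag_name: str, value: str) -> str:
    """Replace every occurrence of [[tag_name]] in document with value."""
    return document.replace("[[" + tag_name + "]]", value)


letter = fill_tag(template, "staffFirstName", "Dana")
letter = fill_tag(letter, "startDate", "2024-07-01")
print(letter)
```

A tag that does not appear in the document is simply left alone, which is also why a mistyped tag goes unnoticed unless you look for it.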
However, what isn't simple sometimes is finding if/when people have incorrectly edited a document and/or put in a wrong tag. Now I have a way that adds these automatically for people within an editor, but...users are users.
As I pointed out originally, lots of these documents are not necessarily edited/maintained by a developer, so having a way to catch errors/problems is good.
Finding All Tags in All Documents
Consider the following code:
Now I do hate using RegEx pattern matching because it is not something most people do daily, but this scenario is a good use case for it. The pattern matches anything of the form: [[anything in between these brackets]].
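As a rough illustration of that pattern (in Python rather than a Power Apps formula), the sketch below collects every distinct tag across a set of documents and records which documents use it. The document titles, the HTML, and the find_tags helper are hypothetical; the regular expression is one reasonable way to express "double brackets with something in between", not necessarily the one used in the app.

```python
import re

# Matches [[ ... ]] with at least one character that is not a square bracket.
TAG_PATTERN = re.compile(r"\[\[[^\[\]]+\]\]")

# Hypothetical documents keyed by title; real ones would come from the library.
documents = {
    "Offer Letter": "<p>Dear [[staffFirstName]] [[STAFF LAST NAME]],</p>",
    "Bonus Notice": "<p>[[staffFirstName]], your bonus is [[bonusAmount]].</p>",
}


def find_tags(docs: dict[str, str]) -> dict[str, list[str]]:
    """Map each distinct tag to the sorted titles of the documents using it."""
    found: dict[str, set[str]] = {}
    for title, html in docs.items():
        for tag in TAG_PATTERN.findall(html):
            found.setdefault(tag, set()).add(title)
    return {tag: sorted(titles) for tag, titles in sorted(found.items())}


all_tags = find_tags(documents)
for tag, titles in all_tags.items():
    print(tag, "->", titles)
```

Excluding square brackets inside the match keeps two adjacent tags from being swallowed as one long match.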
This shows me all tags in all documents so I can then make changes globally and/or detect tags that are not currently set up:
In this scenario I'm just going to figure out if I need to fix any non-conforming tags or find any I haven't configured yet for replacement. So I ran the following code to take these results and build out "fixes" for mistakes made by the original document editor: they included spaces and did everything in ALL CAPS when I expected no spaces and camel-case (e.g. [[STAFF LAST NAME]] vs. [[staffLastName]]).
NOTE: This is clunky code designed primarily to fix a specific problem I had vs. something you might be having but I included this for reference.
Again, this was all specifically to fix one person's mix-up and make it easy on myself to edit 40 unique documents w/ ~10-30 dynamic values per document. However, it is an example of how you could build out a process to quickly catch all tags, then make some decisions about what you want to do next.
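For reference, the ALL CAPS to camel-case conversion can be sketched in a few lines of Python. This is not the Power Apps code I ran; to_camel_tag and build_fixes are hypothetical helpers, and the rule for what counts as "non-conforming" (contains a space or is entirely upper case) is an assumption that matches the specific mix-up described above.

```python
def to_camel_tag(tag: str) -> str:
    """Turn '[[STAFF LAST NAME]]' into '[[staffLastName]]'."""
    words = tag.strip("[]").split()
    if not words:
        return tag
    first = words[0].lower()
    rest = [word.capitalize() for word in words[1:]]
    return "[[" + first + "".join(rest) + "]]"


def build_fixes(tags: list[str]) -> dict[str, str]:
    """Return {bad tag: fixed tag} for tags with spaces or in ALL CAPS."""
    fixes = {}
    for tag in tags:
        inner = tag.strip("[]")
        if " " in inner or inner.isupper():
            fixes[tag] = to_camel_tag(tag)
    return fixes


fixes = build_fixes(["[[STAFF LAST NAME]]", "[[staffFirstName]]", "[[YEAR]]"])
print(fixes)
```

Tags that already conform are skipped on purpose, since lower-casing a camel-case tag would break it.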
Fixing Erroneous Tags
I'm a fan of doing things incrementally, so I have a button that fixes all the values first and displays the results in a horizontal gallery, letting me review the documents before I commit them to my source library. This code fixes all of my documents in memory first:
The blue value (letterHTML in my code above) is the specific field where the source for my HTML document exists that contains the tags. So your code would reflect the name of whatever field contains the source document for your scenario.
Note that this goes field by field and replaces it in all documents one at a time.
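The same "fix in memory, review, then commit" step looks roughly like this in Python. The documents, the fixes table, and the apply_fixes helper are hypothetical; the dictionary values play the role of the letterHTML field, and nothing is written back to the source in this sketch.

```python
# Hypothetical source documents (title -> HTML containing tags).
documents = {
    "Offer Letter": "<p>Dear [[STAFF FIRST NAME]] [[STAFF LAST NAME]],</p>",
    "Bonus Notice": "<p>[[staffFirstName]], your bonus is [[BONUS AMOUNT]].</p>",
}

# Hypothetical bad tag -> good tag table.
fixes = {
    "[[STAFF FIRST NAME]]": "[[staffFirstName]]",
    "[[STAFF LAST NAME]]": "[[staffLastName]]",
    "[[BONUS AMOUNT]]": "[[bonusAmount]]",
}


def apply_fixes(docs: dict[str, str], tag_fixes: dict[str, str]) -> dict[str, str]:
    """Return corrected copies of the documents; the originals are untouched."""
    fixed = {}
    for title, html in docs.items():
        for bad, good in tag_fixes.items():
            html = html.replace(bad, good)
        fixed[title] = html
    return fixed


preview = apply_fixes(documents, fixes)
changed = [title for title in documents if documents[title] != preview[title]]
print(changed)
```

Keeping the corrected copies separate from the originals is what makes the review step possible before anything is saved.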
After that, I write out the temporary information to my document library on a different button once I'm sure I want to make these changes:
Doing Substitutions During Run Time
Now everything above was about managing your source documents, not doing the substitutions at run time. That part isn't quite so much fun. You'll ultimately have to map out which tag gets replaced w/ which value, and make sure each one is a legitimate value.
The simplest method is something akin to:
If you've got a simple set of tags that are fairly static, then it is best to not overthink it and of course just do a whole series of these. It might look stupid, but it is also very easy to troubleshoot for a novice.
NOTE: In most scenarios I do exactly what you see above. Don't overthink it if you don't have to.
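In Python terms, the "whole series of these" approach is just one replace per tag, in order. The staff record, its field names, and the template below are made up for illustration; the only real point is that each line handles exactly one tag and converts its value to text.

```python
from datetime import date

# Hypothetical record for one staff member; the field names are made up.
staff = {
    "firstName": "Dana",
    "lastName": "Reyes",
    "startDate": date(2024, 7, 1),
    "salary": 65000,
}

template = (
    "<p>Dear [[staffFirstName]] [[staffLastName]], "
    "you start on [[startDate]] at [[salary]].</p>"
)

# One replacement per tag: repetitive, but trivial to read and debug.
letter = template
letter = letter.replace("[[staffFirstName]]", staff["firstName"])
letter = letter.replace("[[staffLastName]]", staff["lastName"])
letter = letter.replace("[[startDate]]", staff["startDate"].isoformat())
letter = letter.replace("[[salary]]", f"${staff['salary']:,}")
print(letter)
```

If one tag comes out wrong, there is exactly one line to look at.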
That being said, say you're using dynamic mapping, where certain tags are replaced w/ certain values from a JSON data source (Oracle, SharePoint, Excel, etc.), and you want your business partners to control this (I mean... "want" is probably the wrong word). Then you might wind up creating something much more dynamic.
Let's imagine we've already started our app by mapping out all values in a JSON string to a variable where we've created a Collection of tag name (see my blog post here)
This gives us a basic Collection (allRecordsFromSPCol) w/ 3 columns holding a text, date, and numeric value. What's important is that the three columns in our Collection are named:
- SP Text Column
- SP Date Column
- SP Numeric Column
If we wanted to give our business partners the ability to map tags to fields, then we'd need some way to allow them to see/pick from the fields we'd mapped out. This cannot be done dynamically, as the JSON first needs to be mapped.
NOTE: That's not 100% accurate. We could do string parsing vs. mapping the JSON, but that is a mess of code that I don't want to scare people with just yet. You can see a snippet of me doing something like this in my blog post here under the section I Write Standalone Test Apps For Large Code Tests
Let's imagine we allow them to map fields to values globally for all documents. As they add/remove mappings, we'll add/remove rows to a Collection that includes the tag and the name of the column from our Collection. So let's first build out a map of all existing tags (based on the code above):
Now we have a Collection w/ two columns: tagValue (e.g. [[staffName]]) and mappedTo, which starts out as a blank string.
NOTE: We are assuming this is a one-to-one map. No combo matches unless you've included the code elsewhere to build those (e.g. combine full name from first/last).
We can now see any tag that doesn't have a field associated with it (i.e. mappedTo = "").
As the user maps tags to fields, our Collection gets updated:
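A rough Python sketch of that tag map, for readers who want to see the shape of the data: each row carries tagValue and mappedTo, exactly as described above, while the helper names (build_tag_map, unmapped_tags, map_tag) and the sample tags are hypothetical. The column names are the three from allRecordsFromSPCol.

```python
# Columns the user may pick from, as named in the Collection above.
AVAILABLE_COLUMNS = ["SP Text Column", "SP Date Column", "SP Numeric Column"]


def build_tag_map(tags: list[str]) -> list[dict[str, str]]:
    """One row per distinct tag, with a blank mappedTo to start."""
    return [{"tagValue": tag, "mappedTo": ""} for tag in sorted(set(tags))]


def unmapped_tags(tag_map: list[dict[str, str]]) -> list[str]:
    """Tags that have no column associated with them yet."""
    return [row["tagValue"] for row in tag_map if row["mappedTo"] == ""]


def map_tag(tag_map: list[dict[str, str]], tag: str, column: str) -> None:
    """Record the user's choice, rejecting columns that were never mapped out."""
    if column not in AVAILABLE_COLUMNS:
        raise ValueError(f"Unknown column: {column}")
    for row in tag_map:
        if row["tagValue"] == tag:
            row["mappedTo"] = column


tag_map = build_tag_map(["[[staffName]]", "[[startDate]]", "[[staffName]]"])
map_tag(tag_map, "[[staffName]]", "SP Text Column")
print(unmapped_tags(tag_map))
```

Restricting the choice to known columns is one way to keep a typo in the mapping screen from silently producing a tag that never gets replaced.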
In the end, you'll still have something similar to the simple code at the top of this section, except that it tests whether a particular field is mapped before doing the Substitute(). Of course, you might be processing multiple documents for multiple people all at the same time, so you'll wind up w/ something that looks much more complex but is still kind of simple: a ForAll() for all people and a ForAll() for all documents that then replaces each tag w/ its mapped value one at a time. It ultimately comes down to your preference for processing:
The following is a more standalone chunk of code showing how you could do this and look for various tags. I'm still assuming you're using a Collection tagMappingCol as your dynamic map that contains your crosswalk of tags -> Values:
The blue items are the only things that would change within each section that is doing the Substitute() code for each tag. It's just a copy/paste of each section w/ only those items changed.
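The overall shape (all people, all documents, one mapped tag at a time) can be sketched in Python as two nested loops around a per-tag replace. This is an illustration under assumptions, not the Power Apps code: the people records, the document, and the render helper are hypothetical, and tag_map mirrors the tagMappingCol crosswalk w/ its tagValue and mappedTo columns.

```python
# Hypothetical crosswalk of tags -> column names (blank means not mapped yet).
tag_map = [
    {"tagValue": "[[staffName]]", "mappedTo": "SP Text Column"},
    {"tagValue": "[[startDate]]", "mappedTo": "SP Date Column"},
    {"tagValue": "[[bonusAmount]]", "mappedTo": ""},
]

# Hypothetical records and source documents.
people = [
    {"SP Text Column": "Dana Reyes", "SP Date Column": "2024-07-01", "SP Numeric Column": 42},
    {"SP Text Column": "Lee Chen", "SP Date Column": "2024-08-15", "SP Numeric Column": 7},
]
documents = {
    "Offer Letter": "Hi [[staffName]], you start [[startDate]]. Bonus: [[bonusAmount]]",
}


def render(template: str, record: dict, mapping: list[dict[str, str]]) -> str:
    """Replace each mapped tag with the record's value; skip unmapped tags."""
    output = template
    for row in mapping:
        column = row["mappedTo"]
        if column and column in record:
            output = output.replace(row["tagValue"], str(record[column]))
    return output


letters = {}
for person in people:                      # all people
    for title, html in documents.items():  # all documents
        letters[(person["SP Text Column"], title)] = render(html, person, tag_map)

for key, text in letters.items():
    print(key, "->", text)
```

Unmapped tags are left in the output on purpose here, so they stay visible in a review instead of silently turning into blanks.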
Final Thoughts
This is tedious but not terrible once you've done it a few times. I really think you should consider STRONGLY if you want to allow the business to manage tags and value matches/replacement. It's a PiTA just from them screwing things up occasionally and not recognizing what they did wrong. However, it does sometimes come up as a request so I think it's worth having a model of how to pull it off.
You can do something similar to what I have above regardless, but not expose the mapping to them. Instead, set it up on an admin screen that you manage, and store the crosswalk in SharePoint or similar.
I think these kinds of projects can quickly get out of hand when doing large-scale document processing and generating PDFs from HTML source. The long-term changes required can really be crazy. For example, legal changes to documents might mean you need to save the source from prior editions of the application.
I myself have my system set up for several key fields that are used to designate which document applies to which process:
- Year
- Initiative (which is a made up codename for whatever it is that the HR/Legal teams are wanting to do)
- Scenario (which is program-specific, based on business logic)
I also include a "version" field anticipating that at some point they might want different versions operating at the same time, but it hasn't come up yet.
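As a small illustration of how those key fields can select a document, here is a Python sketch. The rows, the initiative codename, and the pick_template helper are all hypothetical, and picking the highest version when several match is purely an assumption, since multiple live versions haven't actually come up for me.

```python
# Hypothetical library rows; the values are made up for illustration.
library = [
    {"year": 2024, "initiative": "BlueHeron", "scenario": "newHire", "version": 1, "title": "Offer Letter v1"},
    {"year": 2024, "initiative": "BlueHeron", "scenario": "newHire", "version": 2, "title": "Offer Letter v2"},
    {"year": 2024, "initiative": "BlueHeron", "scenario": "rehire", "version": 1, "title": "Rehire Letter"},
]


def pick_template(rows: list[dict], year: int, initiative: str, scenario: str):
    """Return the matching row with the highest version, or None if no match."""
    matches = [
        row for row in rows
        if (row["year"], row["initiative"], row["scenario"]) == (year, initiative, scenario)
    ]
    return max(matches, key=lambda row: row["version"], default=None)


chosen = pick_template(library, 2024, "BlueHeron", "newHire")
print(chosen["title"] if chosen else "no document configured")
```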
This primarily is just a pattern that I've done multiple times at multiple organizations on multiple platforms over multiple decades. Dynamic document generation is just a thing that has remained a thing for a long time.
We'll see what AI can come up with on this, but I'm fairly certain I'll outlive the companies betting their futures on it.