Link Splits
Tip
If you haven’t already looked at Multidimensional Data, familiarizing yourself with the ideas presented there will go a long way towards helping you understand Link Splits.
What Is A Link Split?
Link Splits allow you to send a record to an additional Link.
The concept is simple, but the variety of new behaviors this opens up can take time to explore and understand. Let’s dig in.
What Problems Do Link Splits Solve?
Splitting One Row of Data Into Multiple Records
Sometimes you are given a single record but you would like some fields to be placed in one table, and other fields in a different table. Perhaps you have something like this:
{
"FirstName" : "Tim",
"LastName" : "Anchor",
"Company" : "Acme"
}
You’d like to take this record and turn it into an Account called “Acme” with a Contact beneath it called “Tim Anchor”.
Here’s how a Link Split will help. You would configure:
1. A Link writing to the Account object in Salesforce that has a mapping for Company -> Account.Name
2. A Link with no source and whose target is the Contact table, with mappings for FirstName and LastName, as well as for $ParentTarget.Id -> Contact.AccountId
3. A Link Split connecting Link #1 to Link #2
Here’s the flow: a record arrives at Link #1, which cherry-picks the Company field to make a new Account. Then the same record with all the same fields is delivered to Link #2. This Link doesn’t care about the Company field, and instead grabs FirstName and LastName so it can create a new Contact. But there’s a wrinkle: we want our new Contact to be related to our new Account!
That’s where $ParentTarget comes into play. Anytime a Link Split is used to pass a record into a new Link, that new Link has access to the FULL record from the previous Link, including what it looked like after transformations! So the second Link can work with the original version of the record, the post-transformation version of the record, or some mix of the two (like we have here). Mapping $ParentTarget.Id to Contact.AccountId will populate that relationship field the way we want it, and we’re happy with the result.
If you’d like to read more about $ParentTarget (and $ParentSource!), have a look at Schema.
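To make the flow above concrete, here is a minimal Python sketch of the two Links and the Link Split between them. It is only a model of the behavior described on this page, not Valence code: the function names, the fake Id format, and the choice to carry the parent result under a "$ParentTarget" key (borrowed from the inverted-record example later on this page) are all assumptions made for illustration.

```python
# Illustrative model only: every function here is hypothetical, not a Valence API.
import itertools

_fake_ids = itertools.count(1)


def run_account_link(source):
    """Link #1: map Company -> Account.Name and simulate the insert."""
    target = {"Name": source["Company"]}
    # Stand-in for the Id Salesforce would assign on insert (made-up format).
    target["Id"] = f"001-FAKE-{next(_fake_ids)}"
    return target


def split_to_next_link(source, parent_target):
    """Link Split: pass along the same source fields plus the previous Link's result."""
    record = dict(source)
    record["$ParentTarget"] = parent_target
    return record


def run_contact_link(record):
    """Link #2: map FirstName, LastName, and $ParentTarget.Id -> Contact.AccountId."""
    return {
        "FirstName": record["FirstName"],
        "LastName": record["LastName"],
        "AccountId": record["$ParentTarget"]["Id"],
    }


source = {"FirstName": "Tim", "LastName": "Anchor", "Company": "Acme"}
account = run_account_link(source)
contact = run_contact_link(split_to_next_link(source, account))
```

In this sketch the Contact ends up with an AccountId equal to the Id of the Account created one step earlier, which is the relationship the $ParentTarget.Id mapping is meant to establish.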
Tip
If you were wondering why Link #2 has no source defined, it’s because a Link run started by a Link Split doesn’t need a source Adapter: the previous Link has already supplied the records in the right format for Valence to work with! Sometimes you do want a data source on the second Link. Maybe you have a Link that fetches its own records, but occasionally you also send it records from another Link. That’s totally fine, and in fact a very powerful pattern that people actually use (scroll down to Routing Records To An Appropriate Handler to see this in action).
Breaking Down Multidimensional Data
As discussed in Multidimensional Data, a Link Split is a very handy way to take a record that has layers of information in it we care about, and peel one layer off at a time. This is especially important when writing records into Salesforce, which is unable to ingest and work with nested record data.
Let’s look at the example from that page again:
[
{
"Name" : "Acme",
"Website" : "acme.com",
"Contacts" : [
{
"FirstName" : "Jim",
"LastName" : "Jonston"
},
{
"FirstName" : "Samantha",
"LastName" : "McCay"
}
]
}
]
We want to bring this record into Salesforce and use a Link Split to process the information about the company and the information about contacts as separate efforts.
1. A Link whose source looks like the Acme record above and whose target is the Account object in Salesforce
2. A Link with no source and whose target is the Contact table
3. A Link Split connecting Link #1 to Link #2 and specifying the Contacts field as its “inner list”
What is an inner list?
This is a configuration option on the Link Split itself when you are setting it up. You can pick a single field that is an array as your inner list. You are telling Valence that, for the next Link, the records inside this field are the primary records you’re interested in working with. All of the outer stuff is essentially metadata from the perspective of these inner records, just some extra information that may or may not be useful.
If you’ve selected an inner list on your Link Split, Valence reshapes the records before handing them on.
When the Link Split passes the Account record down to Link #2, Valence inverts the records so they look like this:
[
{
"FirstName" : "Jim",
"LastName" : "Jonston",
"$ParentSource" : {
"Name" : "Acme",
"Website" : "acme.com"
},
"$ParentTarget" : {
"Id" : "0015600000Vf9A9AAJ"
}
},
{
"FirstName" : "Samantha",
"LastName" : "McCay",
"$ParentSource" : {
"Name" : "Acme",
"Website" : "acme.com"
},
"$ParentTarget" : {
"Id" : "0015600000Vf9A9AAJ"
}
}
]
This makes it easy to work with these inner records in the next Link, and it’s especially nice if that Link already existed and was already configured with mappings and transformations that expect a certain record shape!
For example, you could have one Link that knows how to turn a Person record shape into a Contact and that already consumes Person records on its own. Any number of other Links that process records with embedded Person data could then have splits that send that person info to the centralized Link that knows what to do with it.
Each record sent from the Link Split to the next Link has its field values and also a copy of the values from its original record, as shown above. If you’d like to better understand $ParentSource and $ParentTarget, have a look at Schema.
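The inversion described in this section can be sketched in a few lines of Python. This is a model of the record shapes shown above rather than Valence’s actual implementation; the function name invert_on_inner_list and its parameters are hypothetical.

```python
# Illustrative model only: invert_on_inner_list is a hypothetical helper, not a Valence API.
import copy


def invert_on_inner_list(parent_source, parent_target, inner_list_field):
    """Return one record per inner-list item, each carrying a copy of its parent's data."""
    # Everything on the outer record except the inner list itself becomes $ParentSource.
    outer = {k: v for k, v in parent_source.items() if k != inner_list_field}
    inverted = []
    for child in parent_source.get(inner_list_field, []):
        record = dict(child)
        record["$ParentSource"] = copy.deepcopy(outer)
        record["$ParentTarget"] = copy.deepcopy(parent_target)
        inverted.append(record)
    return inverted


acme = {
    "Name": "Acme",
    "Website": "acme.com",
    "Contacts": [
        {"FirstName": "Jim", "LastName": "Jonston"},
        {"FirstName": "Samantha", "LastName": "McCay"},
    ],
}

# The Id stands in for whatever Link #1 wrote to Salesforce for the Account.
records = invert_on_inner_list(acme, {"Id": "0015600000Vf9A9AAJ"}, "Contacts")
```

Each element of records has the same shape as the inverted JSON above: the contact’s own fields at the top level, with the outer Account fields under $ParentSource and the written Account under $ParentTarget.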
Note
There is no limit on the number of Link Splits that can be attached to a single Link, and there is no limit on the number of times the same record can be passed down a chain of Links and Link Splits.
Routing Records To An Appropriate Handler
Picture an external system that has tables for companies, people, and deals. You can call into each table to fetch all the records in it, but there’s no mechanism on the table itself to only fetch recent records. Instead, there is a special endpoint called “changesSince” and you pass it a timestamp to get all the records that have been modified since that timestamp.
Here’s the problem: since it’s every record that’s been modified, you’ll get back a mixed collection of companies, people, and deals all jumbled together. Not very convenient!
This is one way you could configure Valence to work with this external system:
1. Link that pulls from the Companies table and writes to Account (understands how to transform a Company record into an Account, with all the mappings and transformations that entails)
2. Link that pulls from the People table and writes to Contact (understands how to transform a Person record into a Contact, with all the mappings and transformations that entails)
3. Link that pulls from the Deals table and writes to Opportunity (understands how to transform a Deal record into an Opportunity, with all the mappings and transformations that entails)
4. Link that pulls from the changesSince endpoint but that does not have a target configured, just a data source
5. Link Split from #4 to #1, filtering out any record that isn’t a Company
6. Link Split from #4 to #2, filtering out any record that isn’t a Person
7. Link Split from #4 to #3, filtering out any record that isn’t a Deal
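The routing arrangement above can be modeled as a router that offers every record to each Link Split and lets each split’s filter decide whether to deliver it. The Python below is only a sketch of that idea: the "type" field on each record, the Link names, and the predicate-style filters are assumptions for illustration and do not reflect how the external system labels its records or how Valence filters are written.

```python
# Illustrative model only: the record shape, Link names, and filters are hypothetical.


def is_type(expected):
    """Build a filter that keeps only records of the given (assumed) type."""
    return lambda record: record.get("type") == expected


# One entry per Link Split leaving the router Link: (recipient Link, filter).
splits = [
    ("companies_link", is_type("company")),
    ("people_link", is_type("person")),
    ("deals_link", is_type("deal")),
]


def route(records, splits):
    """Offer every record to every split; deliver it wherever the filter accepts it."""
    delivered = {name: [] for name, _ in splits}
    for record in records:
        for name, keep in splits:
            if keep(record):
                delivered[name].append(record)
    return delivered


# A jumbled batch, like the mixed results the changesSince endpoint returns.
changes = [
    {"id": 1, "type": "company", "name": "Acme"},
    {"id": 2, "type": "person", "name": "Jim Jonston"},
    {"id": 3, "type": "deal", "name": "Big Deal"},
    {"id": 4, "type": "person", "name": "Samantha McCay"},
]

delivered = route(changes, splits)
```

Note that the router itself writes nothing anywhere; it only decides which secondary Link receives each record.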
Here are some takeaways from this setup:
A Link Split can be configured to select which records are delivered to the recipient Link; you do this with a special Filter that implements the LinkSplitFilter interface.
A Link doesn’t have to have a target configured; it doesn’t really make sense for the router Link to deliver records directly, since its job is to send each record to the right secondary Link.
It can be a good idea for secondary Links to keep their own data sources and continue to fetch records on their own.
Let’s talk about that third point a little more. Even though in this scenario you have access to a stream of recent changes, you might still need to be able to fetch the entire table of records. Certainly for your initial data load, but perhaps also at some interval as a way to reconcile the data and make sure everything exists in both systems. With this kind of design you’d think of the data stream router as more of a supplement or enhancement to the other Links, rather than the only way you get records from your external system. Maybe you schedule your three main Links to fetch the entire table once a week on the weekend, or biweekly, or once a month. Whatever makes sense for your business!
Sending Data to an Additional System
Sometimes you just want to send the same record to two places. Perhaps you want to send a copy of every record that comes into Salesforce over to your data warehouse to make it available for analytics.
Set up a Link Split, and configure the second Link to push out to the data warehouse external system.