Syncing Family Trees
This high-level guide explains how to synchronize a third-party family tree with a [CET] tree. It outlines key practices that help ensure data is merged accurately and conflicts are resolved. It assumes that the third-party software has successfully uploaded a [CET] tree and has stored Person IDs (pids), Conclusion IDs, and a timestamp marking the completion of the upload. For more information, see the Uploading a Tree to [CET] guide.
System Actors
The following system actors will be referenced throughout this guide:
- End-User: The user managing the third-party and [CET] trees.
- Collaborators: Invited users with rights to make changes to the [CET] tree. See [CET] Access Model for more detail.
- Third-party Software: The software integrating with the FamilySearch API. This software is responsible for running the sync process.
- FamilySearch API: The interface through which the third-party software reads and updates the [CET] tree.
Summary of the Process
The general process for synchronizing trees involves the following steps:
- Determine FamilySearch Changes: Determine what has changed in the FamilySearch Research Tree since the last time the applications were in sync.
- Determine Third-Party Tree Changes: Determine what has changed in the third-party tree since the last time the applications were in sync.
- Compare Changes: Identify differences and conflicts between the two datasets.
- Resolve Conflicts: Determine how to handle conflicting changes.
- Create Change Plan: Create a plan for applying changes to each tree.
- Apply the Changes: Apply changes to both trees.
Determining FamilySearch Changes
The sync process needs to retrieve the latest changes that have occurred in the [CET] tree. This is done by requesting data from the Tree Changes feed starting from the time or bookmark from the last sync or upload.
This step should result in data describing the changes to potentially be applied to the third-party tree.
Technical Implementation
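As a minimal sketch of reading a change feed from a stored bookmark, the loop below pages through results until the feed returns no more entries. The `fetch_page` callable and the page shape (`entries`, `bookmark`) are assumptions made for illustration. They are not the actual FamilySearch API response format, so adapt them to the real feed.

```python
from typing import Callable, List, Optional, Tuple


def collect_changes(
    fetch_page: Callable[[Optional[str]], dict],
    bookmark: Optional[str],
) -> Tuple[List[dict], Optional[str]]:
    """Read feed pages starting at `bookmark` until a page has no entries.

    `fetch_page` is a hypothetical callable that wraps the HTTP request.
    It is assumed to return {"entries": [...], "bookmark": "<next position>"}.
    Returns all collected entries plus the bookmark to store for the next sync.
    """
    entries: List[dict] = []
    while True:
        page = fetch_page(bookmark)
        if not page["entries"]:
            # Nothing new past this position, so this bookmark is the resume point.
            return entries, bookmark
        entries.extend(page["entries"])
        bookmark = page["bookmark"]
```

Passing the request logic in as a callable keeps the paging loop easy to test against a fake feed before pointing it at the real service.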
Determining Third-party Tree Changes
The sync process needs to determine the latest changes that have occurred in the third-party application since the time of the last sync. It is up to the third-party application to determine the method for tracking and gathering these changes.
The result of this step should be data describing the delta to potentially be applied to the [CET] tree.
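One possible way to track third-party changes is an append-only change log that is filtered by the last sync time. The sketch below assumes such a log exists. The `LocalChange` record and its `key` format are hypothetical, and an application may track changes quite differently.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class LocalChange:
    key: str               # hypothetical identifier, e.g. "<pid>/<field>"
    value: Optional[str]   # None marks a deletion
    changed_at: datetime


def changes_since(log: Iterable[LocalChange], last_sync: datetime) -> Dict[str, LocalChange]:
    """Return the most recent change per key made after `last_sync`."""
    latest: Dict[str, LocalChange] = {}
    # Process oldest first so that later changes overwrite earlier ones.
    for change in sorted(log, key=lambda c: c.changed_at):
        if change.changed_at > last_sync:
            latest[change.key] = change
    return latest
```

Collapsing the log to one entry per key means only the final state of each piece of data is considered for the delta.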
Comparing Changes
The sync process compares the information from the [CET] changes and the third-party application changes to determine if there are any conflicts. Conflicts could include changes that have been made to the same conclusion, relationship, or data element in both trees.
The result of this step should be data describing the pieces of information that have conflicts.
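The comparison can be sketched as a key intersection, assuming both sets of changes have already been reduced to a mapping from a shared key to the new value. The key format is hypothetical. Treating identical changes on both sides as non-conflicting is a design choice of this sketch, not a requirement of the API.

```python
from typing import Any, Dict, Tuple


def find_conflicts(
    remote: Dict[str, Any], local: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (remote_value, local_value)} for keys changed differently in both trees."""
    shared_keys = remote.keys() & local.keys()
    return {
        key: (remote[key], local[key])
        for key in sorted(shared_keys)
        if remote[key] != local[key]  # the same change on both sides is not a conflict
    }
```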
Resolving Conflicts
The sync process determines how conflicts should be resolved. Here are some options to consider when changes are made in both trees to the same pieces of data:
- Changes made on FamilySearch.org override third-party changes.
- Third-party changes override changes made on FamilySearch.org.
- Last change wins based upon the timestamp.
- Hold the conflicting changes for a user to determine which piece of data to keep.
The result of this step is data describing the deltas to be applied to either tree.
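The four options above could be expressed as a small policy function. This is a sketch under assumptions: each change is a dictionary with a comparable `timestamp` field, and the policy names are invented for illustration.

```python
from enum import Enum
from typing import Optional


class Policy(Enum):
    FAMILYSEARCH_WINS = "familysearch_wins"
    THIRD_PARTY_WINS = "third_party_wins"
    LAST_CHANGE_WINS = "last_change_wins"
    ASK_USER = "ask_user"


def resolve(remote: dict, local: dict, policy: Policy) -> Optional[dict]:
    """Return the winning change, or None when the conflict is held for the user."""
    if policy is Policy.FAMILYSEARCH_WINS:
        return remote
    if policy is Policy.THIRD_PARTY_WINS:
        return local
    if policy is Policy.LAST_CHANGE_WINS:
        if remote["timestamp"] == local["timestamp"]:
            return None  # a tie has no clear winner, so hold it for the user
        return remote if remote["timestamp"] > local["timestamp"] else local
    return None  # Policy.ASK_USER
```

Returning `None` for held conflicts lets the caller collect them and present them to the user in one pass.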
Create Change Plan
The sync process uses the information from previous steps to finalize the plan for how changes will be applied. Some software applications may choose to display the plan to the user for approval prior to executing the changes to either side.
The result of this step is a finalized plan for deltas to apply to each tree in the next step.
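A change plan can be sketched as two sets of deltas, one per tree, plus a list of conflicts still waiting on the user. The `ChangePlan` structure and the `"remote"`, `"local"`, and `"hold"` decision values are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ChangePlan:
    to_third_party: Dict[str, Any] = field(default_factory=dict)   # remote changes to copy locally
    to_familysearch: Dict[str, Any] = field(default_factory=dict)  # local changes to upload
    held: List[str] = field(default_factory=list)                  # conflicts awaiting the user


def build_plan(remote: Dict[str, Any], local: Dict[str, Any],
               resolutions: Dict[str, str]) -> ChangePlan:
    """Combine both deltas; `resolutions` maps a conflicting key to a decision."""
    plan = ChangePlan()
    for key in sorted(remote.keys() | local.keys()):
        if key not in local:
            plan.to_third_party[key] = remote[key]
        elif key not in remote:
            plan.to_familysearch[key] = local[key]
        elif remote[key] == local[key]:
            continue  # both trees already agree
        else:
            decision = resolutions.get(key, "hold")
            if decision == "remote":
                plan.to_third_party[key] = remote[key]
            elif decision == "local":
                plan.to_familysearch[key] = local[key]
            else:
                plan.held.append(key)
    return plan
```

Because the plan is plain data, it can be shown to the user for approval before anything is written to either tree.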
Applying Changes
The sync process applies changes to both trees. The third-party application tree is updated through its internal mechanisms for data management. The [CET] tree is updated through standard POST and DELETE operations to the appropriate endpoints.
Note: To change or delete a non-vital [CET] conclusion, a Conclusion ID is needed. The third-party application will need to determine how to best acquire the Conclusion IDs for this purpose.
Once the sync process finishes uploading changes to the [CET] tree, it should save an updated timestamp. This timestamp is the starting point for reading the Tree Change History the next time a sync is performed.
Other Considerations
Consider the following when writing the sync process.
Changes Outside of the Tree
The Tree Change History endpoint only provides information about changes to the tree data. Changes to data such as Memories, Source Descriptions, and Discussions are not included in the Tree Change History. For example, if a user changes the URL or citation text of a Source Description, that change will not appear in the Tree Change History.
The change history will include changes to Source References, Memory References, and Discussion References. This means that your application will be able to detect when a source is attached to a person or when a source is removed.
Source Descriptions, Memories, and Discussions do not have change history feeds, nor do they support ETags. If tracking changes to these resources is desired, you should fetch and store a copy of the resource every time you sync so that you can compare it the next time you sync.
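One way to compare resources between syncs is sketched below. It is a variation on storing a full copy: it stores a hash of each resource, which detects that something changed but not which field. The resource IDs and dictionary shape are assumptions for illustration. If the sync needs to report the difference, keep the full copy instead.

```python
import hashlib
import json
from typing import Dict, List


def fingerprint(resource: dict) -> str:
    """Hash a canonical JSON form so that key order does not affect the result."""
    canonical = json.dumps(resource, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_resources(previous: Dict[str, str], current: Dict[str, dict]) -> List[str]:
    """Return IDs whose fingerprint differs from the stored one, or that are new."""
    return sorted(
        resource_id
        for resource_id, resource in current.items()
        if previous.get(resource_id) != fingerprint(resource)
    )
```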
Sync Interruptions and Error Handling
The sync process may encounter system or connectivity problems that interrupt it. Consider how the process will resume an interrupted sync and how it will handle a sync that has been aborted.
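A resumable sync could record each completed operation in a checkpoint file, as in the sketch below. The operation IDs, the `apply_one` callable, and the JSON checkpoint format are all assumptions. A production process would also need to consider operations that partially succeeded.

```python
import json
from pathlib import Path
from typing import Any, Callable, Iterable, Tuple


def apply_with_checkpoint(
    operations: Iterable[Tuple[str, Any]],
    apply_one: Callable[[Any], None],
    checkpoint: Path,
) -> None:
    """Apply (op_id, op) pairs in order, skipping any already recorded as done."""
    done = set(json.loads(checkpoint.read_text())) if checkpoint.exists() else set()
    for op_id, op in operations:
        if op_id in done:
            continue  # finished during an earlier, interrupted run
        apply_one(op)  # may raise; the checkpoint still lists what completed
        done.add(op_id)
        checkpoint.write_text(json.dumps(sorted(done)))
    # Every operation succeeded, so the checkpoint is no longer needed.
    checkpoint.unlink(missing_ok=True)
```

Rerunning the function with the same operations after a failure continues from the first operation that did not complete.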
Changes Created by Upload and Sync
The Tree Change History provides changes that were made through FamilySearch.org and through the API, including changes made during sync operations. The sync process should take this into consideration when storing timestamps or Tree Change History bookmarks.
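One possible way to avoid re-processing the sync's own writes is to remember what the last sync wrote and drop matching feed entries. This is only a sketch: the `key` and `value` fields on feed entries are hypothetical, and the real feed may offer a better signal for identifying changes made by your application.

```python
from typing import Any, List, Set, Tuple


def drop_own_changes(feed_entries: List[dict], written: Set[Tuple[str, Any]]) -> List[dict]:
    """Remove feed entries that match (key, value) pairs written by the previous sync."""
    return [
        entry for entry in feed_entries
        if (entry["key"], entry["value"]) not in written
    ]
```

Values must be hashable for the set lookup to work, so complex values would need to be serialized first.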