Hello all,

I'm a PhD candidate at the London School of Economics and a member of the Church. I recently came across a working paper by Noah Dasanaike at Harvard describing a method for rapid, 'zero-shot digitization of historical documents with vision language models'. Put briefly, he managed to digitise approximately 1.8 million pages of Russian, Spanish, Filipino, Indian, and New Zealand records, including census, civil registration, and electoral roll records. The quality is roughly on par with records indexed by humans, all for a cost of roughly $2,300.

A benefit of the approach compared to traditional OCR is the ability of VLMs to contextualise a page, its format, and modes of writing (typeset, handwriting, etc.).

I am a big fan of the Full-Text Search function recently added to FamilySearch, and immediately thought of applying Dasanaike's approach to unlock additional records for Church members and others exploring their family history, particularly from areas where digitisation is limited compared to modern, English-language records.

This leads me to my question: does FamilySearch's API permit access to Full-Text Search, or would access be via the individual collections listed here? It's fine either way; I'm just curious about how best to get this set up for my own use and benefit, and to explore a larger use case for rapidly digitising usable genealogical records. I'm not looking to commercialise anything; I simply have a very keen interest from a social science, historical, and faith-based lens.

Below, I describe a practical replication of Dasanaike's approach that I carried out on family correspondence.

Practical Replication of Dasanaike for early 20th Century German Correspondence

I am in the process of replicating Dasanaike's approach in two different contexts. The first is digitising Canadian Royal Commission records (I won't bore you with what those are). The second is transcribing and translating personal family records and correspondence that were previously inaccessible due to cost or the quality of existing materials. As one example, my family and I have the correspondence between my German great-grandfather, who moved to Chile in 1914, and his father back in Hamburg. The handwriting was extremely faded and written in a script specific to northern Germany and, most importantly, none of us speak German. Prior attempts to look into translation revealed it would be a long wait (if done cheaply), cost-prohibitive (if done quickly), or somewhere in between.

Once I had set up Dasanaike's approach on a local machine, I was able to get a digital transcription of these letters in under five minutes, followed by a translation. I verified samples of the transcriptions with a German colleague, who said they could not have done better. It was very tender for our family to read great-grandpa Friedrich's letters home, describing periods of loneliness, followed later by telling his father about meeting his future wife (my great-grandmother), how much he loved her, and life in Chile.

Interestingly, the models were able to pick up when my great-grandfather mixed Spanish in with his German correspondence, or vice versa, including Chilean colloquial terms.

The cost for doing this across twenty letters (about fifty pages) of varying quality? About $2.50. The estimated cost for the Royal Commission project mentioned above is somewhere between $75 and $150 for tens of thousands of typed and handwritten documents, mixing English and French, spanning a few hundred years (the documents include records from both before and after 1867, when Canada became a country).
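
For a rough sense of scale, here is a back-of-the-envelope comparison of the per-page costs implied by the figures quoted in this post. It is only a sketch: the inputs are the approximate, rounded numbers given above (about $2,300 for about 1.8 million pages in the paper, and about $2.50 for about fifty pages of letters), and the variable and function names are made up for illustration.

```python
# Approximate figures quoted in this post; nothing here is independently measured.
paper_cost_usd = 2300.0      # reported total cost in Dasanaike's working paper
paper_pages = 1_800_000      # reported number of pages digitised
letters_cost_usd = 2.50      # my approximate cost for the family letters
letters_pages = 50           # roughly fifty pages across twenty letters


def cost_per_page(total_cost_usd: float, pages: int) -> float:
    """Return the average cost per page in US dollars."""
    return total_cost_usd / pages


paper_rate = cost_per_page(paper_cost_usd, paper_pages)
letters_rate = cost_per_page(letters_cost_usd, letters_pages)
ratio = letters_rate / paper_rate

print(f"Paper:   ${paper_rate:.4f} per page")    # Paper:   $0.0013 per page
print(f"Letters: ${letters_rate:.4f} per page")  # Letters: $0.0500 per page
print(f"Ratio:   {ratio:.0f}x")                  # Ratio:   39x
```

On these rounded numbers my small batch of letters cost noticeably more per page than the large-scale run in the paper. I have not investigated why, and with inputs this approximate the ratio should be read as an order-of-magnitude indication only.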
