Cheap Code

Mar 7

Reproducible, Not Reusable

4 Comments

I've been waiting for someone to jump on this and props to Louis Hyman for seeing it, doing it, and sharing it with the world.

The future of historical scholarship is going here: training grad students first in how the data collection/scanning works so they understand the issues/nuances/problems in the data sets, then training and discussion on code and stats, all converging on some kind of consensus on coding best practices. Perhaps that is already happening in pockets I'm unaware of (like the Summer Camp Louis leads).

Feels like the shift will be to history what GPS was to geography, or what computing was to astronomy... a kind of before/after moment where you can't make progress in the field without mastering the new tools.

2 Questions:

1) how long a lag before this process becomes standardized? Does it stay disaggregated and innovative, or do you think it will tend toward some AHA or other sub-disciplinary group trying to create best practices?

2) how much funding would it take to digitize all probate records? Thinking here of the lifetime of work Loren Schweninger undertook to get probate records in the US South. An army of graduate students could/should digitize the whole thing into OCR-ready files, right? Much of it is on microfilm already. Theoretically you could then unearth immensely valuable insights about economic mobility (both inter- and intragenerational, ethnic, racial, etc.), especially if you could overlay w/ immigration records, bankruptcy filings, etc. Who went up, down, sideways, and how that differed by geography, for instance.

The data collection and OCR files are a one-time effort that could be "reusable" for all scholars, and the individual code written on top of that would truly lead to new "repeatable" insights.

All this just to say, Bravo! Here is to more of this. I'm following this Substack w great interest. Thanks for sharing.


Louis Hyman

Mar 16

Thanks for the enthusiasm! Digitization is the real bottleneck. We will be publishing about digitization soon enough, after we get some legs under us. My plan is to disseminate lots of “how tos”. And so is the AHA. Stay tuned.


Colin Greenstreet

Apr 2 (edited)

If you send me your GitHub ID I can give you access to our history-skills-repository [https://github.com/ai-and-history-collaboratory/history-skills-repository]. Lots of How Tos. Also our historians-desktop thinking [https://github.com/ai-and-history-collaboratory/historians-desktop]

Colin Greenstreet

Apr 2 (edited)

Hi Louis. We should chat. Perfect example of quick vibe coding. Pretty close to one of my potential use cases [Estate and Probate Inventories] in my recent substack [https://generativelives.substack.com/p/history-vibe-coding]. You might want to check out some of the topics we are looking at in the AI and History collaboratory I convene [as a public historian], particularly our recent session on Cowork for Historians, in which we explored creating and using HISTORY-SKILLS.md files to encode workflows for historical research. [https://github.com/ai-and-history-collaboratory/ai-and-history-collaboratory]

Computational History
