<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Computational History]]></title><description><![CDATA[A clearinghouse for AI methods in History, where historians swap techniques and perspectives]]></description><link>https://computationalhistory.substack.com</link><image><url>https://substackcdn.com/image/fetch/$s_!Qigr!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187b411b-680b-41a2-b0f7-e0ea6f262a55_1024x1024.png</url><title>Computational History</title><link>https://computationalhistory.substack.com</link></image><generator>Substack</generator><lastBuildDate>Thu, 21 May 2026 13:10:27 GMT</lastBuildDate><atom:link href="https://computationalhistory.substack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Louis Hyman]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[computationalhistory@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[computationalhistory@substack.com]]></itunes:email><itunes:name><![CDATA[Louis Hyman]]></itunes:name></itunes:owner><itunes:author><![CDATA[Louis Hyman]]></itunes:author><googleplay:owner><![CDATA[computationalhistory@substack.com]]></googleplay:owner><googleplay:email><![CDATA[computationalhistory@substack.com]]></googleplay:email><googleplay:author><![CDATA[Louis Hyman]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Presidential Feelings]]></title><description><![CDATA[Using an LLM to measure executive vibes]]></description><link>https://computationalhistory.substack.com/p/presidential-feelings</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/presidential-feelings</guid><pubDate>Mon, 18 May 2026 13:10:55 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!usVL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff428831c-91cc-4b42-a799-c26e9605c254_4465x2367.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1><strong>Reading Between the Lines: What Happens When You Run 32,000 Presidential Speeches Through an Emotion Classifier</strong></h1><p>If you work in American political history, you probably have a good intuition about which presidents were combative, which were optimistic, and when American political rhetoric started to feel different. Using an <a href="%5Bj-hartmann/emotion-english-distilroberta-base%20%C2%B7%20Hugging%20Face%5D(https://huggingface.co/j-hartmann/emotion-english-distilroberta-base)">emotion classification Large Language Model</a> it is possible to check these intuitions in a systematic way against 3.9 million sentences of presidential speeches from the American Presidency Project (APP) archive. The results are often surprising, offering a quantitative window into the structural architecture of persuasion and the measurable drift of political culture.</p><h2><strong>The Method:</strong></h2><p>The technology underlying this analysis is <strong>DistilRoBERTa</strong>, a transformer-based language model. The model was trained on a massive corpus of human-annotated data. Thousands of sentences were labeled by human readers who reached a consensus on whether a line expressed anger, disgust, fear, joy, sadness, surprise, or remained emotionally neutral. Once trained, the model &#8216;reads&#8217; one sentence at a time, and assigns a probability that the sentence belongs to one of these categories.</p><p>The key analytical unit is the sentence rather than the document. This granularity is what gives the dataset its flexibility. It&#8217;s the difference between knowing a speech is &#8220;angry&#8221; and knowing precisely <em>where</em> in the speech the predominantly &#8216;angry&#8217; sentences concentrate, and how their concentration compares to different historical base-lines.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/subscribe?"><span>Subscribe now</span></a></p><h2><strong>The Macro Trend:</strong></h2><p>At the macro level, perhaps the most revealing variable is the proportion of sentences the model classifies as <strong>Neutral</strong>. It is useful to think of &#8220;Neutral&#8221;, as more than just a proxy for the levels of emotion generally, but also as a proxy for what we might term the &#8216;procedural tone of governance.&#8217; It represents factual reporting, policy detail, and institutional communication.</p><p>When we track this across 112 years, we see a measurable shift from the President as a &#8220;Chief Executive&#8221; (procedural) to the President as a &#8220;Communicator-in-Chief&#8221; (affective), as the media environment changed, and the expections surrounding presidential communication changed with it.</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4hyw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd11f7db-d1ef-445e-b11e-12c9e761c0ed_4770x2970.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4hyw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd11f7db-d1ef-445e-b11e-12c9e761c0ed_4770x2970.png 424w, https://substackcdn.com/image/fetch/$s_!4hyw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd11f7db-d1ef-445e-b11e-12c9e761c0ed_4770x2970.png 848w, https://substackcdn.com/image/fetch/$s_!4hyw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd11f7db-d1ef-445e-b11e-12c9e761c0ed_4770x2970.png 1272w, https://substackcdn.com/image/fetch/$s_!4hyw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd11f7db-d1ef-445e-b11e-12c9e761c0ed_4770x2970.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4hyw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd11f7db-d1ef-445e-b11e-12c9e761c0ed_4770x2970.png" width="1456" height="907" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dd11f7db-d1ef-445e-b11e-12c9e761c0ed_4770x2970.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:907,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:409507,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/198013070?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd11f7db-d1ef-445e-b11e-12c9e761c0ed_4770x2970.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4hyw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd11f7db-d1ef-445e-b11e-12c9e761c0ed_4770x2970.png 424w, https://substackcdn.com/image/fetch/$s_!4hyw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd11f7db-d1ef-445e-b11e-12c9e761c0ed_4770x2970.png 848w, https://substackcdn.com/image/fetch/$s_!4hyw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd11f7db-d1ef-445e-b11e-12c9e761c0ed_4770x2970.png 1272w, https://substackcdn.com/image/fetch/$s_!4hyw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd11f7db-d1ef-445e-b11e-12c9e761c0ed_4770x2970.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>The data reveals distinct plateaus:</p><ol><li><p><strong>The WWI-era Volatility (approx. 1913&#8211;1921):</strong> We see intense emotionality here, with neutrality dipping below 50% in 1918. However, we must be cautious: the volume of content in the APP database for this era is extremely sparse compared to more recent decades. It is also worth noting that the apparent &#8220;decline&#8221; in volume in the most recent years of the dataset is not a sign of quieter presidents, but of a shift towards social media platforms, which are not as well captured in the APP data.</p></li></ol><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pDEU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea38c08f-7e17-47a7-a223-293930bdf4f8_3565x1763.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pDEU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea38c08f-7e17-47a7-a223-293930bdf4f8_3565x1763.png 424w, https://substackcdn.com/image/fetch/$s_!pDEU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea38c08f-7e17-47a7-a223-293930bdf4f8_3565x1763.png 848w, https://substackcdn.com/image/fetch/$s_!pDEU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea38c08f-7e17-47a7-a223-293930bdf4f8_3565x1763.png 1272w, https://substackcdn.com/image/fetch/$s_!pDEU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea38c08f-7e17-47a7-a223-293930bdf4f8_3565x1763.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pDEU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea38c08f-7e17-47a7-a223-293930bdf4f8_3565x1763.png" width="1456" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ea38c08f-7e17-47a7-a223-293930bdf4f8_3565x1763.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:119304,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/198013070?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea38c08f-7e17-47a7-a223-293930bdf4f8_3565x1763.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pDEU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea38c08f-7e17-47a7-a223-293930bdf4f8_3565x1763.png 424w, https://substackcdn.com/image/fetch/$s_!pDEU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea38c08f-7e17-47a7-a223-293930bdf4f8_3565x1763.png 848w, https://substackcdn.com/image/fetch/$s_!pDEU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea38c08f-7e17-47a7-a223-293930bdf4f8_3565x1763.png 1272w, https://substackcdn.com/image/fetch/$s_!pDEU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea38c08f-7e17-47a7-a223-293930bdf4f8_3565x1763.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><ol start="2"><li><p><strong>The Mid-Century Consensus (late 1960s&#8211;1990s):</strong> Following the war, and the emotional Kennedy era, rhetoric settled into a high-neutrality plateau. For several decades, presidential speech remained largely deliberative and procedural, averaging around 60&#8211;65% neutrality.</p></li><li><p><strong>The Polarized Drift (1998&#8211;Present):</strong> Beginning with the Clinton impeachment and accelerated by 9/11, we see a sustained decline in neutrality that has never reversed. By the 2020s, we reached historical highs of emotionality. 2024 and 2025 represent the angriest years in the entire 112-year dataset, with anger scores doubling the long-run average.</p></li></ol><p>This 25-year trend in US presidential speech is not an isolated phenomenon. It mirrors findings from other researchers, such as <a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0276367">Rozado, Hughes, and Halberstadt (2022)</a>, who used the same model and found a similar longitudinal increase in emotionality and negativity in news headlines. The presidency data therefore offers an interesting corroboration to this broader picture of a polarizing information ecosystem.</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!usVL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff428831c-91cc-4b42-a799-c26e9605c254_4465x2367.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!usVL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff428831c-91cc-4b42-a799-c26e9605c254_4465x2367.png 424w, https://substackcdn.com/image/fetch/$s_!usVL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff428831c-91cc-4b42-a799-c26e9605c254_4465x2367.png 848w, https://substackcdn.com/image/fetch/$s_!usVL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff428831c-91cc-4b42-a799-c26e9605c254_4465x2367.png 1272w, https://substackcdn.com/image/fetch/$s_!usVL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff428831c-91cc-4b42-a799-c26e9605c254_4465x2367.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!usVL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff428831c-91cc-4b42-a799-c26e9605c254_4465x2367.png" width="1456" height="772" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f428831c-91cc-4b42-a799-c26e9605c254_4465x2367.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:772,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:184091,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/198013070?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff428831c-91cc-4b42-a799-c26e9605c254_4465x2367.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!usVL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff428831c-91cc-4b42-a799-c26e9605c254_4465x2367.png 424w, https://substackcdn.com/image/fetch/$s_!usVL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff428831c-91cc-4b42-a799-c26e9605c254_4465x2367.png 848w, https://substackcdn.com/image/fetch/$s_!usVL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff428831c-91cc-4b42-a799-c26e9605c254_4465x2367.png 1272w, https://substackcdn.com/image/fetch/$s_!usVL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff428831c-91cc-4b42-a799-c26e9605c254_4465x2367.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h2><strong>Emotional Fingerprints:</strong></h2><p>Aggregating data by administration produces what I call &#8220;emotional fingerprints,&#8221; distinct profiles that capture rhetorical personality and governing context.</p><p>The &#8220;Joy&#8221; category is particularly revealing. To an emotion classifier, &#8220;Joy&#8221; encompasses national pride, optimism, and reassurance. Presidential rhetoric is fundamentally aspirational; even in times of crisis, the president is expected to provide a celebratory vision of national resilience.</p><p>Republicans consistently employ more joyful language (15.3%) than Democrats (13.4%). Republican rhetoric has traditionally leaned into national optimism. Conversely, Democrats register higher levels of angry language (3.2% vs 2.6%). This is suggestive perhaps of a rhetorical style that more often frames policy problems as injustices demanding confrontation and reform.</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dU7r!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe34e5a5-eff2-4086-9225-cc15d26280a5_4123x2981.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dU7r!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe34e5a5-eff2-4086-9225-cc15d26280a5_4123x2981.png 424w, https://substackcdn.com/image/fetch/$s_!dU7r!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe34e5a5-eff2-4086-9225-cc15d26280a5_4123x2981.png 848w, https://substackcdn.com/image/fetch/$s_!dU7r!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe34e5a5-eff2-4086-9225-cc15d26280a5_4123x2981.png 1272w, https://substackcdn.com/image/fetch/$s_!dU7r!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe34e5a5-eff2-4086-9225-cc15d26280a5_4123x2981.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dU7r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe34e5a5-eff2-4086-9225-cc15d26280a5_4123x2981.png" width="1456" height="1053" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fe34e5a5-eff2-4086-9225-cc15d26280a5_4123x2981.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1053,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:238897,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/198013070?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe34e5a5-eff2-4086-9225-cc15d26280a5_4123x2981.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dU7r!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe34e5a5-eff2-4086-9225-cc15d26280a5_4123x2981.png 424w, https://substackcdn.com/image/fetch/$s_!dU7r!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe34e5a5-eff2-4086-9225-cc15d26280a5_4123x2981.png 848w, https://substackcdn.com/image/fetch/$s_!dU7r!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe34e5a5-eff2-4086-9225-cc15d26280a5_4123x2981.png 1272w, https://substackcdn.com/image/fetch/$s_!dU7r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe34e5a5-eff2-4086-9225-cc15d26280a5_4123x2981.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>The data makes familiar reputations concrete, and is suprising in some cases. The data seems to support the &#8220;No Drama Obama&#8221; moniker, as the most neutral modern president with the lowest anger scores in the recent era. G. W. Bush&#8217;s bouncy, theological optimism shines through, with the highest joy levels of any modern president (20.4%), a consistent strategy of optimism even as fear spiked measurably after 9/11. To my suprise, despite Biden&#8217;s message of unity, his rhetoric registers the highest anger scores of any modern Democrat, a reflection perhaps of governing under conditions of extreme polarization rather than his personal style.</p><h2><strong>The Power of Abstraction</strong></h2><p>Using AI models to classify discourse at scale in this way is no replacement for close reading, nor should we get too bogged down in the reductionism of transforming sentences into scores. What this kind of data does facilitate is a remarkable range of perspectives, from the sentence level, to macro-historical-emotional trends in presidential discourse. I believe the ability to traverse these different scales of analysis, and make valid, data-based comparisons between speeches, presidents, and periods, really augments our traditional close-reading. No human reader could detect structural patterns across 32,000 speeches consistently, or measure the long-run drift of political culture with any precision. This data offers us a quantitative window into the structural architecture of persuasion, the measurable drift of political culture toward emotionality, and the affective fingerprint of individual presidents&#8217; rhetoric.</p><p>The technology underlying this analysis is, in one sense, unremarkable: a general-purpose language model fine-tuned on a diverse corpus that includes Twitter posts, Reddit discussions, student essays, and television dialogue. But it is precisely this generality that makes it powerful. The same transformer architecture that classifies emotion can be trained to tag parts of speech, identify named entities, detect policy frames, extract claims, or track the spread of metaphors. The versatility and adaptability of AI Large Language Models is precisely what makes them such a transformative technology, and they hold tremendous potential for historians and historical research.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/subscribe?"><span>Subscribe now</span></a></p><p></p><div><hr></div><p>Are you working on computational history? Do you want to share it? Please reach out.</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/presidential-feelings?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/presidential-feelings?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p></p><div><hr></div><p><strong>Under the Hood</strong></p><p>What makes this project possible is the Huggingface repository that contains LLM models for anyone to use. Want to try it out? Here is code you can run on nearly any computer. If you need help getting it started, just copy the code below into Claude or ChatGPT and ask for help (you will need to install two packages called <code>transformers</code> and <code>torch</code>). A longer version that explains the code is below as well. </p><p>The code can be <em>very</em> easily adapted to run over an excel spreadsheet, a document collection, or nearly any other data source. Just ask an LLM to adapt it to that purpose. The LLM will even generate the code to make your own graphs! </p><p>The hardest part in computational history is knowing what is possible.</p><p><strong>Code</strong></p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;4df0f7ad-a291-4cee-a593-41c85d20be73&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python"># Created: 2026-05-16 11:02
# Purpose: Run local emotion classification using Hugging Face Transformers.

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="j-hartmann/emotion-english-distilroberta-base",
    top_k=None
)

texts = [
    "I am thrilled about this project.",
    "I feel anxious and exhausted.",
    "This makes me furious.",
    "The results were unexpected."
]

for text in texts:
    print("\nTEXT:", text)

    scores = classifier(text)[0]

    for item in sorted(scores, key=lambda x: x["score"], reverse=True):
        print(f"{item['label']:&gt;10}: {item['score']:.3f}")</code></pre></div><p><strong>Here are what the results of this test code look like:</strong></p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;903b1ad6-0342-4bf3-8bb0-4932f17924e7&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">TEXT: I am thrilled about this project.
       joy: 0.977
  surprise: 0.014
   neutral: 0.004
     anger: 0.002
      fear: 0.001
   sadness: 0.001
   disgust: 0.001

TEXT: I feel anxious and exhausted.
      fear: 0.993
   sadness: 0.002
   neutral: 0.001
  surprise: 0.001
     anger: 0.001
       joy: 0.001
   disgust: 0.000

TEXT: This makes me furious.
     anger: 0.981
   neutral: 0.007
   disgust: 0.005
      fear: 0.003
   sadness: 0.002
  surprise: 0.002
       joy: 0.000

TEXT: The results were unexpected.
  surprise: 0.947
   neutral: 0.023
       joy: 0.017
     anger: 0.005
      fear: 0.004
   sadness: 0.003
   disgust: 0.002</code></pre></div><p></p><p><strong>Code with Explainer (for the novice):</strong></p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;2e2c5727-02c5-4120-919d-eef9dbb2bbba&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext"># Created: 2026-05-16 11:30
# Purpose:
# Run a local emotion-classification model using Hugging Face Transformers.
#
# -------------------------------------------------------------------
# FIRST-TIME SETUP
# -------------------------------------------------------------------
#
# 1. Create a Python virtual environment (recommended):
#
#    python -m venv emotions
#
#
# 2. Activate the virtual environment:
#
#    Mac/Linux:
#    source emotions/bin/activate
#
#    Windows:
#    emotions\Scripts\activate
#
#
# 3. Install required libraries:
#
#    pip install torch transformers
#
#
# 4. Run the script:
#
#    python emotions_test.py
#
#
# -------------------------------------------------------------------
# WHAT THIS SCRIPT DOES
# -------------------------------------------------------------------
#
# This script:
#
# 1. Loads a pretrained emotion model from Hugging Face
# 2. Downloads the model automatically the first time it runs
# 3. Stores the model in a local Hugging Face cache
# 4. Runs emotion classification on example sentences
# 5. Prints emotion probabilities sorted from highest to lowest
#
#
# -------------------------------------------------------------------
# WHAT THE MODEL PREDICTS
# -------------------------------------------------------------------
#
# The model predicts emotions such as:
#
# - joy
# - sadness
# - anger
# - fear
# - surprise
# - disgust
# - neutral
#
#
# -------------------------------------------------------------------
# IMPORTANT NOTES
# -------------------------------------------------------------------
#
# First run:
# - downloads model files from the internet
# - may take 1&#8211;5 minutes depending on internet speed
#
# Later runs:
# - load the model from local cache
# - start much faster
#
# Hugging Face cache location on Mac/Linux:
#
#    ~/.cache/huggingface/
#
#
# -------------------------------------------------------------------
# IMPORTS
# -------------------------------------------------------------------


# Import the Hugging Face "pipeline" helper.
#
# A pipeline is a high-level wrapper that:
#
# - loads the tokenizer
# - loads the neural network model
# - prepares the text
# - runs inference
# - formats the output
#
# This makes it possible to run modern NLP models
# in only a few lines of code.
from transformers import pipeline


# -------------------------------------------------------------------
# LOAD MODEL
# -------------------------------------------------------------------


# Create a text-classification pipeline.
#
# "text-classification" tells Transformers
# what kind of NLP task we want to perform.
#
# model="..." specifies which pretrained model to use.
#
# top_k=None tells the model to return ALL emotion scores,
# not just the highest-scoring emotion.
#
# On the first run:
# - Hugging Face downloads model files
# - files are cached locally
#
# Later runs load directly from cache.
classifier = pipeline(
    "text-classification",
    model="j-hartmann/emotion-english-distilroberta-base",
    top_k=None
)


# -------------------------------------------------------------------
# INPUT TEXTS
# -------------------------------------------------------------------


# Example texts to classify.
#
# The model estimates the emotional tone of each sentence.
texts = [
    "I am thrilled about this project.",
    "I feel anxious and exhausted.",
    "This makes me furious.",
    "The results were unexpected."
]


# -------------------------------------------------------------------
# RUN CLASSIFICATION
# -------------------------------------------------------------------


# Loop through each sentence.
for text in texts:

    # Print the original input text.
    print("\nTEXT:", text)

    # Run emotion classification.
    #
    # The pipeline returns a list of results.
    #
    # Since we pass only one sentence at a time,
    # we take the first item using [0].
    scores = classifier(text)[0]

    # Sort emotion scores from highest to lowest.
    #
    # Each item looks like:
    #
    # {
    #     "label": "joy",
    #     "score": 0.98
    # }
    #
    # The lambda function tells Python
    # to sort using the "score" value.
    for item in sorted(scores, key=lambda x: x["score"], reverse=True):

        # Print:
        # - emotion label
        # - probability score rounded to 3 decimals
        #
        # Example:
        #
        #       joy: 0.982
        print(f"{item['label']:&gt;10}: {item['score']:.3f}")</code></pre></div>]]></content:encoded></item><item><title><![CDATA[Beyond Cherry-Picking: Scaling Historical Arguments]]></title><description><![CDATA[Why Historians Must Enter the Age of Big Claims]]></description><link>https://computationalhistory.substack.com/p/beyond-cherry-picking-scaling-historical</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/beyond-cherry-picking-scaling-historical</guid><dc:creator><![CDATA[Jo Guldi]]></dc:creator><pubDate>Mon, 04 May 2026 12:46:01 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!zPjd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bffca1c-a38e-41f4-a45b-6f1e9a7b4e30_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Chris Phillips is absolutely right: statistics help historians think about how unique any given example is. They push us against cherry-picking, against the temptation to elevate the exceptional case, and against a subtler form of presentism in which we search the archive for a shiny precursor that mirrors today&#8217;s mood or movement. Those habits all have their place. But if we take seriously the older ambition, historia magistra vitae, then history also aspires to say something about what is normal.</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/beyond-cherry-picking-scaling-historical?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/beyond-cherry-picking-scaling-historical?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p><br><br>In that respect, historians are no longer alone. Political scientists like <a href="https://scholar.harvard.edu/chenoweth">Erica Chenoweth</a> have assembled longue dur&#233;e datasets of nonviolent movements in order to generalize about their effectiveness over time. (Her conclusion: nonviolence wins more often than violence, though with important exceptions.) <a href="https://peterturchin.com/">Peter Turchin</a>, working with historians of the ancient world, has built the <a href="https://seshatdatabank.info/">Seshat: Global History Databank</a>, comparing technologies, wars, empires, and religions across millennia to produce arguments about the origins of cities, kingship, and even monotheism. <a href="https://fooledbyrandomness.com/">Nassim Nicholas Taleb</a>, whom I dined with this week, has compiled his own longue dur&#233;e dataset of wars and casualties in order to test claims about whether violence is declining over time.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zPjd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bffca1c-a38e-41f4-a45b-6f1e9a7b4e30_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zPjd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bffca1c-a38e-41f4-a45b-6f1e9a7b4e30_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!zPjd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bffca1c-a38e-41f4-a45b-6f1e9a7b4e30_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!zPjd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bffca1c-a38e-41f4-a45b-6f1e9a7b4e30_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!zPjd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bffca1c-a38e-41f4-a45b-6f1e9a7b4e30_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zPjd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bffca1c-a38e-41f4-a45b-6f1e9a7b4e30_1536x1024.png" width="427" height="284.7644230769231" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3bffca1c-a38e-41f4-a45b-6f1e9a7b4e30_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:427,&quot;bytes&quot;:2327688,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/196322146?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bffca1c-a38e-41f4-a45b-6f1e9a7b4e30_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zPjd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bffca1c-a38e-41f4-a45b-6f1e9a7b4e30_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!zPjd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bffca1c-a38e-41f4-a45b-6f1e9a7b4e30_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!zPjd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bffca1c-a38e-41f4-a45b-6f1e9a7b4e30_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!zPjd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bffca1c-a38e-41f4-a45b-6f1e9a7b4e30_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><br>These are not my methods. They may not be yours either. As a historian trained in social and cultural approaches, I want to know about individual lives, about moods and propaganda, about the lived experience and interpretation of war, not merely the number of dead or the count of technologies. There is, in these datasets, often too little of the texture that historians are trained to value.<br><br>And yet I admire these projects deeply. I admire their scale. I admire their willingness to enter public debate on questions that matter right now. Taleb began counting war dead in order to argue with <a href="https://en.wikipedia.org/wiki/The_Better_Angels_of_Our_Nature">Steven Pinker&#8217;s The Better Angels of Our Nature</a>. That conversation has drawn in historians as well, my coauthor <a href="https://scholar.harvard.edu/armitage/home">David Armitage</a> among them, though from a different angle, asking how categories like &#8220;civil war&#8221; themselves have been historically constructed and deployed rather than simply counted.<br><br>What unites Chenoweth, Turchin, and Taleb is their determination to generalize about war and peace, violence and nonviolence, questions that have always been central to historical inquiry. They offer clear, even binary answers to contested claims. Is nonviolence increasing? Chenoweth says yes, particularly in the modern period. Are wars becoming less stochastic? Taleb says no. If this is where our collective understanding of society is being formed, historians cannot afford to be absent.<br><br>What, then, makes historians different?<br><br>It is not a lack of rigor. It is not an allergy to numbers. It is our omnivorousness. Other disciplines pride themselves on agility, on the mastery of mathematics or philosophy. (I think, for instance, of <a href="https://marginalrevolution.com/">Tyler Cowen</a> celebrating the reach of economics.) But no discipline has the breadth of history at its best: statistics and mathematics alongside philosophy, the cultural turn&#8217;s engagement with art and meaning, social history&#8217;s grounding in linguistics and lived experience.<br><br>In my own work on the history of political economy through text mining, I have tried to build on questions from political economy and social history alike, using a sensitivity to memory encoded in language that depends on both historical linguistics and corpus linguistics. If the last wave of digital humanities was built on NLP, the LLM promises something more ambitious: the possibility of merging datasets like wages (as Louis Hyman has been documenting) with the arguments, experiences, and nuances embedded in historical texts.<br><br>The challenge is how to do that without losing sensitivity to individual lives, how to move from large-scale datasets to arguments about trends like the rise of nonviolence without flattening culture, decision, and imagination into mere counts.<br><br>Historians will not become economic historians overnight. But many are already motivated by questions, about capitalism, conflict, governance, that implicitly place them in dialogue with Pinker, Chenoweth, and Turchin. The question is whether we can engage those datasets without surrendering what we do best.<br><br>One answer lies in what I have called Critical Search (developed in The Dangerous Art of Text Mining). In practice, Critical Search treats datasets not as endpoints but as indices of change. Peaks, anomalies, and concentrations become invitations to investigate. The historian identifies a pattern, then zooms in, modeling individuals or events through text mining, and finally returning to close reading at the moment when everything begins to look different.<br><br>A concrete example: in my work on environmental rhetoric in Congress, I found that members of Congress routinely referred to environmentalists as &#8220;zealots,&#8221; &#8220;academicians,&#8221; and &#8220;radicals.&#8221; Counting these phrases allowed me to answer Phillips&#8217; question: how widespread were these denunciations? The answer was: not very. Roughly 90% of such attacks were produced by just six members of Congress, and overwhelmingly by one figure, <a href="https://www.senate.gov/artandhistory/history/common/generic/Featured_Bio_Stevens.htm">Ted Stevens</a>.<br><br>At that point, the project pivoted. The data identified Stevens not as a cherry-picked case, but as a statistically grounded exemplar. The next step was to zoom in: to read his speeches, trace his career, and understand his role as a defender of oil pipelines and a central voice opposing environmentalism from the 1970s through the 1990s and beyond. The result was a history that could sustain a general claim, environmentalism faced sustained attack in Congress, while grounding that claim in the detailed study of a particular actor and his evolving rhetoric.<br><br>This is how political and social history scale. Not by abandoning the case study, but by selecting it rigorously.<br><br>LLMs now make it possible to extend this method far beyond parliamentary debates. They allow historians to draw on longue dur&#233;e datasets, wars, casualties, nonviolent movements, wages, and to model change over time in ways that engage directly with the Pinkers, Turchins, and Talebs of the world. But crucially, they also allow us to use those datasets to identify moments of exception: the unusually violent, the unexpectedly peaceful, the early adopters of nonviolence.<br><br>Those moments, in turn, become the basis for historical explanation. They let us test the limits of quantitative claims by showing what is hidden until we zoom in. They offer a way to tell large-scale stories without surrendering detail. And they provide a more rigorous alternative to cherry-picking, not by abandoning selection, but by disciplining it.<br><br>That, I think, is the historian&#8217;s reply to statistics: not resistance, but integration on our own terms.</p><div><hr></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">If you learned something, please subscribe and share</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/beyond-cherry-picking-scaling-historical?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/beyond-cherry-picking-scaling-historical?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Hooray! Post-Fordism Is Finally Here!]]></title><description><![CDATA[AI, Monopoly Capitalism, and Open-weight Models]]></description><link>https://computationalhistory.substack.com/p/hooray-post-fordism-is-finally-here</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/hooray-post-fordism-is-finally-here</guid><dc:creator><![CDATA[Louis Hyman]]></dc:creator><pubDate>Wed, 29 Apr 2026 12:46:06 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!j6Mr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4ac97ba-2da9-4c05-80e8-fa90b4498fc5_1388x642.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>While I want this substack to mostly be about show-and-tell, I am still an economic historian. Often I am drawn into conversations about &#8220;what AI means&#8221; and so I thought it would be useful to be clear.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p><p>As I see this debate, this question of our age, there are two main questions that history can shed some light on.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> </p><ol><li><p>Is AI a complement or a substitute for labor? That is, will it increase demand for and the productivity of workers, or decrease it?</p></li><li><p>Will AI be controlled by the few or be accessible to the many?</p><p></p></li></ol><p><strong>A Complement or a Substitute?</strong></p><p>Consider a some of the most important technologies of the past 200 years.</p><p>When I am asked about what <a href="https://en.wikipedia.org/wiki/Automation">automation</a> might look like, I inevitably discuss agriculture. Roughly all of our ancestors were farmers and approximately none of us today are. Yet we still eat bread made from wheat. That shift is possible because of automation.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!j6Mr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4ac97ba-2da9-4c05-80e8-fa90b4498fc5_1388x642.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!j6Mr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4ac97ba-2da9-4c05-80e8-fa90b4498fc5_1388x642.png 424w, https://substackcdn.com/image/fetch/$s_!j6Mr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4ac97ba-2da9-4c05-80e8-fa90b4498fc5_1388x642.png 848w, https://substackcdn.com/image/fetch/$s_!j6Mr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4ac97ba-2da9-4c05-80e8-fa90b4498fc5_1388x642.png 1272w, https://substackcdn.com/image/fetch/$s_!j6Mr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4ac97ba-2da9-4c05-80e8-fa90b4498fc5_1388x642.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!j6Mr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4ac97ba-2da9-4c05-80e8-fa90b4498fc5_1388x642.png" width="1388" height="642" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c4ac97ba-2da9-4c05-80e8-fa90b4498fc5_1388x642.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:642,&quot;width&quot;:1388,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!j6Mr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4ac97ba-2da9-4c05-80e8-fa90b4498fc5_1388x642.png 424w, https://substackcdn.com/image/fetch/$s_!j6Mr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4ac97ba-2da9-4c05-80e8-fa90b4498fc5_1388x642.png 848w, https://substackcdn.com/image/fetch/$s_!j6Mr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4ac97ba-2da9-4c05-80e8-fa90b4498fc5_1388x642.png 1272w, https://substackcdn.com/image/fetch/$s_!j6Mr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4ac97ba-2da9-4c05-80e8-fa90b4498fc5_1388x642.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The <a href="https://en.wikipedia.org/wiki/Threshing_machine">mechanical thresher</a>, used to process wheat, was a substitute for the most backbreaking work of the harvest. But it also enabled more land to be cultivated, and that land was cultivated more efficiently, allowing for greater harvests. Mechanization of the farm, like the thresher, turned the American Midwest into the breadbasket of the world.</p><p>Those displaced farmers found work on railroads, moving all that. And those jobs, according to people at the time, were a kind of liberation from the raw animal labor of threshing. On net, it created demand for more workers at better wages in work more fit for people than beasts. For those that remained farmers, they found other higher-value work to be done. On a farm, there is always more work to do.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pnu1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4bdd62-c6cb-4ee7-a236-1d1879236802_600x793.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pnu1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4bdd62-c6cb-4ee7-a236-1d1879236802_600x793.jpeg 424w, https://substackcdn.com/image/fetch/$s_!pnu1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4bdd62-c6cb-4ee7-a236-1d1879236802_600x793.jpeg 848w, https://substackcdn.com/image/fetch/$s_!pnu1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4bdd62-c6cb-4ee7-a236-1d1879236802_600x793.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!pnu1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4bdd62-c6cb-4ee7-a236-1d1879236802_600x793.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pnu1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4bdd62-c6cb-4ee7-a236-1d1879236802_600x793.jpeg" width="284" height="375.35333333333335" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b4bdd62-c6cb-4ee7-a236-1d1879236802_600x793.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:793,&quot;width&quot;:600,&quot;resizeWidth&quot;:284,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pnu1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4bdd62-c6cb-4ee7-a236-1d1879236802_600x793.jpeg 424w, https://substackcdn.com/image/fetch/$s_!pnu1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4bdd62-c6cb-4ee7-a236-1d1879236802_600x793.jpeg 848w, https://substackcdn.com/image/fetch/$s_!pnu1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4bdd62-c6cb-4ee7-a236-1d1879236802_600x793.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!pnu1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4bdd62-c6cb-4ee7-a236-1d1879236802_600x793.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">By Ransomes, Sims &amp; Jefferies Ltd - Ransomes, Sims &amp; Jefferies Ltd. advertising poster, The Museum of English Rural Life, The University of Reading, UK, c.1875, Public Domain, <a href="https://commons.wikimedia.org/w/index.php?curid=31386254">https://commons.wikimedia.org/w/index.php?curid=31386254</a></figcaption></figure></div><p></p><p>The failure, then and now, is to think farmers were only threshers. That was one part of their jobs. Today, our work, for most people, is also a bundle of tasks. Workers then and now could and can focus on parts of their job that are of higher value. And in a new economy, new tasks in new industries will be created. Many of the jobs that we do today (web designer, UI expert) were simply unimaginable in 1850. That is a good thing.</p><p>Consider now the <a href="https://en.wikipedia.org/wiki/Assembly_line">assembly line</a>. I&#8217;m sure you all know about the staggering increases in productivity that come from the <a href="https://en.wikipedia.org/wiki/Division_of_labour">division of labor</a>. If you take my class in industrial history, you would learn deeply about the story of the automobile. With the assembly line, and no other change in technology, car assembly went from 12 and a half hours to about 30 minutes (once they worked out the kinks). Did this reduce the demand for workers? No. It reduced the price of cars. And that increased the demand for workers, who eventually could demand even higher wages through <a href="https://en.wikipedia.org/wiki/Trade_union">unionization</a>.</p><p>It is important here to realize that better tools don&#8217;t make us get paid worse. They generally make us get paid more. Why? Because the tool, without the person, is useless. Even for today&#8217;s most cutting-edge AIs, that is true. It can code, but it can only code what I imagine it to code. It can draw, but only what I imagine it to draw. That is true for AIs as it was true for the thresher.</p><p>So, I would offer that AI will create more growth, more abundance. In the long run, all growth comes from higher productivity.</p><p>I would add one more piece to this story. Economic inequality has worsened since roughly 1970. It has worsened, therefore, not in the industrial era, but the digital era. I have <a href="https://www.nytimes.com/2023/04/22/opinion/jobs-ai-chatgpt.html">argued elsewhere</a> that this happened because for decades we did not use computers as tools of automation but as glorified typewriters (and then as televisions). Our productivity did not increase, especially to justify the expense of computers. Economists have debated for decades now over the lack of increase in productivity that came with the &#8220;digital age&#8221; of computing, but it is simple. We don&#8217;t use them as computers. Now we can. </p><p>For the first time now, normal people with their normal problems can use their computers to solve and automate their problems. AI can write code. AI can automate their tedium. The digital age did not bring any gains because it had no yet arrived. We were living through the last gasp of the industrial economy.</p><p>It is now here.</p><p>This technology will unleash unimaginable productivity gains. It will level the playing field between coders and the rest of us. Coders will lose their jobs, to be sure, but for the rest of us, the bundle of workplace tasks will become much better.</p><p>And truthfully, the demand for <em>real</em> computer scientists will probably increase in the era of vibe-coding. Computer science itself is a bundle of skills, of which coding is just one. The more important skill &#8211; software and data architecture &#8211; will only <em>increase </em>in demand as the usefulness of software expands.</p><p><strong>Monopoly or Market?</strong></p><p>Isn&#8217;t all this talk about AI well and good, but professor, won&#8217;t it be expensive? Won&#8217;t large corporations like OpenAI just monopolize it? This fear of a &#8220;<a href="https://en.wikipedia.org/wiki/Monopoly">monopoly</a>&#8220; is that it would become a price-gouger, a tax on the rest of the economy. I do not think this fear is reasonable.</p><p>The threats of monopoly are overstated, not simply because such monopolies in the last century never persisted for more than a few years (either because of regulation or competition), but because, on the ground now, AI doesn&#8217;t seem like it is heading in that direction.</p><p>More than people realize, the software is in a competitive market, and at the same time, the hardware is coming down in price.</p><p><strong>Software</strong></p><p>Most people have heard of <a href="https://en.wikipedia.org/wiki/ChatGPT">ChatGPT</a> some have heard of <a href="https://en.wikipedia.org/wiki/Claude_(language_model)">Claude</a>, and they both cost money to use. But there are a range of AIs that are <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source</a> and can be freely used. Just go to <a href="https://en.wikipedia.org/wiki/Hugging_Face">huggingface.co</a>m (which is a silly name) and you can download hundreds of different <a href="https://en.wikipedia.org/wiki/Large_language_model">large language models</a>. Now, is the current version of ChatGPT better than most? Sure. But it isn&#8217;t actually better than the best open weight large language model, <a href="https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro">DeepSeek-V4-Pro-Max</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8ClH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268135af-a53b-4fff-8a0d-568f697e5bcc_1252x416.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8ClH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268135af-a53b-4fff-8a0d-568f697e5bcc_1252x416.png 424w, https://substackcdn.com/image/fetch/$s_!8ClH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268135af-a53b-4fff-8a0d-568f697e5bcc_1252x416.png 848w, https://substackcdn.com/image/fetch/$s_!8ClH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268135af-a53b-4fff-8a0d-568f697e5bcc_1252x416.png 1272w, https://substackcdn.com/image/fetch/$s_!8ClH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268135af-a53b-4fff-8a0d-568f697e5bcc_1252x416.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8ClH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268135af-a53b-4fff-8a0d-568f697e5bcc_1252x416.png" width="1252" height="416" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/268135af-a53b-4fff-8a0d-568f697e5bcc_1252x416.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:416,&quot;width&quot;:1252,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8ClH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268135af-a53b-4fff-8a0d-568f697e5bcc_1252x416.png 424w, https://substackcdn.com/image/fetch/$s_!8ClH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268135af-a53b-4fff-8a0d-568f697e5bcc_1252x416.png 848w, https://substackcdn.com/image/fetch/$s_!8ClH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268135af-a53b-4fff-8a0d-568f697e5bcc_1252x416.png 1272w, https://substackcdn.com/image/fetch/$s_!8ClH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268135af-a53b-4fff-8a0d-568f697e5bcc_1252x416.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>To run this model, you need very powerful hardware. Less powerful models (with lower memory requirements) can run on my Mac, <a href="https://computationalhistory.substack.com/p/on-the-virtue-of-small-ai">as I wrote about earlier this week. </a></p><p>The larger point is this: the open-source models are already nearly as good as the closed-sourced models. And the open-source models of today are much, much better than the closed-sourced models of a year ago. So if you are willing to sacrifice a few months of progress, you can have the best models for free right now.</p><p>Even the <a href="https://docs.ollama.com/integrations/claude-code">best agentic coding tools</a> can rely on open-source models. Open-source groups are pushing back on other kinds of possible monopoly moats, like <a href="https://docs.langchain.com/oss/python/concepts/memory">agentic memory</a>. Perhaps some other feature of AI will emerge that enables a monopoly, like a god-like <a href="https://en.wikipedia.org/wiki/Artificial_general_intelligence">artificial general intelligence</a> (AGI), but barring that apotheosis, competition seems to be the most likely path.</p><p><strong>Hardware</strong></p><p>The question then is who has access to the chips? Right now, the <a href="https://en.wikipedia.org/wiki/Graphics_processing_unit">GPUs</a> you need to run the big models, like the new <a href="https://en.wikipedia.org/wiki/DeepSeek">DeepSeek</a>, are tens of thousands of dollars.</p><p>Here we come back to whether or not you think hardware will be a barrier to using the LLMs.</p><p>Historically, chip power has increased even as chip prices fall. That has been the case for decades; it is called <a href="https://en.wikipedia.org/wiki/Moore%27s_law">Moore&#8217;s Law</a>: the number of transistors per integrated circuit doubles every two years for a constant price. For <a href="https://en.wikipedia.org/wiki/Central_processing_unit">CPUs</a>, there has been some fear that it would no longer be possible, as we are starting to come up against basic physics. For GPUs, which power AI, that is not true. In fact, it is the opposite. In the last decade <a href="https://en.wikipedia.org/wiki/Nvidia">NVIDIA</a> GPUs have sped up even faster than Moore&#8217;s Law. What does this mean? If they only follow Moore&#8217;s Law, AIs that currently need $10,000 chips will be exponentially cheaper in ten years&#8212;only $300.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a> But they will be cheaper than that. GPU operations per dollar are doubling in more like 18 months instead of two years so that $10,000 GPU would only cost $98.</p><p>The retort&#8212;&#8220;but won&#8217;t the models be bigger?&#8221;&#8212;misses what is happening. Models are becoming smaller and more efficient. The models today are the dumbest and largest they will ever be.</p><p>Moreover, with the rise of <a href="https://en.wikipedia.org/wiki/Apple_silicon">Apple silicon</a> chips with integrated memory, like the M5, we can easily run AI locally on consumer-grade hardware. Those AI aren&#8217;t frontier models like <a href="https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro">DeepSeek-V4-Pro-Max</a> but they are shockingly good. You can take a big model and make it much smaller&#8212;&#8220;<a href="https://en.wikipedia.org/wiki/Quantization_(signal_processing)">quantize it</a>&#8221;&#8212;and still retain a large fraction of its capacity. These capacities will only grow in the coming years. You will not need a data center for everyday work.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a></p><p><strong>Post-Fordist Computing and Long-Tail Digital Markets</strong></p><p>Put together we come to a very different picture of what the digital age will be. The industrial age required massive investments to build the factories to make the products that were in demand. In the digital age, in contrast, the factories to build digital products will be made by the AI on your laptop. That is not inequality. That is equality.</p><p>The physical products of the Fordist industrial age were made for the mass market. In contrast, the digital products of the <a href="https://en.wikipedia.org/wiki/Post-Fordism">post-fordist</a> digital age will be <a href="https://en.wikipedia.org/wiki/Long_tail">long-tail</a> products. I don&#8217;t need to make mass market products; I can make them for a small niche, or just for myself.</p><p>Rather than fostering inequality, AI, then, is a great equalizer. To make products for a global market you don&#8217;t need a billion-dollar factory. You just need a laptop. That is astonishing.</p><p>That said, it will not be all sunshine and rainbows. Will AI solve the inequities of capitalism or its reliance on externalities as a source of primitive accumulation?<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a>  Probably not. </p><p>But at the same time, <em>AI is not a normal technology</em> in that it has the potential to radically undermine many of the tendencies to concentrate capital that we have seen in the industrial age. We have been automated out of work before, that is nothing new, but it has always concentrated capital in the hands of the few. For the first time, there is potentially an alternative path forward. </p><p>AI will bring the digital age out of the hands of the coders. AI will not widen the gap&#8212;it will bridge it. Its ubiquity will mean that AI will be a tool that nearly all of us will be able to use in our daily work, which will make ordinary people more productive and prosperous.</p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/hooray-post-fordism-is-finally-here?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Did you find something interesting? Share with your socials or a friend</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/hooray-post-fordism-is-finally-here?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/hooray-post-fordism-is-finally-here?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/subscribe?"><span>Subscribe now</span></a></p><p></p><p></p><p></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>This essay is based on a friendly public debate from fall 2025 between me and my amazing colleague (and AI legend) Rama Chellappa at Johns Hopkins University. He is not in any way responsible for my half-baked ideas. He was far more pessimistic. </p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Perhaps the third question is energy and ecology. I don&#8217;t find this debate particularly interesting because the answer is so clear: disallow data centers from connecting to the legacy grid. Require them to use non-carbon energy sources. This regulation would push AI firms to innovate and lower costs in green technology like nuclear, solar, wind, and geothermal. Data centers can be located anywhere, and should be located in places where green energy is abundant, not suburban Virginia.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>Our World in Data, &#8220;Share of Agriculture in Total Employment,&#8221; based on Berthold Herrendorf, Richard Rogerson, and &#193;kos Valentinyi, &#8220;Growth and Structural Transformation,&#8221; in Handbook of Economic Growth, vol. 2B, ed. Philippe Aghion and Steven N. Durlauf (Amsterdam: Elsevier, 2014), 855-941; U.S. historical series from Susan B. Carter et al., eds., Historical Statistics of the United States: Earliest Times to the Present, Millennial Edition (New York: Cambridge University Press, 2006), table Ba814-830, and U.S. Bureau of Economic Analysis, National Income and Product Accounts, table 6.8, &#8220;Persons Engaged in Production by Industry,&#8221; accessed April 27, 2026, https://ourworldindata.org/grapher/share-of-agriculture-in-total-employment.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>The 1840, 1850, and 1860 labor-force estimates include enslaved workers. They should be read as all workers by sector. Susan B. Carter and Richard Sutch, &#8220;Labor in the United States, 1800-2000,&#8221; Historical Statistics of the United States, Millennial Edition Online, working paper version, University of California, Riverside, 2004, 10-12, https://economics.ucr.edu/papers/papers04/04-03.pdf.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>That number is shocking. Here is the math. 5 doublings (2-year Moore&#8217;s Law over 10 years): $10,000 / 2^5 = $10,000 / 32 = $312.50 &#8594; &#8220;about $300&#8221;</p><p>6.67 doublings (18-month doubling over 10 years): 120 months / 18 = 6.67; 2^6.67 &#8776; 101.6; $10,000 / 101.6 &#8776; $98.4</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>A distinction here between &#8220;training&#8221; compute and &#8220;inference&#8221; computer would be useful. Training compute enables the creation of new frontier models. That kind of power requires a data center. Inference compute, where you just run a model, can be done locally. Moreover, the &#8220;fine-tuning&#8221; of a model, where you build a small &#8220;adapter&#8221; to put in front of a big model, can be done in a few hours on a Mac M4 Max right now.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>I am constantly struck by how the latent resources&#8212;the externalities&#8212;of the digital age (documents, data) compare to the latent resources of the industrial age (coal, aluminum).</p></div></div>]]></content:encoded></item><item><title><![CDATA[On The Virtue of Small AI]]></title><description><![CDATA[Ollama and Hugging Face]]></description><link>https://computationalhistory.substack.com/p/on-the-virtue-of-small-ai</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/on-the-virtue-of-small-ai</guid><dc:creator><![CDATA[Louis Hyman]]></dc:creator><pubDate>Mon, 27 Apr 2026 19:43:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/d40f5697-d5a1-4830-bd29-0a96c2630edb_325x137.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A lot of folks are understandably worried about giving money to <a href="https://en.wikipedia.org/wiki/OpenAI">OpenAI</a>, and to a lesser extent, <a href="https://en.wikipedia.org/wiki/Anthropic">Anthropic</a>. The reasons vary from expense to privacy to environmentalism to copyright to even fears of the <a href="https://en.wikipedia.org/wiki/Instrumental_convergence#Paperclip_maximizer">paperclip problem</a>. Whatever the reason, cloud AI has become a lightning rod for a new kind of resistance to Big Tech.</p><p>But what if I told you there was another way to use AI?</p><p>For the most technically savvy historians, the answer is obvious. However, I have found in my many conversations that the alternative&#8212;free, open-source, local&#8212;is little known. And these little AIs, strangely enough, often work best for our historical purposes.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/on-the-virtue-of-small-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/on-the-virtue-of-small-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h2>Free AI on Your Mac</h2><p>Now, for whatever reason, <a href="https://en.wikipedia.org/wiki/Apple_Inc.">Apple</a> has become the beloved computing device of people who would rather overlook its <a href="https://techcrunch.com/2019/07/30/inside-the-history-of-silicon-valley-labor-with-louis-hyman/">rather suspect labor history</a>. I myself now own a MacBook, and I own it for one reason: it can run AI.</p><p><a href="https://en.wikipedia.org/wiki/Large_language_model">LLMs</a> run best on <a href="https://en.wikipedia.org/wiki/Graphics_processing_unit">GPUs</a> with large amounts of <a href="https://en.wikipedia.org/wiki/Random-access_memory">RAM</a>. Until the last few years, GPUs were mostly used for gaming (which required lots of fast calculations to render animations) and for <a href="https://en.wikipedia.org/wiki/Bitcoin">Bitcoin</a> (which required fast calculations to undermine the global currency regime). It turned out, unexpectedly, that these same GPUs were good for doing the calculations necessary for LLMs to run.</p><p>Unlike PCs, which separate the memory for the <a href="https://en.wikipedia.org/wiki/Central_processing_unit">CPU</a> from the GPU, Macs have <a href="https://en.wikipedia.org/wiki/Apple_silicon#Unified_memory">integrated memory</a>. So even PCs with lots of RAM generally can&#8217;t run very large LLMs without a specialized chip. On Macs, the RAM that you have runs everything. So on my Mac (a luxe <a href="https://en.wikipedia.org/wiki/Apple_M4">M4 Max</a> with 128 GB of RAM) I can run nearly any model available. Even lower-end Macs can run meaningfully sized models. 8 GB is too little; 16 GB lets you run small models (3&#8211;8B parameters) usefully. 32 GB and up opens the door to the genuinely smart ones. Disk space can be eaten quickly (each model can be tens of GB) so be careful.</p><h2>Local AI</h2><p>A few years ago, when I started playing with LLMs, you needed to get seriously under the hood. Long was the night when I monkeyed around with models I downloaded from <a href="https://en.wikipedia.org/wiki/Hugging_Face">Hugging Face</a> and tried to get their idiosyncratic details running in <a href="https://en.wikipedia.org/wiki/Python_(programming_language)">Python</a>. I did this almost exclusively on the big computing cluster at <a href="https://en.wikipedia.org/wiki/Johns_Hopkins_University">Hopkins</a>, and while I could get it working&#8212;and <a href="https://aclanthology.org/2025.nlp4dh-1.21/">even did some cool research</a> on <a href="https://en.wikipedia.org/wiki/Optical_character_recognition">OCR</a>&#8212;it was a gigantic hassle.</p><p>Hugging Face has got to go down as the silliest name in economic history. And it will go down in economic history because it is a vast repository of free, <a href="https://en.wikipedia.org/wiki/Open-weight_model">open-weight models</a>. I want to use a historical analogy, but at no point were steam engines or assembly lines freely given away. You can download the &#8220;weights&#8221; of LLM models (which are the important parts) and do whatever you want with them.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> End of story. It is pretty amazing. These models are not exactly the cutting-edge &#8220;frontier&#8221; models of OpenAI and Claude, but they are pretty dang close.</p><p>That said, it can be hard to learn how to use these models. The documentation, while extensive, is pretty alienating.</p><p>Here I offer a brief aside (and an apology for the intended pun): the Hugging Face founders apparently named it after this emoji &#129303;, which depicts a face with a hug. Aww. Of course, I assumed it was named after the semi-larval monster from <em><a href="https://en.wikipedia.org/wiki/Alien_(franchise)">Alien</a></em>, the facehugger. Future literary scholars will make a lot of this slippage, I think.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QqO9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad484aeb-8cf5-4512-a0a8-023fc5b884e8_325x137.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QqO9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad484aeb-8cf5-4512-a0a8-023fc5b884e8_325x137.png 424w, https://substackcdn.com/image/fetch/$s_!QqO9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad484aeb-8cf5-4512-a0a8-023fc5b884e8_325x137.png 848w, https://substackcdn.com/image/fetch/$s_!QqO9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad484aeb-8cf5-4512-a0a8-023fc5b884e8_325x137.png 1272w, https://substackcdn.com/image/fetch/$s_!QqO9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad484aeb-8cf5-4512-a0a8-023fc5b884e8_325x137.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QqO9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad484aeb-8cf5-4512-a0a8-023fc5b884e8_325x137.png" width="325" height="137" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ad484aeb-8cf5-4512-a0a8-023fc5b884e8_325x137.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:137,&quot;width&quot;:325,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;undefined&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="undefined" title="undefined" srcset="https://substackcdn.com/image/fetch/$s_!QqO9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad484aeb-8cf5-4512-a0a8-023fc5b884e8_325x137.png 424w, https://substackcdn.com/image/fetch/$s_!QqO9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad484aeb-8cf5-4512-a0a8-023fc5b884e8_325x137.png 848w, https://substackcdn.com/image/fetch/$s_!QqO9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad484aeb-8cf5-4512-a0a8-023fc5b884e8_325x137.png 1272w, https://substackcdn.com/image/fetch/$s_!QqO9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad484aeb-8cf5-4512-a0a8-023fc5b884e8_325x137.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The larger point is that this repository, while amazing for the technically inclined, can be daunting.</p><p>A novice user would be better served using <a href="https://en.wikipedia.org/wiki/Ollama">Ollama</a>. Ollama started as a way to use the free models released by <a href="https://en.wikipedia.org/wiki/Meta_Platforms">Meta/Facebook</a> called <a href="https://en.wikipedia.org/wiki/Llama_(language_model)">Llama</a>, but it has since expanded. If you look at its library, you can see many easy-to-use models. You can download the app and it runs like a chatbot on your computer.</p><p>For the more advanced user, Ollama has a great feature: Ollama server. You start up an Ollama server on your computer and you can interact with it locally like you would with <a href="https://en.wikipedia.org/wiki/ChatGPT">ChatGPT</a> or <a href="https://en.wikipedia.org/wiki/Claude_(language_model)">Claude</a>.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> In this way, Ollama has <em>very easy</em> integration into other Python code, via an <a href="https://en.wikipedia.org/wiki/API">API</a>.</p><p>Not having to understand how to run the model is the entire point of ChatGPT or Claude&#8212;and you can do it on your computer for free.</p><p>The models you can run are pretty good at a range of tasks. More importantly, unless you are in a rush, they are an easy way to scale up your projects. You can let <a href="https://ollama.com/library/llama3.2-vision">Llama</a> do OCR for you over a week if you want. You can run Claude Code with <a href="https://ollama.com/library/qwen3-coder-next">Qwen</a>, no tokens needed. In the last few months, moreover, we have seen an explosion of models specialized for the Mac that use &#8220;<a href="https://en.wikipedia.org/wiki/MLX_(software)">MLX</a>&#8220; and have really increased token-generation speed.</p><p>Ollama addresses many of the key concerns that critics have. Your data stays local. <a href="https://en.wikipedia.org/wiki/Apple_silicon">Apple Silicon</a> is more <a href="https://scalastic.io/assets/img/cuda-vs-silicon-efficiency-en-980-4d75d9153.avif">power efficient</a> than anything in the cloud (which runs on <a href="https://en.wikipedia.org/wiki/Nvidia">NVIDIA</a> GPUs). You aren&#8217;t handing over money to potential monopolists. You aren&#8217;t supporting a <a href="https://en.wikipedia.org/wiki/Artificial_general_intelligence">potential robot uprising</a>.</p><p>And it is just cool to have it right there on your computer.</p><h2>The Virtue of Small Models</h2><p>When I am writing code to accomplish history tasks (OCR, OCR correction, fact-checking, and the like), I <em>always</em> run experiments. The assumption I used to have was that the biggest, newest model was the best model to use.</p><p>That is incorrect.</p><p>Instead, what I often do is break my processes into steps. For instance, I often want to read documents and pull out structured information. I used to do this in one step with an expensive API call to OpenAI. Nowadays, I break that process into steps. It works better, and it is cheaper. Each step&#8212;OCR, OCR correction, <a href="https://en.wikipedia.org/wiki/Named-entity_recognition">named-entity recognition</a>, <a href="https://en.wikipedia.org/wiki/JSON">JSON</a> cleaning&#8212;uses a different model. I use a big, smart model to check the results (as well as spot-checking the results myself), but I often find that older, smaller models (like <code>qwen-2.5:7b</code>) do a better job than the recent big boys.</p><p>As you integrate Ollama into your workflow, experiment with what works. It is free.</p><h2>Next Steps</h2><p>Get a Mac if you don&#8217;t have one. Download Ollama.</p><p>Use ChatGPT or Claude one last time to explain how to get it running, and don&#8217;t look back.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/on-the-virtue-of-small-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/on-the-virtue-of-small-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/subscribe?"><span>Subscribe now</span></a></p><p></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Weights are the trained parameters of a model&#8212;the numerical values that determine how it responds to input.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>It should also be noted, for the technical reader, that running a <a href="https://en.wikipedia.org/wiki/Docker_(software)">Docker</a> or <a href="https://en.wikipedia.org/wiki/Singularity_(software)">Singularity</a> instance of Ollama on a <a href="https://en.wikipedia.org/wiki/High-performance_computing">high-performance cluster</a> allows a level of abstraction that is very useful, especially on locked-down HPCs that don&#8217;t allow you to install software.</p></div></div>]]></content:encoded></item><item><title><![CDATA[The Historian's Guide to Statistics]]></title><description><![CDATA[The ecological fallacy as cautionary tale for historical reasoning]]></description><link>https://computationalhistory.substack.com/p/the-historians-guide-to-statistics</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/the-historians-guide-to-statistics</guid><dc:creator><![CDATA[Christopher Phillips]]></dc:creator><pubDate>Fri, 24 Apr 2026 15:39:40 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kQZ-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6708a93e-3eb2-4ff9-a071-4aa09e63fe40_827x487.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>There was once a time when historians were expected to have some facility with numbers. Not at the level of economists or sociologists, but at a still respectable level as fellow social scientists. Yes, this was in part a shift marked by the rise of cliometrics, but the expectation started earlier, as represented by the appearance in 1971 of the textbook <em>Historian&#8217;s Guide to Statistics</em> by Charles Dollar and Richard Jensen.</p><p>Dollar, then at Oklahoma State, was one of the first historians to use electronic computers during his graduate research. Fresh off that experience, in the mid-1960s, he and Jensen (then at Washington University in St. Louis) joined forces with a group based at the University of Pittsburgh to create the &#8220;<a href="https://www.tandfonline.com/journals/vzhm20">Historical Methods Newsletter: Quantitative Analysis of Social, Economic and Political Developmen</a>t&#8221; (a venture which continues today under a slightly modified name). At the time they were both just starting their careers, and saw in computers the possibility of new methods for quantitative analysis, particularly using publicly available government data from elections and censuses. Jensen would continue to work in universities (mainly at the University of Illinois-Chicago), but Dollar would go on to transform the use of machine-readable records at the National Archives in a <a href="https://collections.lib.uwm.edu/digital/collection/saa/id/32/">long and distinguished career</a> there.</p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/the-historians-guide-to-statistics?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Please share with a friend!</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/the-historians-guide-to-statistics?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/the-historians-guide-to-statistics?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><p>Their work with colleagues on the <em>Historical Methods Newsletter</em> stands as a reminder that &#8220;quantitative history&#8221; and &#8220;computational history&#8221; has a much longer and more robust lineage than most current historians remember or acknowledge. In fact, Dollar and Jensen saw enough potential&#8212;but also the need for robust re-training&#8212;that they published <em>Historian&#8217;s Guide to Statistics</em> as an essential primer for colleagues who wished to utilize new computational and quantitative methods. Jensen wrote the statistical chapters and Dollar wrote the data and computing chapters; together they aimed to provide colleagues the methods both to do the analysis and to understand what it might mean.</p><p>The parts of the book focused on computing have understandably not aged particularly well, but their discussion of quantitative methods for historians remains fascinating and surprisingly relevant. The first line of the introduction tackles the challenge their methods supposedly pose to &#8220;traditional&#8221; histories head on: &#8220;Two misconceptions threaten to impede the use of quantitative methods by historians: fears of dehumanized history and mistrust of an alien methodology.&#8221; They conclude that the fears and mistrust are rooted in misunderstanding&#8212;one that their book hopes to dispel&#8212;but argue it isn&#8217;t likely that new methods will ultimately displace the &#8220;heritage&#8221; and &#8220;traditions&#8221; of close work with sources. Rather, &#8220;There is no royal road to historical insight, but we believe quantitative methods can often speed the trip&#8221; (1).</p><p>Citing Frederick Jackson Turner, Charles Beard, Arthur Schlesinger, and others, they note that quantitative studies (already by 1971!) had a very old pedigree among professional historians, going well back to the nineteenth century. But such a lineage had been lost by the 1950s, and historians who wished to engage quantitative methods then needed to consult their colleagues in the burgeoning social science departments across universities. In a sense, computational historians of the twenty-first century find themselves in a similar situation, aware of a long lineage but also feeling as if the lineage was lost in the 1980s and 1990s and now finding themselves needing to consult colleagues in statistics and computer science.</p><p>The <em>Historian&#8217;s Guide </em>begins a justification of statistical methods with basic examples of representativeness, with the selection of evidence that all historians face, and then build from there to the making of inferences and conclusions from incomplete or probabilistic data. In other words, all historians must think statistically even if they don&#8217;t realize it.</p><p>Many of the methods in the book are now standard within introductory statistics courses (such courses were extremely rare for non-social scientists in the 1960s and 1970s), but they also draw on a few more advanced topics which might have special relevance for historians, then and now. One such topic is the &#8220;<a href="https://en.wikipedia.org/wiki/Ecological_fallacy">ecological fallacy</a>.&#8221; <a href="https://doi.org/10.2307/2087176">Coined in 1950</a> by sociologist W.S. Robinson in the American Sociological Review, the term refers to the <em>slippage from measures of correlation among groups to conclusions about individuals in those groups</em>. The slippage is common because we often have data on groups but we are ultimately interested in the behavior of individuals. In slipping from one to the other, however, researchers have to be very careful.</p><p>In Dollar and Jensen&#8217;s account, the paradigmatic example was from political behavior. They ask readers to consider a study of Catholic voting behavior in which we know both the breakdown of Catholics-Non-Catholics as well as Dem-GOP votes. One might be tempted to chart the relationship between the percentage of Catholics in a precinct and the percentage of Democrats for which one observed the following data (99):</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dSmy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b8c9442-5db5-4d4f-828e-4f5178616aeb_591x268.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dSmy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b8c9442-5db5-4d4f-828e-4f5178616aeb_591x268.png 424w, https://substackcdn.com/image/fetch/$s_!dSmy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b8c9442-5db5-4d4f-828e-4f5178616aeb_591x268.png 848w, https://substackcdn.com/image/fetch/$s_!dSmy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b8c9442-5db5-4d4f-828e-4f5178616aeb_591x268.png 1272w, https://substackcdn.com/image/fetch/$s_!dSmy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b8c9442-5db5-4d4f-828e-4f5178616aeb_591x268.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dSmy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b8c9442-5db5-4d4f-828e-4f5178616aeb_591x268.png" width="591" height="268" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4b8c9442-5db5-4d4f-828e-4f5178616aeb_591x268.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:268,&quot;width&quot;:591,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dSmy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b8c9442-5db5-4d4f-828e-4f5178616aeb_591x268.png 424w, https://substackcdn.com/image/fetch/$s_!dSmy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b8c9442-5db5-4d4f-828e-4f5178616aeb_591x268.png 848w, https://substackcdn.com/image/fetch/$s_!dSmy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b8c9442-5db5-4d4f-828e-4f5178616aeb_591x268.png 1272w, https://substackcdn.com/image/fetch/$s_!dSmy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b8c9442-5db5-4d4f-828e-4f5178616aeb_591x268.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>And then use the data to plot the following &#8220;ecological&#8221; (i.e., aggregate) relationship between being Catholic and voting Democratic (101) using X and Y as axes:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kQZ-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6708a93e-3eb2-4ff9-a071-4aa09e63fe40_827x487.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kQZ-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6708a93e-3eb2-4ff9-a071-4aa09e63fe40_827x487.png 424w, https://substackcdn.com/image/fetch/$s_!kQZ-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6708a93e-3eb2-4ff9-a071-4aa09e63fe40_827x487.png 848w, https://substackcdn.com/image/fetch/$s_!kQZ-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6708a93e-3eb2-4ff9-a071-4aa09e63fe40_827x487.png 1272w, https://substackcdn.com/image/fetch/$s_!kQZ-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6708a93e-3eb2-4ff9-a071-4aa09e63fe40_827x487.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kQZ-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6708a93e-3eb2-4ff9-a071-4aa09e63fe40_827x487.png" width="827" height="487" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6708a93e-3eb2-4ff9-a071-4aa09e63fe40_827x487.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:487,&quot;width&quot;:827,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kQZ-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6708a93e-3eb2-4ff9-a071-4aa09e63fe40_827x487.png 424w, https://substackcdn.com/image/fetch/$s_!kQZ-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6708a93e-3eb2-4ff9-a071-4aa09e63fe40_827x487.png 848w, https://substackcdn.com/image/fetch/$s_!kQZ-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6708a93e-3eb2-4ff9-a071-4aa09e63fe40_827x487.png 1272w, https://substackcdn.com/image/fetch/$s_!kQZ-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6708a93e-3eb2-4ff9-a071-4aa09e63fe40_827x487.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The problem, of course, is that you really want to know the relationship between being Catholic and voting Democratic at either a precinct or individual level to actually make claims about precincts and individuals. Using the aggregate data, it seems that more Catholics mean more votes for Democrats, but their point is that may not be true at the precinct or individual level (put differently, the data don&#8217;t provide causal evidence of voting patterns).</p><p>For example, it is entirely possible to construct precinct-level data consistent with such aggregate data that shows almost any actual Catholic-Democratic relationship is possible in a given precinct. Consider precinct &#8220;C&#8221; below, where the marginal totals for Democrats (48%) and Catholics (40%) match the aggregate totals for all the precincts combined; it would be easy to imagine in this precinct that there were no Catholic-Democrats, or no Catholic-GOP voters, and yet the aggregate picture would be unaffected (98).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w1ne!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80098a6-779a-4f77-84f8-afaa46f050cb_981x954.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w1ne!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80098a6-779a-4f77-84f8-afaa46f050cb_981x954.png 424w, https://substackcdn.com/image/fetch/$s_!w1ne!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80098a6-779a-4f77-84f8-afaa46f050cb_981x954.png 848w, https://substackcdn.com/image/fetch/$s_!w1ne!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80098a6-779a-4f77-84f8-afaa46f050cb_981x954.png 1272w, https://substackcdn.com/image/fetch/$s_!w1ne!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80098a6-779a-4f77-84f8-afaa46f050cb_981x954.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w1ne!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80098a6-779a-4f77-84f8-afaa46f050cb_981x954.png" width="981" height="954" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a80098a6-779a-4f77-84f8-afaa46f050cb_981x954.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:954,&quot;width&quot;:981,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!w1ne!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80098a6-779a-4f77-84f8-afaa46f050cb_981x954.png 424w, https://substackcdn.com/image/fetch/$s_!w1ne!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80098a6-779a-4f77-84f8-afaa46f050cb_981x954.png 848w, https://substackcdn.com/image/fetch/$s_!w1ne!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80098a6-779a-4f77-84f8-afaa46f050cb_981x954.png 1272w, https://substackcdn.com/image/fetch/$s_!w1ne!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80098a6-779a-4f77-84f8-afaa46f050cb_981x954.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>Whatever linear relationship exists at the aggregate wouldn&#8217;t apply to that precinct or to the individuals within it.</em></p><p>Such examples might at first feel a bit like a mathematical trick or oddity, and in a sense, they are. But such fallacies also offer an important insight for historians, even ones who don&#8217;t study voting records. Historians love to handpick a couple examples, and then generalize from them to an aggregate. That opens them to the well-known criticism that the chosen examples are not representative. So historians increasingly are being held responsible to actually do the research about what is and is not representative. One solution, as decades of digital and computational history projects have reminded us, is to instead look across an aggregated set of data and see what relationships hold at that level. This has gotten unbelievably easier to do with expanded digitization efforts and computing power since the textbook was published in 1971. </p><p>Nearly anyone can search in huge databases for keywords or specific phrases; they can mine data from censuses and government documents; they can draw on decades of economic and financial reports.</p><p>The ecological fallacy is a reminder, however, that simply beginning at the aggregate and then making determinations about what happens at the individual level is perhaps more problematic. It is easy and even tempting to move between these levels, but without attention to principles of statistical reasoning, the results are likely to be invalid. </p><p>Dollar and Jensen realized this when they included both the methods of statistical analysis and those of computational and data processing in their textbook. In our age of widely available data sets and incredibly easy calculation engines, historians cannot ignore statistical thinking in their work. </p><p>We need both computational methods and statistical principles, just as we did in 1971.</p><div><hr></div><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/the-historians-guide-to-statistics?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">If you learned something in this post, please share and subscribe.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/the-historians-guide-to-statistics?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/the-historians-guide-to-statistics?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Python for Reading]]></title><description><![CDATA[or why we don't need to be computer scientists to code]]></description><link>https://computationalhistory.substack.com/p/python-for-reading</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/python-for-reading</guid><dc:creator><![CDATA[Louis Hyman]]></dc:creator><pubDate>Wed, 22 Apr 2026 15:13:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!k5PK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c63a20-9fb6-475e-953c-8a5af58773f6_1086x1448.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When I was in graduate school, history PhDs needed to pass two different language exams. Now, I am American, so you might imagine that I&#8217;m probably better in English than in other languages, and you would be right. I started taking German in seventh grade (because the Spanish teacher yelled at me in the hallway once in sixth grade). In all my other classes, I was a very good student, but in that class, I barely passed. I bumbled in this way through high school, basically being an excellent student except in German. Poor Herr Lyon-Vaiden. He was such a kind man.</p><p>When I went to college, I had to complete through the fourth semester of a foreign language. Guess how that went? I took the placement test and placed out of just one semester of German, and only because I begged. At this point, I was about six years in. So I took three more semesters of German, struggling all the way. When I got to graduate school, I took the placement test in German, figuring I could do this. I had a dictionary next to me!</p><p>I failed the test.</p><p>I swore that I would never spend another minute of my life studying German. So I picked up a book called <em>French for Reading</em>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!c9Ie!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dc7f8ac-dc7c-4413-9cb5-573813d28725_1920x1920.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!c9Ie!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dc7f8ac-dc7c-4413-9cb5-573813d28725_1920x1920.jpeg 424w, https://substackcdn.com/image/fetch/$s_!c9Ie!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dc7f8ac-dc7c-4413-9cb5-573813d28725_1920x1920.jpeg 848w, https://substackcdn.com/image/fetch/$s_!c9Ie!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dc7f8ac-dc7c-4413-9cb5-573813d28725_1920x1920.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!c9Ie!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dc7f8ac-dc7c-4413-9cb5-573813d28725_1920x1920.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!c9Ie!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dc7f8ac-dc7c-4413-9cb5-573813d28725_1920x1920.jpeg" width="374" height="374" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3dc7f8ac-dc7c-4413-9cb5-573813d28725_1920x1920.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:374,&quot;bytes&quot;:522651,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!c9Ie!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dc7f8ac-dc7c-4413-9cb5-573813d28725_1920x1920.jpeg 424w, https://substackcdn.com/image/fetch/$s_!c9Ie!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dc7f8ac-dc7c-4413-9cb5-573813d28725_1920x1920.jpeg 848w, https://substackcdn.com/image/fetch/$s_!c9Ie!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dc7f8ac-dc7c-4413-9cb5-573813d28725_1920x1920.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!c9Ie!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dc7f8ac-dc7c-4413-9cb5-573813d28725_1920x1920.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This book was unlike any text I&#8217;d ever seen in German. Instead of trying to get me to buy pants at the mall&#8212;which was something I had no interest in talking about in English&#8212;it started with ideas. I read about philosophy and art, science and history. I actually care about those things. <em>French for Reading</em> gave me new words to understand new ideas that did not exist in English. I was learning new ideas (not just learning how to shop in a different language).</p><p>In about six weeks studying this book, I took the French exam, and I passed. Now, I did pass the German test on my second try (but I swore I was going to learn Spanish before I learned any more German).</p><p>But the larger point here is that when you learn a new kind of language with a goal in mind rather than just doing what you ordinarily do, it can be actually quite exciting. It wasn&#8217;t that I was bad at German; it&#8217;s that I was bad at German for going shopping at the mall.</p><p>So I&#8217;m sure that many of you are anxious about programming. But this moment is not about programming&#8212;it&#8217;s about using <a href="https://en.wikipedia.org/wiki/Computer_code">code</a> to understand the past in new ways.</p><p>The other thing that was amazing about <em>French for Reading</em> was that it was just for reading. I don&#8217;t have a musical ear. I definitely don&#8217;t have a good ear for accents or languages. But somehow I could learn to read French in six weeks. I can&#8217;t write French. I can&#8217;t speak French. But if I go to a museum, I can, even decades later, read the wall labels.</p><p>So when I think about <a href="https://en.wikipedia.org/wiki/Artificial_intelligence">AI</a>, I think, <em>Python for Reading.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k5PK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c63a20-9fb6-475e-953c-8a5af58773f6_1086x1448.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k5PK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c63a20-9fb6-475e-953c-8a5af58773f6_1086x1448.png 424w, https://substackcdn.com/image/fetch/$s_!k5PK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c63a20-9fb6-475e-953c-8a5af58773f6_1086x1448.png 848w, https://substackcdn.com/image/fetch/$s_!k5PK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c63a20-9fb6-475e-953c-8a5af58773f6_1086x1448.png 1272w, https://substackcdn.com/image/fetch/$s_!k5PK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c63a20-9fb6-475e-953c-8a5af58773f6_1086x1448.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k5PK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c63a20-9fb6-475e-953c-8a5af58773f6_1086x1448.png" width="377" height="502.6666666666667" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d4c63a20-9fb6-475e-953c-8a5af58773f6_1086x1448.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1448,&quot;width&quot;:1086,&quot;resizeWidth&quot;:377,&quot;bytes&quot;:1989513,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/195043554?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c63a20-9fb6-475e-953c-8a5af58773f6_1086x1448.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!k5PK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c63a20-9fb6-475e-953c-8a5af58773f6_1086x1448.png 424w, https://substackcdn.com/image/fetch/$s_!k5PK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c63a20-9fb6-475e-953c-8a5af58773f6_1086x1448.png 848w, https://substackcdn.com/image/fetch/$s_!k5PK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c63a20-9fb6-475e-953c-8a5af58773f6_1086x1448.png 1272w, https://substackcdn.com/image/fetch/$s_!k5PK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c63a20-9fb6-475e-953c-8a5af58773f6_1086x1448.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A main objection to &#8220;<a href="https://en.wikipedia.org/wiki/Prompt_engineering#Vibe_coding">vibe coding</a>&#8220; is that the user doesn&#8217;t actually understand what is happening, which, of course, would be bad. The argument here is that if you don&#8217;t write the code, you can&#8217;t understand&#8212;or critique&#8212;the code. I don&#8217;t believe that for <a href="https://en.wikipedia.org/wiki/Python_(programming_language)">Python</a> anymore than I believe that for French. Do writers understand a language better than readers? Sure. Does that mean that readers have no understanding? No.</p><p>Python, in particular, is a very human-friendly language to read (unlike <a href="https://en.wikipedia.org/wiki/Assembly_language">assembly</a>). With a little training, a reader can understand what is happening in the code and think through its gaps. While I can write Python, my understanding of Python is far larger as a reader than a writer.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/python-for-reading?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/python-for-reading?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>More importantly, my understanding of what Python can do, and how it ought to be done, has little to do with whether I remember a particular bit of <a href="https://en.wikipedia.org/wiki/Syntax_(programming_languages)">syntax</a>. In the era of AI, those high-level ideas&#8212;<a href="https://en.wikipedia.org/wiki/Abstraction_(computer_science)">abstraction</a>, <a href="https://en.wikipedia.org/wiki/Encapsulation_(computer_programming)">encapsulation</a>, <a href="https://en.wikipedia.org/wiki/Design_by_contract">data contracts</a>, <a href="https://en.wikipedia.org/wiki/Software_brittleness">robustness</a>, <a href="https://en.wikipedia.org/wiki/Pipeline_(software)">pipelines</a>, etc.&#8212;matter much more (though you should know about <a href="https://en.wikipedia.org/wiki/Indentation_(typesetting)#In_programming">indentation</a>).</p><p>And those ideas are the equivalent of learning about French philosophy rather than German shopping. We all love ideas. That&#8217;s our thing. And if we lean into our curiosity, we can do amazing things with this new technology.</p><p>The goal is not to become a <a href="https://en.wikipedia.org/wiki/Computer_science">computer scientist</a>. The goal is to make novel, interesting, creative, curious, and human arguments about history. The computer will write the code to answer your questions.</p><p>Just as you don&#8217;t need to be a linguist to write, you don&#8217;t need to be a computer scientist to code. </p><p>After a few weeks, you will not write Python at all. You will write history.</p><div><hr></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Please consider subscribing and sharing.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/python-for-reading?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/python-for-reading?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Nodes, Edges, and the Historian’s Craft: Knowledge Graphs as Research Notes]]></title><description><![CDATA[How moving from Word docs to network graphs transformed my research workflow and how LLMs made it possible.]]></description><link>https://computationalhistory.substack.com/p/nodes-edges-and-the-historians-craft</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/nodes-edges-and-the-historians-craft</guid><pubDate>Thu, 16 Apr 2026 12:33:09 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!XpuG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fe5f8d-939a-47ab-9168-05550a07d8ee_840x404.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>The Old Barrier</h2><p>Building a <a href="https://en.wikipedia.org/wiki/Knowledge_graph">knowledge graph</a> used to require either serious technical infrastructure or a funded project with a team of developers. You needed a <a href="https://en.wikipedia.org/wiki/Triplestore">triple store</a> or a <a href="https://en.wikipedia.org/wiki/Graph_database">graph database</a>, a formal ontology, an ingest pipeline, and perhaps a <a href="https://en.wikipedia.org/wiki/SPARQL">SPARQL</a> endpoint. The barrier to entry was so high that knowledge graphs remained the province of large <a href="https://en.wikipedia.org/wiki/Digital_humanities">digital humanities</a> projects, compelling in principle but inaccessible for a researcher working alone on a specific archival question.</p><p>That has changed. Recently, I&#8217;ve been using Claude Code to build knowledge graphs not as final outputs, but as a daily method for taking research notes. It turns out that nodes and edges are a vastly superior format for historical research than anything I&#8217;ve used before.</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XpuG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fe5f8d-939a-47ab-9168-05550a07d8ee_840x404.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XpuG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fe5f8d-939a-47ab-9168-05550a07d8ee_840x404.png 424w, https://substackcdn.com/image/fetch/$s_!XpuG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fe5f8d-939a-47ab-9168-05550a07d8ee_840x404.png 848w, https://substackcdn.com/image/fetch/$s_!XpuG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fe5f8d-939a-47ab-9168-05550a07d8ee_840x404.png 1272w, https://substackcdn.com/image/fetch/$s_!XpuG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fe5f8d-939a-47ab-9168-05550a07d8ee_840x404.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XpuG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fe5f8d-939a-47ab-9168-05550a07d8ee_840x404.png" width="840" height="404" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/77fe5f8d-939a-47ab-9168-05550a07d8ee_840x404.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:404,&quot;width&quot;:840,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:200833,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/193599228?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fe5f8d-939a-47ab-9168-05550a07d8ee_840x404.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XpuG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fe5f8d-939a-47ab-9168-05550a07d8ee_840x404.png 424w, https://substackcdn.com/image/fetch/$s_!XpuG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fe5f8d-939a-47ab-9168-05550a07d8ee_840x404.png 848w, https://substackcdn.com/image/fetch/$s_!XpuG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fe5f8d-939a-47ab-9168-05550a07d8ee_840x404.png 1272w, https://substackcdn.com/image/fetch/$s_!XpuG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77fe5f8d-939a-47ab-9168-05550a07d8ee_840x404.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>Dry Rot in Britain: Temporal Knowledge Graph &#8212; tracing the rot from Pepys (1684) through scientific classification to HMS Queen Charlotte (1812).</em></p><div><hr></div><h2>Why Nodes and Edges?</h2><p>Every historian has a system for managing the chaos of the archive. Word documents, Excel spreadsheets, index cards, <a href="https://en.wikipedia.org/wiki/Zotero">Zotero</a> libraries, folders of hastily snapped photographs. The fundamental problem is always the same: as the material accumulates, the connections between sources begin to fray. You know you read something about a specific merchant six months ago, but where? You know two events in different colonies are related, but the evidence is scattered across three different files.</p><p>Historians have always been interested in relationships, the problem is that our note-taking systems have never matched that interest. A knowledge graph solves this by recording the links between entities as rigorously as the entities themselves. Instead of burying the connection in a flat sentence that says, &#8220;Kyd sent Banks information about myrobalans from Calcutta,&#8221; you create nodes (Kyd, Banks, myrobalans, Calcutta) and define the edges between them: sent to, located in, mentioned in. The note and the underlying structure become the exact same thing.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5-ZP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16036104-193a-4c76-aa93-c1da739ff891_840x408.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5-ZP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16036104-193a-4c76-aa93-c1da739ff891_840x408.png 424w, https://substackcdn.com/image/fetch/$s_!5-ZP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16036104-193a-4c76-aa93-c1da739ff891_840x408.png 848w, https://substackcdn.com/image/fetch/$s_!5-ZP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16036104-193a-4c76-aa93-c1da739ff891_840x408.png 1272w, https://substackcdn.com/image/fetch/$s_!5-ZP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16036104-193a-4c76-aa93-c1da739ff891_840x408.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5-ZP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16036104-193a-4c76-aa93-c1da739ff891_840x408.png" width="840" height="408" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/16036104-193a-4c76-aa93-c1da739ff891_840x408.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:408,&quot;width&quot;:840,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:62948,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/193599228?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16036104-193a-4c76-aa93-c1da739ff891_840x408.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5-ZP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16036104-193a-4c76-aa93-c1da739ff891_840x408.png 424w, https://substackcdn.com/image/fetch/$s_!5-ZP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16036104-193a-4c76-aa93-c1da739ff891_840x408.png 848w, https://substackcdn.com/image/fetch/$s_!5-ZP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16036104-193a-4c76-aa93-c1da739ff891_840x408.png 1272w, https://substackcdn.com/image/fetch/$s_!5-ZP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16036104-193a-4c76-aa93-c1da739ff891_840x408.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>A single sentence, &#8220;Kyd sent Banks information about myrobalans from Calcutta,&#8221; becomes a structured network of entities and relationships.</em></p><p></p><p>If you&#8217;ve ever used <a href="https://en.wikipedia.org/wiki/Gephi">Gephi</a> or sketched a network diagram on a whiteboard, you already understand the concept. Nodes are entities: people, places, commodities, institutions, sources, events. Edges are the relationships binding them together.</p><p>The shift is simple but profound. In a Word document, information is organized chronologically by when you found it. In a spreadsheet, it is constrained by whatever columns you guessed you might need at the start. In a knowledge graph, information is organized by what it is actually about. Every new note automatically connects to everything you already know about those same people, places, and commodity chains.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/nodes-edges-and-the-historians-craft?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/nodes-edges-and-the-historians-craft?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p></p><div><hr></div><h2>Small Scale: Dry Rot and Traditional Research</h2><p>I&#8217;ve been using this method to investigate when <em>Serpula lacrymans</em>, the destructive dry rot fungus, arrived in Britain. There is no bulk data to process here. This is entirely traditional research: searching through digitized eighteenth-century sources, following citation trails, cross-referencing dates. The kind of work where you are deep in <a href="https://en.wikipedia.org/wiki/Eighteenth_Century_Collections_Online">Eighteenth Century Collections Online</a> and <a href="https://en.wikipedia.org/wiki/Google_Books">Google Books</a>, chasing footnotes between building manuals, parliamentary records, botanical surveys, and newspaper archives.</p><p>Now, every time I find a new source, I don&#8217;t add a line to a spreadsheet. I add nodes and edges. The graph has 43 people, 62 sources, 25 events, 2 organisms, 60 relationships, and 15 open questions, all in a single <a href="https://en.wikipedia.org/wiki/JSON">JSON</a> file. Claude Code handles the data entry. I describe what I&#8217;ve found in plain English, and it maintains the structured knowledge graph and an interactive timeline visualization.</p><p>The graph forces distinctions that flat notes let you fudge. Samuel Pepys gathered toadstools &#8220;as big as my Fists&#8221; from neglected ship holds in 1684, and that&#8217;s often cited as early evidence of dry rot. But when I added Pepys as a source node and connected it to the organism nodes, the graph forced a question: which organism? The conditions Pepys describes are textbook for native wet rot, not the invasive <em>Serpula</em>. Because the graph holds both organisms as separate nodes with distinct diagnostic features, that distinction stays visible every time a new source mentions &#8220;rot.&#8221;</p><p>The graph also holds what I don&#8217;t know. Those 15 open question nodes (missing volumes, unverified claims, sources I haven&#8217;t yet read) are as important as the evidence nodes. In a Word document, unanswered questions get buried between paragraphs. In a graph, they stay visible, connected to the sources that raised them, waiting.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/subscribe?"><span>Subscribe now</span></a></p><p></p><div><hr></div><h2>Medium Scale: The Banks Leather Network</h2><p>The dry rot graph is small and growing, but the Joseph Banks leather project demonstrates what happens when this method scales up and merges with machine transcription.</p><p>It started with a trip to the Sutro Library in San Francisco, where I photographed 181 handwritten letters related to the British leather trade (1797&#8211;1817). I used Claude Code to write processing scripts, which sent the photographs to Google&#8217;s Gemini for handwriting recognition, extracted structured entities from the transcriptions, and output the results as nodes and edges. I then read the key letters in chronological order to confirm the Gemini transcriptions and better understand the knowledge transmission between India and London.</p><p>I then integrated the full correspondence network from Warren Dawson&#8217;s <em>Calendar of the Banks Letters</em>, nearly 7,000 entries, and pulled in select figures related to leather and India from Neil Chambers&#8217; <em>Indian and Pacific Correspondence</em>. But the core of the work was still note-taking. I was trying to answer a specific question: How did Banks identify catechu as a viable tanning agent?</p><p>The resulting leather network contains 275 people, 55 commodities, 119 places, and 46 institutions (495 nodes and 1,191 connections). That is a modest dataset, but it revealed global connections that sequential reading simply couldn&#8217;t.</p><p>The graph made it visually apparent that Banks operated as a switchboard connecting Indian botanical research directly to British industrial policy, routing knowledge between people who would never otherwise have crossed paths. Robert Kyd at the Calcutta Botanic Garden, Charles Jenkinson in the House of Lords, Samuel Purkis the tanner, and Andrew Berry conducting tanning experiments. The graph reveals them as vital components of a single coordinated commodity network, with Banks at the center.</p><p>Working through the graph drove new biographical research. It surfaced surprises, like the fact that Samuel Purkis and Humphry Davy were friends before Davy began working with Banks, and that Purkis had been corresponding with Banks throughout the 1790s. The graph didn&#8217;t explain the connection, but it surfaced it, turning a static name on a letter into a vibrant research question.</p><div><hr></div><h2>Visualizations as Research Tools</h2><p>Anyone who has attended a digital humanities conference knows the problem with network graphs: they often look like impenetrable hairballs. Generated by <a href="https://en.wikipedia.org/wiki/NetworkX">NetworkX</a> or Gephi with default settings, they are technically correct but analytically useless. You can&#8217;t tell what the graph is arguing because the layout isn&#8217;t designed to argue anything, it&#8217;s just a physics simulation.</p><p>Claude Code built the visualizations for the leather project too, and the one I&#8217;m most excited about is a temporal network.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BEJm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb690229-756e-4c6c-b97b-7a5c73872582_840x400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BEJm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb690229-756e-4c6c-b97b-7a5c73872582_840x400.png 424w, https://substackcdn.com/image/fetch/$s_!BEJm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb690229-756e-4c6c-b97b-7a5c73872582_840x400.png 848w, https://substackcdn.com/image/fetch/$s_!BEJm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb690229-756e-4c6c-b97b-7a5c73872582_840x400.png 1272w, https://substackcdn.com/image/fetch/$s_!BEJm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb690229-756e-4c6c-b97b-7a5c73872582_840x400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BEJm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb690229-756e-4c6c-b97b-7a5c73872582_840x400.png" width="840" height="400" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db690229-756e-4c6c-b97b-7a5c73872582_840x400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:840,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:181875,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/193599228?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb690229-756e-4c6c-b97b-7a5c73872582_840x400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BEJm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb690229-756e-4c6c-b97b-7a5c73872582_840x400.png 424w, https://substackcdn.com/image/fetch/$s_!BEJm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb690229-756e-4c6c-b97b-7a5c73872582_840x400.png 848w, https://substackcdn.com/image/fetch/$s_!BEJm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb690229-756e-4c6c-b97b-7a5c73872582_840x400.png 1272w, https://substackcdn.com/image/fetch/$s_!BEJm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb690229-756e-4c6c-b97b-7a5c73872582_840x400.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>Leather Crisis: Temporal Network, years anchored along the top, people pulled by correspondence weight.</em></p><p>Standard tools force you to choose between a timeline or a network graph. The temporal network fuses them. Years are fixed along the x-axis, and people float below, gravitationally pulled toward the years when they were most active. You can actually watch the network grow as the tanning crisis develops, spotting the transition as the conversation shifts from botanical science to industrial policy. This hybrid layout didn&#8217;t exist in a dropdown menu. It emerged from describing what I wanted to see and letting the AI implement it.</p><p>Because Claude Code builds through natural conversation, each visualization matches the specific historical question. Before this, custom visualizations were simply out of reach for most historians; you either used what Gephi, <a href="https://en.wikipedia.org/wiki/QGIS">QGIS</a> or <a href="https://en.wikipedia.org/wiki/Tableau_Software">Tableau</a> offered or you didn&#8217;t visualize at all. Now, iterative, question-driven visualization is just another part of the research workflow.</p><div><hr></div><h2>Large Scale: From Personal Graphs to Linked Open Data</h2><p>The dry rot and leather graphs are personal research tools: my nodes, my edges, my questions. But this exact logic scales up to major <a href="https://en.wikipedia.org/wiki/Linked_data">linked open data</a> projects.</p><p>Take LINCS (Linked Infrastructure for Networked Cultural Scholarship), a project building linked open data infrastructure for cultural heritage in Canada. Our Historical Canadians dataset builds on the Dictionary of Canadian Biography with <a href="https://en.wikipedia.org/wiki/Wikidata">Wikidata</a>, modeling familial connections, occupations, and residences using <a href="https://en.wikipedia.org/wiki/CIDOC_Conceptual_Reference_Model">CIDOC-CRM</a>, all queryable via SPARQL.</p><p>That is the far end of the spectrum: formal ontologies, institutional infrastructure, and strict interoperability. My personal graphs are not <a href="https://en.wikipedia.org/wiki/FAIR_data">FAIR data</a> (not Findable, Accessible, Interoperable, or Reusable in the formal sense). I can and do share them, but they are not interoperable. Making them so would mean modeling every entity against a formal ontology like CIDOC-CRM, reconciling every node to a persistent identifier, and minting <a href="https://en.wikipedia.org/wiki/Uniform_Resource_Identifier">URI</a>s when the PIDs do not exist, a significant step up from a JSON file and a conversation with Claude. The underlying logic, entities and relationships, is the same, but the distance between a personal research graph and a LOD dataset is real.</p><p>That said, we are actively experimenting with using LLMs to reduce this barrier, and the results are promising. But today, there is no easy path from a Claude subscription to standards-compliant linked data. The habits of structured thinking you develop when building a small graph are exactly the habits required to engage with large-scale projects like LINCS, but the tooling to bridge that gap is still emerging.</p><div><hr></div><h2>Try It</h2><p>The barrier to entry has collapsed. You don&#8217;t need a massive grant or a dedicated developer to start. You just need a compelling historical question, a willingness to think in nodes and edges, and the right AI tools to help you build. If you&#8217;re a historian still managing the chaos of the archive in Word documents and spreadsheets, consider nodes and edges instead. Start small. The graph will grow with your research, and you might be surprised by the connections it reveals.</p><div><hr></div><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/nodes-edges-and-the-historians-craft?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Enjoy the article? Share with one friend, or even better, your socials. </p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/nodes-edges-and-the-historians-craft?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/nodes-edges-and-the-historians-craft?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><p></p>]]></content:encoded></item><item><title><![CDATA[On having faith in your students]]></title><description><![CDATA[Or, why I trust them to make the right choice about AI]]></description><link>https://computationalhistory.substack.com/p/on-having-faith-in-your-students</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/on-having-faith-in-your-students</guid><dc:creator><![CDATA[Louis Hyman]]></dc:creator><pubDate>Wed, 15 Apr 2026 19:31:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Qigr!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187b411b-680b-41a2-b0f7-e0ea6f262a55_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I am often asked by colleagues, who view AI as a plagiarism machine, how I handle its use in the classroom. </p><p>In my <em>Data Methods &amp; History</em> class, the answer is that I require students to use AI for coding. They simply could not get up to speed fast enough to do cool stuff without leaning on its support. We get to think about high-level questions about the origins of data, mathematical reasoning, and storytelling without worrying about code syntax. It is much more fun.</p><p>Yet, I also teach normal classes like my <em>Social Theories of the Economy</em>, which is a year-long sequence from Malthus to now. It is a course that would have been legible ten years ago as a standard-issue economic thought class. And in that class, we still assign essays.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/subscribe?"><span>Subscribe now</span></a></p><p></p><p>Many of my colleagues have given up on essays. I get it. It is frustrating to have to police. But at least in my 15 person seminar, I trust my students. I trust them because I could see the fear in their eyes on the first day when I told them that they had a choice: either learn to read, write, and think better than a machine, or have no economic value&#8212;outside of organ donation. My class, in a very real way, is about the accelerando, the growth curve of capitalism, and how that happened. I tell them that they are living through another moment in that story. And it is scary. </p><p>Nobody talks to them like this. AI is the center of their fears for the future and nobody tells them that their fears are both valid, and can be overcome. I do this. You can too. Because we know, as historians, that people have been told for centuries that they are replaceable and worthless&#8212;and that that isn&#8217;t true. It was not true when the mechanical thresher liberated us from the harvest. It was not true when the power loom took over our weaving. It was not true when the robot did our spot welding. Humans, it turns out, have capacities that only come out once the machines take over tasks. </p><p>Were these moments peaceful and the transitions just? Absolutely not. But I think we also have an opportunity here to learn from the past and consider our path forward as a society. Will that happen? Probably not. People will people.</p><p>Our students, however, do have choices. They can choose to trust you when you tell them that they can learn. My students have trusted me this year and despite what you hear, they all read hundreds of tough pages of <em>Economy and Society</em>, <em>Capital Volume 3</em>, and <em>Capitalism and Freedom</em>, and then they wrote their essays. That said, we also did oral exams, so I do know that they know their stuff.</p><p>So, as you come to the end of your semester, I would encourage you to think about why you got into this business. I cannot imagine anyone became a teacher because they love grading and finding plagiarism. It seems like the easy way out is to jettison the essay, but then students don&#8217;t learn how to make arguments. It is easier than policing AI. Yet there is an even easier way: to restore trust by getting real with the students. By speaking to their fears, you are also speaking to their hopes, our hopes for a different future.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/on-having-faith-in-your-students?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/on-having-faith-in-your-students?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p></p><p></p><p><strong>The actual email I sent my students this afternoon:</strong></p><div><hr></div><p></p><p>Class,</p><p>Please find three terrible essays that ChatGPT believes, in its wisdom, are &#8220;A&#8221; papers.</p><p>I would encourage you to contemplate the wreckage of your lives if you turned in something like these essays.</p><p>First, let&#8217;s not be too negative. There are upsides. The spelling is top-notch. The grammar is correct. The questions are actually not terrible, terrible. For a computer, I am totally impressed.</p><p>So, professor, why are they so bad?</p><p>1) ARROGANCE: None of these papers actually proves their points in a grounded, textual fashion. Do you see well-integrated quotes? Do you see footnotes to the text? Do you see close readings of key terms? [no] Is there a sophisticated argument spanning ten pages? No. Not at all. There is no complexity here. It is reduction.</p><p>2) MATH: The three essays, in total, are about 2000 words. Yes. So if someone were able to divide 2000 by 3, which no human ever has, we would know how much smaller these are.</p><p>3) SURFACE: These essays are all superficial. They are excellent summaries of key ideas, e.g., the double movement, but don&#8217;t really engage the text. They are dressed up notecards.</p><p>In the pre-ChatGPT era, I might have read this and thought &#8220;ok, those sentences are coherent. The student doesn&#8217;t really make much of the texts. It is weird that there are no quotes.&#8221; But now I think, &#8220;clearly an AI did this. I hope that student has other skills than thinking or writing or reading because that is the precarious future awaiting.&#8221;</p><p>Is it worthwhile talking about some of the readings with an AI? For sure. Talk with an AI. Talk with anybody. But don&#8217;t rely on AI or someone else to write your paper for you.</p><p>If, as you read them, you do not see these deficiencies then ask a friend. Or come to one of the teaching staff. I assure you, however, that these are terrible.</p><p>Best regards,</p><p>Prof. Hyman</p><p></p><p>PS: You are thinking, &#8220;but it offers to add footnotes!&#8221; Uh huh. You think that would turn out well?</p><p>PPS: &#8220;What if I iterated on this?&#8221; Nope. It would still be terrible. And even more choppy. AIs are not capable of sustained concentration.</p><p>PPPS: &#8220;But maybe I will just use it to read over my paper for grammar?&#8221; Yeah do that. I do that all the time. That&#8217;s fine. It is a super spell checker. Just don&#8217;t let it overwrite your voice with a million em-dashes or the word &#8220;delve&#8221; or some other fancy talk. Treat it like a friend. You would let a friend read your paper, tell you where you messed up a conjugation, or where something made no sense. That&#8217;s fine. But then you still write it. Do that. I know you can.</p><p>[PROMPTED]</p><p>-----</p><p>Got it, that constraint actually makes these much better, and much harder in the right way.</p><p>Below are <strong>three full essays (&#8776;2100&#8211;2400 words each)</strong> using <strong>only second-semester materials</strong> from your syllabus. Each one:</p><ul><li><p>frames a <strong>clear, non-obvious question</strong></p></li><li><p>centers a <strong>tension</strong></p></li><li><p>compares <strong>multiple thinkers across the semester</strong></p></li><li><p>stays <strong>internalist</strong> (ideas arguing with ideas)</p></li><li><p>avoids drifting back into Smith/Ricardo/Bentham/Mill</p></li></ul><div><hr></div><p><strong>ESSAY 1</strong></p><p><strong>Can Markets Create Social Order Without Destroying It? From Weber to Polanyi to Hayek</strong></p><p><strong>Question</strong></p><p>Can market societies generate stable social order on their own, or do they inevitably undermine the very social structures they depend on?</p><div><hr></div><p><strong>Introduction</strong></p><p>One of the central promises of modern economic thought is that markets can coordinate human activity without centralized control. From prices to competition, the market appears as a self-regulating system capable of producing order out of individual action. Yet across the second semester readings, this promise is repeatedly questioned. Markets generate coordination, but they also appear to dissolve the social bonds and norms that make coordination possible in the first place.</p><p>This essay argues that Weber, Polanyi, and Hayek articulate competing but deeply related visions of the relationship between markets and social order. Weber shows that markets depend on forms of rationalization and impersonal authority that reshape social life. Polanyi argues that marketization disembeds economic activity from social relations, producing instability that requires corrective intervention. Hayek, by contrast, defends the market as a superior information-processing system but must implicitly rely on social norms he cannot fully theorize. Taken together, these thinkers reveal a central tension: markets produce order, but only by transforming, and often eroding, the social foundations on which that order rests.</p><div><hr></div><p><strong>Weber: Markets as Rationalization and Impersonal Order</strong></p><p>Weber provides the starting point by redefining what &#8220;order&#8221; means in a market society. In premodern contexts, economic activity is embedded in tradition, status, and social obligation. Exchange is governed by relationships, not abstract rules. The rise of markets transforms this structure. Economic action becomes oriented toward calculation, efficiency, and profit.</p><p>What is striking in Weber is that the market does not eliminate order, it replaces one kind of order with another. The &#8220;impersonality&#8221; of the market is not a lack of structure, but a new form of structure. Individuals interact not as members of communities, but as participants in a system governed by prices and contracts. This shift is reinforced by the rise of bureaucracy, which Weber describes as the most rational form of organization. Bureaucratic authority replaces personal authority, making economic life predictable and calculable.</p><p>Yet this transformation comes at a cost. Weber&#8217;s famous &#8220;iron cage&#8221; is not simply about alienation, it is about the narrowing of social life. Rationalization produces efficiency, but it also strips away meaning and reduces human relationships to instrumental interactions. The market creates order, but it is a thin form of order, one that depends on formal rules rather than substantive social ties.</p><p>Weber does not argue that markets destroy society, but he does show that they fundamentally reshape it. The question becomes whether this new form of order is sustainable, or whether it generates tensions that cannot be resolved within the system itself.</p><div><hr></div><p><strong>Polanyi: The Disembedding of the Economy</strong></p><p>Polanyi radicalizes Weber&#8217;s insight by arguing that the market does not simply transform social relations, it attempts to separate itself from them entirely. In <em>The Great Transformation</em>, Polanyi describes the rise of the &#8220;self-regulating market&#8221; as a historically unprecedented development. For most of human history, economic activity was embedded in social institutions. The attempt to create a market system governed solely by price signals represents a dramatic rupture.</p><p>The key mechanism of this rupture is the commodification of land, labor, and money. These are not ordinary commodities, they are what Polanyi calls &#8220;fictitious commodities.&#8221; Labor is human life, land is nature, and money is a social relation. Treating them as commodities subjects them to market forces in ways that destabilize society.</p><p>Polanyi&#8217;s central claim is that this disembedding produces a &#8220;double movement.&#8221; On the one hand, markets expand and attempt to govern more areas of life. On the other hand, society pushes back, creating protective institutions to shield itself from the destructive effects of market forces. This dynamic suggests that markets cannot sustain themselves without intervention. Far from being self-regulating, they require constant political and social correction.</p><p>What is crucial here is that Polanyi reframes the problem. The issue is not whether markets can produce order, they clearly can. The issue is whether that order is compatible with social stability. For Polanyi, the answer is no. The logic of the market undermines the very conditions that make social life possible, forcing society to intervene.</p><div><hr></div><p><strong>Hayek: Markets as Knowledge Systems</strong></p><p>Hayek offers the most powerful defense of the market against these critiques. In &#8220;The Use of Knowledge in Society,&#8221; he argues that markets are not just mechanisms of exchange, but systems for processing dispersed information. No central authority can possess the knowledge required to allocate resources efficiently. Prices function as signals that coordinate individual actions without requiring anyone to understand the system as a whole.</p><p>From this perspective, the market&#8217;s impersonality is a strength, not a weakness. It allows coordination among strangers without requiring shared norms or values. Order emerges spontaneously from individual actions, guided by price signals.</p><p>Hayek&#8217;s argument directly challenges Polanyi. If markets are superior information systems, then attempts to regulate them may disrupt the very processes that make them effective. The &#8220;double movement&#8221; becomes, in Hayek&#8217;s view, a source of inefficiency and distortion.</p><p>Yet Hayek&#8217;s position contains a tension of its own. While he emphasizes the role of prices in coordinating behavior, he also acknowledges the importance of norms, rules, and institutions. Markets require a framework of law and social trust to function. Contracts must be enforced, property rights must be protected, and individuals must adhere to certain expectations of behavior.</p><p>What Hayek cannot fully explain is how these norms are maintained. If markets dissolve traditional social structures, where do the norms that sustain them come from? In defending the market, Hayek implicitly relies on social conditions that his theory does not account for.</p><div><hr></div><p><strong>The Tension: Order Through Disruption</strong></p><p>Taken together, Weber, Polanyi, and Hayek reveal a shared problem from different angles. Markets clearly produce a form of order. They coordinate activity, allocate resources, and enable complex systems of exchange. But they do so by transforming the social world.</p><p>Weber shows that markets replace traditional forms of social order with rationalized, impersonal structures. Polanyi argues that this transformation goes too far, disembedding economic activity from social life and generating instability. Hayek defends the market&#8217;s coordinating function but cannot fully account for the social norms that make it possible.</p><p>The tension can be summarized as follows: markets create order by disrupting existing forms of social organization. The question is whether this disruption can be contained, or whether it inevitably leads to instability.</p><div><hr></div><p><strong>Conclusion</strong></p><p>The debate over markets and social order is not a simple disagreement about efficiency or fairness. It is a deeper question about the nature of social life. Weber, Polanyi, and Hayek all recognize that markets are powerful organizing forces, but they differ on whether that power is ultimately constructive or destructive.</p><p>What emerges from this comparison is not a clear answer, but a clearer understanding of the stakes. Markets are not neutral mechanisms, they are transformative institutions. They create order, but they also reshape the conditions under which order is possible. The challenge, then, is not simply to choose between market and state, but to understand how economic systems interact with the social world they inhabit.</p><div><hr></div><p><strong>ESSAY 2</strong></p><p><strong>Is Rational Choice Enough? From Buchanan to Sen to Foucault</strong></p><p><strong>Question</strong></p><p>Can human behavior and social order be adequately explained through rational choice, or does this framework fundamentally misunderstand human action?</p><div><hr></div><p><strong>Introduction</strong></p><p>Modern economic theory often begins with a simple assumption: individuals act rationally to maximize their interests. This framework underlies much of contemporary political economy, from models of markets to theories of governance. Yet across the second semester readings, this assumption is repeatedly challenged.</p><p>This essay argues that Buchanan, Sen, and Foucault represent three distinct responses to the limits of rational choice theory. Buchanan formalizes rational choice as a foundation for political order. Sen critiques its reduction of human motivation to self-interest. Foucault reconceptualizes rationality itself as a product of power and governance. Together, they reveal that rational choice is not simply a neutral analytical tool, but a contested framework with profound implications for how we understand human behavior.</p><div><hr></div><p><strong>Buchanan: Rational Choice as Foundation</strong></p><p>Buchanan and Tullock&#8217;s <em>Calculus of Consent</em> extends rational choice into the realm of politics. Individuals are assumed to act in their self-interest, and political institutions are designed to aggregate these preferences. The goal is to create rules that allow individuals to cooperate while minimizing the costs of collective decision-making.</p><p>What is striking about Buchanan&#8217;s approach is its consistency. The same logic that explains market behavior is applied to political behavior. There is no need to appeal to moral commitments or social norms, individuals pursue their interests, and institutions channel these pursuits into stable outcomes.</p><p>This framework has considerable explanatory power. It provides a clear model of how cooperation can emerge from self-interested behavior. But it also relies on a narrow conception of human motivation. Individuals are treated as utility maximizers, and other aspects of human behavior are either ignored or subsumed under this model.</p><div><hr></div><p><strong>Sen: The Critique of &#8220;Rational Fools&#8221;</strong></p><p>Sen&#8217;s critique targets this narrow conception directly. In &#8220;Rational Fools,&#8221; he argues that reducing human behavior to utility maximization ignores important dimensions of human life, including commitment, identity, and moral reasoning.</p><p>Sen&#8217;s key insight is that individuals often act in ways that cannot be explained by self-interest alone. They may sacrifice personal gain for ethical reasons, adhere to social norms, or act out of a sense of obligation. These actions are not irrational, they reflect a broader conception of rationality.</p><p>By expanding the concept of rationality, Sen challenges the foundation of rational choice theory. If individuals are not simply utility maximizers, then models based on this assumption may fail to capture important aspects of behavior. The problem is not just empirical, it is conceptual. Rational choice theory defines rationality in a way that excludes many forms of human action.</p><div><hr></div><p><strong>Foucault: Rationality as Governance</strong></p><p>Foucault takes the critique further by questioning the very notion of rationality. In <em>The Birth of Biopolitics</em>, he argues that economic rationality is not a natural feature of human behavior, but a product of specific forms of governance.</p><p>For Foucault, neoliberalism does not simply describe how individuals behave, it shapes how they understand themselves. Individuals are encouraged to see themselves as entrepreneurs, constantly optimizing their choices. Rationality becomes a norm imposed by institutions and practices.</p><p>This perspective transforms the debate. The question is no longer whether individuals are rational, but how rationality is defined and enforced. Rational choice theory is not just an analytical tool, it is part of a broader system of power that shapes behavior.</p><div><hr></div><p><strong>Conclusion</strong></p><p>The debate over rational choice is not just about the accuracy of a model, it is about the nature of human action. Buchanan, Sen, and Foucault reveal that rationality is not a fixed concept, but a contested one. Understanding its limits requires moving beyond simple assumptions and engaging with the complexity of human behavior.</p><div><hr></div><p><strong>ESSAY 3</strong></p><p><strong>Does Capitalism Require Inequality? From Du Bois to Polanyi to Robinson</strong></p><p><strong>Question</strong></p><p>Is inequality an accidental feature of capitalism, or is it structurally necessary to its operation?</p><div><hr></div><p><strong>Introduction</strong></p><p>Capitalism is often justified as a system that promotes efficiency and growth. Inequality is sometimes seen as an unfortunate byproduct, but not a defining feature. Yet across the second semester readings, inequality appears not as an accident, but as a structural component of the system.</p><p>This essay argues that Du Bois, Polanyi, and Cedric Robinson each demonstrate that inequality is deeply embedded in capitalist development. Whether through race, class, or global hierarchy, capitalism relies on forms of differentiation that enable accumulation. Inequality is not a deviation from the system, it is one of its conditions of possibility.</p><div><hr></div><p><strong>Du Bois: Race and Labor</strong></p><p>Du Bois&#8217;s analysis of Reconstruction reveals how racial divisions were used to structure labor markets. The &#8220;wages of whiteness&#8221; provided psychological and social benefits to white workers, even as they remained economically exploited.</p><p>This dynamic prevented the formation of class solidarity and allowed capital to maintain control. Inequality was not simply economic, it was racialized, creating divisions that sustained the system.</p><div><hr></div><p><strong>Polanyi: Market Expansion and Social Disruption</strong></p><p>Polanyi&#8217;s account of market expansion highlights how inequality emerges from the commodification of labor and land. As markets expand, they create disparities that destabilize society.</p><p>These disparities are not accidental, they result from the logic of the market itself. The attempt to treat labor as a commodity generates inequalities that must be managed through social and political intervention.</p><div><hr></div><p><strong>Robinson: Racial Capitalism</strong></p><p>Robinson extends this analysis by arguing that capitalism has always been racial. The system did not emerge in a neutral context, it developed through processes of conquest, slavery, and colonialism.</p><p>Inequality is therefore not a byproduct, it is foundational. Capitalism depends on the exploitation of differentiated populations, and these differences are often constructed along racial lines.</p><div><hr></div><p><strong>Conclusion</strong></p><p>The question of inequality is not external to capitalism, it is internal. Du Bois, Polanyi, and Robinson show that inequality is not simply a problem to be solved, it is a feature of the system itself. Understanding capitalism requires confronting this reality.</p><div><hr></div><p>If you want, I can:</p><ul><li><p>Make one of these intentionally <strong>&#8220;B+&#8221; instead of A</strong> for contrast</p></li><li><p>Add <strong>footnotes/citations in Chicago style</strong></p></li><li><p>Or write a <strong>fourth essay using environment/climate (Beck/Klein/Lomborg)</strong> which would be very strong pedagogically</p></li></ul><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/on-having-faith-in-your-students?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/on-having-faith-in-your-students?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p></p><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[AI needs historians as much as historians need AI]]></title><description><![CDATA[Digital history did not fail in isolation&#8212;it failed just before the world discovered how much it needed it.]]></description><link>https://computationalhistory.substack.com/p/ai-needs-historians-as-much-as-historians</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/ai-needs-historians-as-much-as-historians</guid><dc:creator><![CDATA[Jo Guldi]]></dc:creator><pubDate>Thu, 09 Apr 2026 12:52:47 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Af3H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166c0c01-187e-4f38-bfa1-ef0a102f69c1_1002x676.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Digital history did not fail in isolation&#8212;it failed just before the world discovered how much it needed it. As artificial intelligence systems attempt to interpret vast bodies of text, they confront problems of evidence, disagreement, and representation that historians have spent centuries learning to navigate. The future of AI may depend not on replacing the humanities, but on rediscovering what they know.</p><p>But here&#8217;s the problem. At the very moment when artificial intelligence is reorganizing the economy and the university, its central challenges are, in fact, historical ways of knowing. In computer science, tasks such as retrieval, benchmarking, and the establishment of ground truth all depend on the selection, comparison, and evaluation of documents within large corpora. Determining whether outputs correspond to reliable bodies of evidence is not a purely technical question. It is an interpretive one.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Computational History! Subscribe for free to receive new posts and support the work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Historians have long developed methods for addressing precisely these challenges; it was these approaches that I review in the opening chapters of <em>The Dangerous Art of Text Mining, </em>showing how information scientists who attempt large-scale analysis without historians&#8217; understandings of the bias of archives and algorithms has led to retracted journal articles and unsupportable findings. I argued that historians&#8217; methods, paired with NLP strategies, could make a &#8220;smarter&#8221; data science, which matched inferences, algorithms, and sources .</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Af3H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166c0c01-187e-4f38-bfa1-ef0a102f69c1_1002x676.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Af3H!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166c0c01-187e-4f38-bfa1-ef0a102f69c1_1002x676.png 424w, https://substackcdn.com/image/fetch/$s_!Af3H!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166c0c01-187e-4f38-bfa1-ef0a102f69c1_1002x676.png 848w, https://substackcdn.com/image/fetch/$s_!Af3H!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166c0c01-187e-4f38-bfa1-ef0a102f69c1_1002x676.png 1272w, https://substackcdn.com/image/fetch/$s_!Af3H!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166c0c01-187e-4f38-bfa1-ef0a102f69c1_1002x676.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Af3H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166c0c01-187e-4f38-bfa1-ef0a102f69c1_1002x676.png" width="1002" height="676" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/166c0c01-187e-4f38-bfa1-ef0a102f69c1_1002x676.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:676,&quot;width&quot;:1002,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1327259,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/192767444?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166c0c01-187e-4f38-bfa1-ef0a102f69c1_1002x676.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Af3H!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166c0c01-187e-4f38-bfa1-ef0a102f69c1_1002x676.png 424w, https://substackcdn.com/image/fetch/$s_!Af3H!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166c0c01-187e-4f38-bfa1-ef0a102f69c1_1002x676.png 848w, https://substackcdn.com/image/fetch/$s_!Af3H!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166c0c01-187e-4f38-bfa1-ef0a102f69c1_1002x676.png 1272w, https://substackcdn.com/image/fetch/$s_!Af3H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F166c0c01-187e-4f38-bfa1-ef0a102f69c1_1002x676.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>Today, a growing body of work by historians shows that historians are on the forefront of the task of retooling AI to improve its retrieval mechanisms, confronting messy sources on historians&#8217; terms. In <a href="https://doi.org/10.1080/01615440.2025.2512744">&#8220;Data retrieval from local heritage books&#8212;Is artificial intelligence the solution?&#8221;</a>, Robert Stelter and Rafael Biehler test LLM-based extraction against code-based and manual methods and show both the promise and the limits of AI when working with irregular historical books, reminding us that retrieval from the archive is never just a technical matter but a problem of source structure, error, and judgment. In Stewart Spencer Dean and Sanskriti Sinha&#8217;s <a href="https://doi.org/10.1017/chr.2025.10019">&#8220;Retrieving information from unstructured historical sources using large language models&#8221;</a>, the authors show that LLMs designed to mirror historians&#8217; workflows can extract structured information from difficult historical documents more flexibly than older pipelines, while also underscoring how fragile reproducibility and transparency become when retrieval depends on opaque model behavior. In <a href="https://journalofdigitalhistory.org/en/article/JZx9gw7iwGxb">&#8220;Mapping the Latent Past: Assessing Large Language Models as Digital Tools through Source Criticism&#8221;</a>, Daniel Hutchinson demonstrates how to evaluate LLMs as historians would evaluate sources: not only for fluency or accuracy in isolation, but for provenance, distortion, omission, and the ways they reshape historical memory. Taken together, these studies suggest that historians are not merely future users of AI tools; they are helping define what reliable retrieval, evaluation, and interpretation should mean in the first place.</p><p>The selection of the most relevant examples from a large corpus is an obvious example of an instance where historians turn to their own methods when selecting cases, assembling archives, and balancing typical and exceptional examples without collapsing their differences.</p><p>The &#8216;alignment problem,&#8217; often framed as ensuring that AI systems produce trustworthy outputs, also presents a problem of managing agreement and disagreement across documents. At stake is not only accuracy but the structure of knowledge itself. As AI systems scale, they tend to normalize&#8212;to compress disagreement into dominant narratives, privileging what appears most frequent, coherent, or statistically central. Without explicit intervention, AI risks erasing this plurality, producing a flattened account of human experience that undermines democratic reasoning. Embedding historical methods into AI systems&#8212;methods attuned to comparison, contradiction, and the coexistence of multiple valid perspectives&#8212;ensures that technological scale does not come at the expense of epistemic diversity. In this sense, history is not simply a content domain for AI, but a necessary foundation for building systems that can represent the past, and reason about it, in ways that remain open, plural, and accountable.</p><p>Here, too, historians have solutions. Historians have long developed methods for identifying consensus, tracing dissent, and preserving plurality without reducing it to a single dominant narrative. They have worked on preserving small-scale differences of &#8220;little&#8221; details even within the large-scale arc of meaning-making typical of so-called &#8220;big history.&#8221; Historians have shown that what matters most in the record of the past is often not consensus but dissent: competing interpretations, marginalized voices, and unresolved conflicts that shape political and cultural change.</p><p>Beyond AI, the challenges of curating data extend across the disciplines that have engaged with data science. As scholars such as Arthur Spirling have noted, fields like political science increasingly rely on large-scale textual analysis but often lack robust frameworks for ensuring that their corpora are complete and representative.</p><p>This is not a trivial issue. To study labor history, one cannot rely on a single archive or even a handful of sites. One must assemble sources that reflect different classes, migrant and ethnic experiences, institutional perspectives, and temporal changes shaped by major events. The problem is not simply one of scale, but of completeness: bringing together heterogeneous materials into a coherent evidentiary base.</p><p>This challenge extends to law, medicine, sociology, and beyond. The interpretation of large textual corpora depends on assembling archives that are sufficiently comprehensive, balanced, and historically grounded. The expertise required to build and evaluate such archives&#8212;understanding what is missing, what is overrepresented, and how sources relate&#8212;has traditionally been concentrated in history.</p><p>The expertise, in other words, already exists. However, it is dispersed across institutions such as Saskatchewan, SMU, George Mason, Clemson, and Waterloo, rather than integrated into the central infrastructures of major research universities.</p><p>Yet the university as a whole increasingly depends on precisely this kind of knowledge. History can contribute to AI by providing methods for retrieval, evaluation, and the preservation of plurality. It can contribute to the world by enabling large-scale understanding of problems such as climate governance, where decades of negotiation must be analyzed across multiple actors and perspectives. It can contribute to other disciplines by offering frameworks for assembling and interpreting complete and representative archives.</p><p>The next step is clear. What is needed is not another generation of small projects, but a coordinated effort&#8212;a kind of archival and analytical moonshot&#8212;that brings together historians, computational experts, and domain specialists to build comprehensive, multi-source, multi-lingual corpora and the methods to interpret them.</p><p>Such an effort must be led by historians, because the central problem is not simply one of computation, but of judgment: what counts as evidence, what counts as completeness, and how meaning is constructed across time.</p><p>The digital breakthrough will happen when those questions are treated not as peripheral, but as foundational.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption"></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/ai-needs-historians-as-much-as-historians?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/ai-needs-historians-as-much-as-historians?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p>]]></content:encoded></item><item><title><![CDATA[You Deserve The Cluster]]></title><description><![CDATA[The case for running your historical collections through research computing, and why it costs less than you think.]]></description><link>https://computationalhistory.substack.com/p/you-deserve-the-cluster</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/you-deserve-the-cluster</guid><dc:creator><![CDATA[Loren Moulds]]></dc:creator><pubDate>Mon, 06 Apr 2026 12:49:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!oi8g!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12d24228-4741-48cb-96a7-f35d4e2f5b83_998x994.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I have sensed a quiet assumption in the humanities that high-performance computing belongs to someone else: to the physicists modeling gravitational waves or the geneticists sequencing entire populations or to the economists running enormous simulations. These daunting clusters, with their GPUs, their job schedulers, and their cryptic scripts, feel like infrastructure built for disciplines that trade in numbers, not narratives.</p><p>As a historian and archivist, I spent years thinking the same thing. But my projects&#8212;and yours, fellow historian&#8212;might just belong on that cluster too. Consider the handwritten records you&#8217;ve been photographing on research trips; or the correspondence you&#8217;ve been transcribing in the reading room; or those thousand-page minute books, the land surveys, the immigration registers rich with details of space and place. These collections are exactly the kind of work that research computing was built to support. And the tools available right now make the case almost embarrassingly easy.</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share Computational History&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share Computational History</span></a></p><p></p><p><strong>What changed?</strong></p><p>A generation of powerful, open-source vision-language models arrived. These are models that can look at an image of a faded, ink-stained, overlapping-cursive index card and return structured, searchable data. Not perfect data. But useful data: the kind that turns a microfilm scroll into a queryable database. They handle handwriting, mixed layouts, rubber stamps, marginal annotations; the visual noise and contextual complexity that defeated traditional OCR for decades. And because they are open-source, they can run on hardware your institution might already own.</p><p><strong>Two projects, one architecture</strong></p><p>My colleagues and I have been testing this across two projects that sit at very different ends of the archival spectrum.</p><p>The first involves over a million government administrative index cards held at a federal archive: cards dense with classification codes, agency stamps, and idiosyncratic cursive spanning decades of bureaucratic correspondence. The second involves handwritten court</p><p>records from the eighteenth century: daily administrative logs of civil litigation, five to fifteen case entries per page, produced by clerks whose penmanship was optimized for speed, not legibility.</p><p>Even though these records come from different centuries, different hands, different archives, the pipelines we built to process them are effectively the same. We take scanned images, feed them to a large vision-language model with carefully designed prompts, and get back structured output in the form of JSON records with fields like dates, agencies, parties, classifications, and body text. The model and pipeline handle the reading and the data extraction. And critically, it runs not on an expensive commercial API but on university research computing nodes equipped with GPUs, using open-source models we downloaded and deployed ourselves.</p><p><strong>You probably already have access</strong></p><p>The barrier to entry is lower than you think. This is the part I want historians to hear most clearly. If you are affiliated with a research university, a library consortium, a national lab, or any institution with a research computing group, you almost certainly have access to GPU-equipped clusters. These machines sit in basements and server rooms running jobs for chemists and engineers, and they have capacity. Research computing groups are, in my experience, genuinely eager to support humanities projects. They want the diversity of use cases and they want to demonstrate broad impact.</p><p>For the index card project, we run a 72-billion-parameter open-source vision model on NVIDIA A100 GPUs through a Slurm-managed pipeline. It processes thousands of cards per hour, around the clock, for the cost of allocated compute time, which, at a university, often means free or close to it. No per-token API fees. No data leaving your institution&#8217;s network. No licensing negotiations. Open-weight models like Qwen, LLaMA, and Mistral can be downloaded and deployed on institutional hardware with nothing more than a few configuration files and some experimentation.</p><p>Compare that to running the same work through a commercial API. At current pricing, processing over a million images through a cloud vision-language model would cost tens of thousands of dollars. On a university cluster, it costs compute time that was already budgeted and even where it isn&#8217;t, the cost is a fraction of the commercial alternative.</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oi8g!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12d24228-4741-48cb-96a7-f35d4e2f5b83_998x994.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oi8g!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12d24228-4741-48cb-96a7-f35d4e2f5b83_998x994.png 424w, https://substackcdn.com/image/fetch/$s_!oi8g!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12d24228-4741-48cb-96a7-f35d4e2f5b83_998x994.png 848w, https://substackcdn.com/image/fetch/$s_!oi8g!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12d24228-4741-48cb-96a7-f35d4e2f5b83_998x994.png 1272w, https://substackcdn.com/image/fetch/$s_!oi8g!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12d24228-4741-48cb-96a7-f35d4e2f5b83_998x994.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oi8g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12d24228-4741-48cb-96a7-f35d4e2f5b83_998x994.png" width="525" height="522.8957915831663" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/12d24228-4741-48cb-96a7-f35d4e2f5b83_998x994.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:994,&quot;width&quot;:998,&quot;resizeWidth&quot;:525,&quot;bytes&quot;:1885084,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/192767095?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12d24228-4741-48cb-96a7-f35d4e2f5b83_998x994.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!oi8g!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12d24228-4741-48cb-96a7-f35d4e2f5b83_998x994.png 424w, https://substackcdn.com/image/fetch/$s_!oi8g!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12d24228-4741-48cb-96a7-f35d4e2f5b83_998x994.png 848w, https://substackcdn.com/image/fetch/$s_!oi8g!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12d24228-4741-48cb-96a7-f35d4e2f5b83_998x994.png 1272w, https://substackcdn.com/image/fetch/$s_!oi8g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12d24228-4741-48cb-96a7-f35d4e2f5b83_998x994.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Privacy by default</strong></p><p>There is another benefit that matters enormously for the kinds of records historians work with. When you run models on institutional infrastructure, your data never leaves the building. No images uploaded to a third-party server, no extracted text passing through someone else&#8217;s cloud. For projects involving records tied to individual people, families, and communities &#8212; personnel files, medical histories, correspondence, community administrative records &#8212; this is not a minor convenience but an ethical requirement. Running on local compute gives you that by default.</p><p><strong>The real barrier is permission: your own</strong></p><p>The honest obstacle for most historians is not entirely technical. It is the feeling that their project is too small, too niche, too humanistic to justify claiming space on a shared computing resource. Perhaps processing a few thousand court documents does not warrant the same infrastructure that simulates protein folding.</p><p>But consider what is actually happening when you run a model over a collection of historical documents. You are converting unstructured, inaccessible primary sources into structured, searchable, analyzable data. You are building something that other researchers, genealogists, and communities can potentially use. You are doing exactly what research infrastructure exists to support.</p><p>Remember, as well, that the scales vary. Not every project is a million images. Sometimes it is four hundred pages of minute books, or a single box of correspondence. The pipeline scales down just as well as it scales up. A few hundred images can process in an afternoon. And the experience of building that first pipeline &#8212; writing your first extraction prompt, submitting your first batch job, seeing structured records come back from documents you thought were illegible &#8212; changes how you think about what is possible for every collection you encounter afterward.</p><p>Your project is not too small nor too messy. The resources are closer than you think. And the learning curve is shorter than the one you already climbed to get into the archive.</p><div><hr></div><p>Github: <a href="https://github.com/uvalawlibrary/hpc-vlm-starter">https://github.com/uvalawlibrary/hpc-vlm-starter</a></p><div><hr></div><p></p><p></p><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption"><em><strong>Learned something? Share with someone.</strong></em></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/you-deserve-the-cluster?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/you-deserve-the-cluster?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[Agentic Coding for Humanists]]></title><description><![CDATA[Bottlenecks, Side Quests, and the Calculus of Historical Research]]></description><link>https://computationalhistory.substack.com/p/agentic-coding-for-humanists</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/agentic-coding-for-humanists</guid><dc:creator><![CDATA[Cameron Blevins]]></dc:creator><pubDate>Thu, 02 Apr 2026 13:26:42 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!VPMQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd57e07a-cab1-44fe-900e-c3521677534b_1342x856.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When I was researching my dissertation, I came across an intriguing snippet in an otherwise boring, 900-page government document: a four-page table showing how long it took mail to travel by railway between roughly 130 US cities and twelve major railroad hubs in 1882. This set my wheels spinning: what if I could use this to create a nationwide map of how fast information traveled between different parts of the country and visualize how those connections changed over time?</p><p>Then I sat down to actually do this and realized that it would be a giant pain in the ass. I knew this little side quest was technically possible; I just wasn&#8217;t sure the analytical payoff would be worth the investment of time and skill to actually execute it. So I abandoned my vision. I downloaded a few PDFs of these transit tables into a folder on my computer, where they sat for the next fourteen years.</p><p>Every historian has these kinds of orphaned sources and abandoned side quests. Maybe it&#8217;s a collection of petitions you never had time to transcribe or a city business directory that you never got around to organizing into a spreadsheet. They often get left behind because of a basic calculus: <strong>is the juice worth the squeeze?</strong> Will an uncertain analytical payoff be worth the hours it would take you to transcribe the source or the money it would take to pay a research assistant to do it for you? Even if you <em>did</em> manage to get that directory into a spreadsheet, what then? Maybe you&#8217;d want to look for spatial patterns by overlaying all those businesses onto a historical map of the city. But that would mean learning how to use ArcGIS or finding a way to hire or work with someone who already does &#8211; all with no guarantee that you&#8217;ll find anything of interest. So you move on.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/subscribe?"><span>Subscribe now</span></a></p><p>These two bottlenecks &#8211; the <strong>time and cost of transcribing sources</strong> and the <strong>technical skill to analyze or visualize them</strong> &#8211; have traditionally closed off many avenues of historical research before they ever began. In 2026, that&#8217;s changed. Projects that would have once required hundreds of hours and some combination of technical training, research funds, collaborators, and research assistants are now within reach for an individual researcher with minimal funding and little to no technical training - and they can be done in a fraction of the time. The reason for this shift? An uptick in the &#8220;reasoning&#8221; capacities of Large Language Models (LLMs) and a new generation of AI coding agents.</p><p>To show what I mean, let&#8217;s return to my abandoned mail transit tables that had been sitting on my laptop for the past fourteen years. I recently pulled them up to see if generative AI could help me finally complete my side quest. Over the span of about a month, spread across a half-dozen 1-2 hour sessions, I used Claude Code and Gemini to go from a folder of PDFs all the way to a fully working, interactive web visualization: <a href="https://cblevins.github.io/mail-time/">&#8220;How Fast Was the Mail?&#8221;</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VPMQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd57e07a-cab1-44fe-900e-c3521677534b_1342x856.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VPMQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd57e07a-cab1-44fe-900e-c3521677534b_1342x856.png 424w, https://substackcdn.com/image/fetch/$s_!VPMQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd57e07a-cab1-44fe-900e-c3521677534b_1342x856.png 848w, https://substackcdn.com/image/fetch/$s_!VPMQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd57e07a-cab1-44fe-900e-c3521677534b_1342x856.png 1272w, https://substackcdn.com/image/fetch/$s_!VPMQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd57e07a-cab1-44fe-900e-c3521677534b_1342x856.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VPMQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd57e07a-cab1-44fe-900e-c3521677534b_1342x856.png" width="1342" height="856" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dd57e07a-cab1-44fe-900e-c3521677534b_1342x856.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:856,&quot;width&quot;:1342,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:666944,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/192541145?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd57e07a-cab1-44fe-900e-c3521677534b_1342x856.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VPMQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd57e07a-cab1-44fe-900e-c3521677534b_1342x856.png 424w, https://substackcdn.com/image/fetch/$s_!VPMQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd57e07a-cab1-44fe-900e-c3521677534b_1342x856.png 848w, https://substackcdn.com/image/fetch/$s_!VPMQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd57e07a-cab1-44fe-900e-c3521677534b_1342x856.png 1272w, https://substackcdn.com/image/fetch/$s_!VPMQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd57e07a-cab1-44fe-900e-c3521677534b_1342x856.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: center;">Cameron Blevins, <a href="https://cblevins.github.io/mail-time/">&#8220;How Fast Was the Mail?&#8221;</a> (March 26, 2026).</p><p style="text-align: center;"></p><p>The two bottlenecks that had caused me to abandon this source all those years were no longer major barriers: <strong>I didn&#8217;t transcribe any data by hand or write a single line of code myself</strong> during this process. In the rest of this post, I want to walk through how generative AI tools handled these bottlenecks and some of the larger implications for historical research.</p><h2><strong>The Transcription Bottleneck</strong></h2><p>The first step in most digital projects is to take a historical source and turn it into machine-readable data. Traditionally this has required either transcribing it by hand or using <a href="https://ocr.northeastern.edu/report/">Optical Character Recognition (OCR)</a> tools to try to transcribe it automatically. Depending on the source, OCR can be quite challenging. Take my mail transit tables. Beyond the spotty typeface (missing or smudged characters, a <code>3</code> that looks like an <code>8</code>, etc.), these kind of tables present particular difficulties: mess up the alignment for one row or column and that mistake can cascade across the entire document. Traditional OCR tools like ABBY FineReader or Tesseract might see the following:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!n1Xs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!n1Xs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png 424w, https://substackcdn.com/image/fetch/$s_!n1Xs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png 848w, https://substackcdn.com/image/fetch/$s_!n1Xs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png 1272w, https://substackcdn.com/image/fetch/$s_!n1Xs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!n1Xs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png" width="1016" height="311" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/01a78057-5463-4c9c-851c-774675357af9_1016x311.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:311,&quot;width&quot;:1016,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:&quot;1882 Mail Transit Table&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="1882 Mail Transit Table" srcset="https://substackcdn.com/image/fetch/$s_!n1Xs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png 424w, https://substackcdn.com/image/fetch/$s_!n1Xs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png 848w, https://substackcdn.com/image/fetch/$s_!n1Xs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png 1272w, https://substackcdn.com/image/fetch/$s_!n1Xs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: center;"><a href="https://babel.hathitrust.org/cgi/pt?id=mdp.39015063600780&amp;seq=609">United States Official Postal Guide</a> (January 1882), p. 582.</p><p style="text-align: center;"></p><p>&#8230;and understandably transcribe it as:</p><blockquote><p><code>rmont.<br>lontpeller ....&#8226;....... 8 55 18 00</code></p></blockquote><p>This is where Vision Language Models (VLMs) come into play. Traditional OCR tools work through pattern recognition: detect where text appears on a page and then identify individual characters and words. VLMs take a different approach. Rather than processing characters and words in isolation, they integrate visual perception alongside a sophisticated understanding of language and the relationships between words. This allows VLMs to recognize that the above source is a table of cities organized alphabetically by state, that <code>lontpeller</code> falls under the Vermont heading, and that the intended city is therefore likely to be <code>Montpelier</code>. <sup>1</sup></p><p>At least, that&#8217;s the idea. When I first pointed Claude Code to my folder of PDFs of transit tables and told it to extract the data from them, it failed miserably. Rather than using a VLM approach, it kept trying to use Tesseract to perform traditional OCR on the documents. So I pivoted to a different generative AI tool: Google&#8217;s Gemini 3.1 Pro. After some adjustments to my prompts, I was able to upload a four-page PDF of a single transit table and get back a fully usable, accurately transcribed dataset. Not only did Gemini handle things like missing or smudged letters, it was able to recognize irregularities, maintain table alignment, and reformat everything into a single usable spreadsheet for each PDF &#8211; all in just a few minutes. <sup>2</sup></p><p>A sample from Gemini&#8217;s &#8220;model thoughts&#8221; shows the difference between traditional OCR vs. a VLM approach to transcription: </p><blockquote><p><code>I'm now focusing on formatting the data into the "long" format for the CSV, refining the output structure. I will rigorously check OCR outputs, specifically converting "H. M." values to decimal hours, and correcting any potential errors, such as "Little Kock" to "Little Rock."</code> </p></blockquote><p>This isn&#8217;t pattern matching; this is using reasoning to translate a historical source into usable data &#8211; much like a human researcher would do. <sup>3</sup></p><p>In short, Gemini was able to turn this:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!n1Xs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!n1Xs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png 424w, https://substackcdn.com/image/fetch/$s_!n1Xs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png 848w, https://substackcdn.com/image/fetch/$s_!n1Xs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png 1272w, https://substackcdn.com/image/fetch/$s_!n1Xs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!n1Xs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png" width="1016" height="311" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/01a78057-5463-4c9c-851c-774675357af9_1016x311.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:311,&quot;width&quot;:1016,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:&quot;1882 Mail Transit Table&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="1882 Mail Transit Table" srcset="https://substackcdn.com/image/fetch/$s_!n1Xs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png 424w, https://substackcdn.com/image/fetch/$s_!n1Xs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png 848w, https://substackcdn.com/image/fetch/$s_!n1Xs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png 1272w, https://substackcdn.com/image/fetch/$s_!n1Xs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01a78057-5463-4c9c-851c-774675357af9_1016x311.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>&#8230;into this:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dpQQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a56f1ad-e86d-49aa-84f1-5d1cc07c77f3_1452x254.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dpQQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a56f1ad-e86d-49aa-84f1-5d1cc07c77f3_1452x254.png 424w, https://substackcdn.com/image/fetch/$s_!dpQQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a56f1ad-e86d-49aa-84f1-5d1cc07c77f3_1452x254.png 848w, https://substackcdn.com/image/fetch/$s_!dpQQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a56f1ad-e86d-49aa-84f1-5d1cc07c77f3_1452x254.png 1272w, https://substackcdn.com/image/fetch/$s_!dpQQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a56f1ad-e86d-49aa-84f1-5d1cc07c77f3_1452x254.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dpQQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a56f1ad-e86d-49aa-84f1-5d1cc07c77f3_1452x254.png" width="1452" height="254" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4a56f1ad-e86d-49aa-84f1-5d1cc07c77f3_1452x254.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:254,&quot;width&quot;:1452,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:43445,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/192541145?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a56f1ad-e86d-49aa-84f1-5d1cc07c77f3_1452x254.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dpQQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a56f1ad-e86d-49aa-84f1-5d1cc07c77f3_1452x254.png 424w, https://substackcdn.com/image/fetch/$s_!dpQQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a56f1ad-e86d-49aa-84f1-5d1cc07c77f3_1452x254.png 848w, https://substackcdn.com/image/fetch/$s_!dpQQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a56f1ad-e86d-49aa-84f1-5d1cc07c77f3_1452x254.png 1272w, https://substackcdn.com/image/fetch/$s_!dpQQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a56f1ad-e86d-49aa-84f1-5d1cc07c77f3_1452x254.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The technical capacity for accurately transcribing messy historical sources is there, while the usability of these tools is still catching up. My experience with Claude Code &#8211; which defaulted to traditional OCR rather than using a VLM approach &#8211; shows how uneven the landscape still is. But given the pace of change, it feels like a gap that&#8217;s going to close very quickly (if it hasn&#8217;t already). <sup>4</sup></p><h2><strong>The Coding Bottleneck</strong></h2><p>The second bottleneck comes down to the technical skill of coding. Any historian can transcribe a transit table into a spreadsheet; the bottleneck is time and labor, not skill. Building a fully interactive, map-based web visualization that responds to user input and presents temporal shifts in spatial data? Vanishingly few historians have the skill and training to do this. If you are <em>not</em> one of those unicorn historians who knows how to define a CSS hover effect or implement a JavaScript event listener, then up until a few months ago your options were limited. You could find a technical collaborator, get funding to hire a developer, or accept the constraints and limitations of out-of-the-box software like Tableau or ArcGIS StoryMaps.</p><p>This changed in late 2025, when Anthropic released a major new update for the model behind its agentic coding system, Claude Code. Prior to this, <a href="https://en.wikipedia.org/wiki/Vibe_coding">vibe coding</a> (ie. describing what you want in non-technical language and having an LLM write the code for you) necessitated at least <em>some</em> technical fluency, like knowing how to run Python in the command line. Today, that barrier has all but vanished. Someone with zero coding experience can install Claude or OpenAI&#8217;s Codex, write a few prompts, and get back a fully functioning website, visualization, or app a few minutes later.</p><p>When I sat down to test out Claude Code with my mail transit tables, I decided <strong>I was not going to write a single line of code myself.</strong> I wasn&#8217;t going to review code and approve it, I wasn&#8217;t going to tweak variables until something worked, I wasn&#8217;t going to lean on prior experience to debug or catch errors. Instead, I channeled a colleague with zero coding experience. I just described what I wanted, in plain language, and let Claude Code do its thing. This was my very first prompt:</p><blockquote><p><em>There are two files in this folder: a PDF of mail transit times between major cities in 1882 and the transcribed data from that PDF. I want to use the transcribed data to build some kind of interactive app where a user can quickly see and visualize the data.</em></p></blockquote><p>After some further back-and-forth to help clarify what I wanted, Claude Code worked for about 5-6 minutes. When it finished, I opened the HTML file it had created and immediately said &#8220;oh shit.&#8221; Here was a fully working and interactive prototype of that original vision I had fourteen years ago when I first stumbled onto this source:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!elXD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632e6121-d4b1-441d-9e91-c25bf18d5887_1236x962.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!elXD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632e6121-d4b1-441d-9e91-c25bf18d5887_1236x962.png 424w, https://substackcdn.com/image/fetch/$s_!elXD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632e6121-d4b1-441d-9e91-c25bf18d5887_1236x962.png 848w, https://substackcdn.com/image/fetch/$s_!elXD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632e6121-d4b1-441d-9e91-c25bf18d5887_1236x962.png 1272w, https://substackcdn.com/image/fetch/$s_!elXD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632e6121-d4b1-441d-9e91-c25bf18d5887_1236x962.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!elXD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632e6121-d4b1-441d-9e91-c25bf18d5887_1236x962.png" width="1236" height="962" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/632e6121-d4b1-441d-9e91-c25bf18d5887_1236x962.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:962,&quot;width&quot;:1236,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:783723,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/192541145?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632e6121-d4b1-441d-9e91-c25bf18d5887_1236x962.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!elXD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632e6121-d4b1-441d-9e91-c25bf18d5887_1236x962.png 424w, https://substackcdn.com/image/fetch/$s_!elXD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632e6121-d4b1-441d-9e91-c25bf18d5887_1236x962.png 848w, https://substackcdn.com/image/fetch/$s_!elXD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632e6121-d4b1-441d-9e91-c25bf18d5887_1236x962.png 1272w, https://substackcdn.com/image/fetch/$s_!elXD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632e6121-d4b1-441d-9e91-c25bf18d5887_1236x962.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: center;">Initial prototype of <a href="https://cblevins.github.io/mail-time/">&#8220;How Fast Was the Mail?&#8221;</a> (February 13, 2026)</p><p>Watching Claude Code build something like this from scratch felt like sorcery. But I would argue that the <strong>iterative capacities</strong> of agentic coding - the back-and-forth of refining, adjusting, and experimenting that turns a prototype into something you actually want to show people - is even more significant. Iteration has traditionally come with a lot of friction. If you&#8217;re writing the code yourself, iteration means trying to change something, breaking your entire project, and spending hours trying to debug it. If you&#8217;re working with a web developer, iteration means sending over your proposed changes, waiting for them to implement your request, realizing it made everything worse, and sheepishly asking them to walk it all back.</p><p>Vibe coding removes much of the friction of iteration. From my initial prototype, I could easily add additional transit table data from other years, build out more interactive functionality, and make a wide range of design and layout adjustments and tweaks. If something broke, Claude Code was able to fix it. Here is an example instruction I wrote during this process:</p><blockquote><p><em>When you click or select a hub, have all cities show up in a scrollable table on the right. Add a hover function so that when you hover over a city on the map it subtly highlights it in the table and vice versa. Something is wonky with St. Louis when you click it, it turns grey not blue.</em></p></blockquote><p>Above all, I was struck by how Claude Code reduced so many of the pain points of coding by hand. To take one example: a few sessions in, I remembered that the economic historian Jeremy Atack had <a href="https://my.vanderbilt.edu/jeremyatack/data-downloads/">compiled a dataset of historical railroad lines</a> that would be perfect to add to my visualization. If I had been coding this myself, I would have had to wrestle with coordinate systems and map projections &#8211; my personal Achilles heel as a spatial historian that would have left me Googling &#8220;<code>WGS 1984 shapefile</code>&#8221; for the 1,284th time in my career before closing my laptop in frustration. Instead, I just downloaded Atack&#8217;s dataset and told Claude Code to add it to the map. And it did, without a single hiccup. As many others have noted: Claude Code take the drudgery of coding and makes it, well, fun.</p><p>Is the code behind my visualization especially elegant or well-structured? Probably not! I&#8217;m no web developer, but my hunch is that a gargantuan HTML file with 10,600 lines of code is not going to win any awards. But, for my purposes as a researcher, that doesn&#8217;t really matter: it just worked. <sup>5</sup></p><h2><strong>Side Quests</strong></h2><p>One obvious effect of the rise of agentic AI and reasoning models is that the field of digital history is now open to a much wider swath of researchers than ever before. A lot of the end products of digital history &#8211; online exhibits, project websites, maps, charts, network graphs, etc. &#8211; that used to require a certain degree of technical fluency can now be built with Generative AI. <sup>6</sup></p><p>A less obvious effect of these changes is that many more research side quests are now worth pursuing <strong>even when you don&#8217;t know if they&#8217;ll pay off.</strong> Side quests are essential to the research process. Even the dead ends are often valuable: a chance for serendipity, learning new skills, reframing research questions, or stumbling into entirely new lines of inquiry. But side quests come with concrete costs. Historians are constantly making judgement calls about which sources to dig into and which to put aside. Ideally, these decisions are driven solely by the research questions themselves; in practice, they involve the kind of calculations I did as a graduate student when I came across that first mail transit table: was the juice worth the squeeze?</p><p>And guess what? <strong>It wasn&#8217;t!</strong> As far as I can tell, my snazzy transit visualization shows precisely the kind of unsurprising patterns you&#8217;d expect to see: an overall shrinking of informational time-space between 1882 and 1908, lengthy transit lags for the Deep South and patches of the American West, and big leaps in transit gains following the completion of major railroad lines. I could be wrong; maybe something new is hiding in there and some other historian or geographer or communication scholar will find it. <sup>7 </sup>But at first pass, I didn&#8217;t see all that much potential in terms of publishable results.</p><p>If I had decided to execute my original vision back in 2012, I would have wasted a whole lot of time going down an analytical dead end. At best, I would have lost several weeks that I could have spent on other parts of my dissertation. At worst, I might have fallen prey to the sunk cost fallacy and tried to shoehorn this source into my dissertation just because I had invested so much time working on it.</p><p>Fast-forward to 2026. Using Gemini, I could transcribe that same source in a few minutes, at a cost of $0.34. <sup>8 </sup>I could then use Claude Code to quickly visualize the data, start looking for patterns, and iterate back and forth depending on what I found. The same analytical dead end ends up costing several hours rather than several weeks. Moreover, even if the macro patterns from this data aren&#8217;t revelatory, agentic coding meant that I could use that data to easily build and deploy something that might end up being useful for other researchers: a genealogist curious how long it took a letter to travel between their relatives, a historian trying to gauge how isolated one of their subjects was from a major city, or a stamp collector trying to estimate the value of a particular postal cover.</p><p>Breaking some of the traditional bottlenecks to transcription and analysis changes the threshold for whether to pursue a historical research question at all. When the opportunity cost of transcribing a source drops to a few minutes and the cost of a candy bar, the challenge is no longer getting the data &#8211; it&#8217;s whether the questions you can ask with that data are interesting enough to pursue.</p><p>And this is where a different kind of expertise comes into play that has nothing to do with writing code or importing shapefiles: <strong>discernment</strong>. One feature of LLMs is that they&#8217;re eager to give you <em>something</em>. When working with Claude Code, it kept presenting me with what it claimed were interesting patterns in the data. If I were a graduate student, I might have excitedly leapt at these suggestions. But part of becoming an experienced researcher is learning to recognize not just what IS worth digging into but what ISN&#8217;T worth pursuing. So when Claude ran a clustering algorithm on cities based on their connectedness and declared &#8220;the typology shifts are particularly interesting,&#8221; I had the experience to say: &#8220;No, they&#8217;re not.&#8221;</p><p>The old bottlenecks were time and technical skill; the new bottlenecks are judgment, discernment, and taste. This is true for <a href="https://www.andrewbenjaminhall.com/Straus_Hall_Claude_Audit.pdf">many</a> <a href="https://doi.org/10.1038/s41586-026-10265-5">different</a> <a href="https://www.anthropic.com/research/vibe-physics">disciplines</a>, but <strong>history seems especially well-positioned to benefit from this shift</strong>. Accessing and transcribing archival sources has always been a major sticking point for historical research, and the ongoing decimation of federal grant funding has made that bottleneck even more acute. Vision Language Models don&#8217;t just make transcription faster or cheaper - they will (hopefully) expand the <em>kinds</em> of sources we can access and work with.</p><p>Meanwhile, coding remains a vanishingly rare skill for historians. For colleagues in the social sciences and hard sciences, agentic coding dramatically speeds up something they could <em>already</em> do. For historians, agentic coding allows us to do things we <em>couldn&#8217;t</em> do before. Without training in statistical thinking or computational methods, there&#8217;s a risk that we start vibe coding our way to questionable results. But agentic coding in history doesn&#8217;t have to mean pumping out a bunch of faulty regression analyses; it can also mean spinning up an interactive web app to explore a set of sources that have been languishing on your computer for the past fourteen years. By breaking the bottlenecks of transcription and coding, these tools free up time and energy for historians to use the skills we actually excel at: close reading, contextual thinking, narrative, and interpretive nuance.</p><p>Originally posted: <a href="https://cblevins.github.io/posts/bottleneck-side-quests/">https://cblevins.github.io/posts/bottleneck-side-quests/</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/subscribe?"><span>Subscribe now</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/agentic-coding-for-humanists?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/agentic-coding-for-humanists?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p></p><div><hr></div><p></p><h3><strong>Footnotes</strong></h3><ol><li><p>For more on VLMs, OCR, and historical transcription, see Jacob Carlson, Tom Bryan, and Melissa Dell, <a href="https://aclanthology.org/2024.acl-long.440/">&#8220;Efficient OCR for Building a Diverse Digital History&#8221;</a> (2024) and Noah Dasanaike&#8217;s working paper, <a href="https://www.dropbox.com/scl/fi/2kx17wvydhr3v8m4u59ek/zeroshot_llms_dasanaike.pdf">&#8220;Pre-Trained Language Models as Zero-Shot Tools for Social Science Research&#8221; (2026)</a>.<a href="https://cblevins.github.io/posts/bottleneck-side-quests/#d-footnote-1">[&#8617;]</a></p></li><li><p>As of March 2026, Gemini seems to be the most capable frontier model for computer vision and historical transcription. Although I didn&#8217;t do a rigorous analysis of its error rates, I used Claude Code and some spot-checking to identify potential mistakes in the data. This led me to fix around 12-15 incorrectly transcribed values (across nearly 8,000 records).<a href="https://cblevins.github.io/posts/bottleneck-side-quests/#d-footnote-2">[&#8617;]</a></p></li><li><p>Take these &#8220;model thoughts&#8221; with a grain of salt; they&#8217;re a somewhat imprecise take on the internal processes unfolding within an LLM as it generates its response.<a href="https://cblevins.github.io/posts/bottleneck-side-quests/#d-footnote-3">[&#8617;]</a></p></li><li><p>Lots of historians are working on this - see, for instance, Mark Humphries on <a href="https://generativehistory.substack.com/p/gemini-3-solves-handwriting-recognition">transcribing handwritten 18th-century storekeeper ledgers</a> or Colin Greenstreet on <a href="https://generativelives.substack.com/p/opening-the-ottoman-archive">transcribing handwritten Ottomon sources</a>.<a href="https://cblevins.github.io/posts/bottleneck-side-quests/#d-footnote-4">[&#8617;]</a></p></li><li><p>As Jason Heppler has recently argued in <a href="https://jasonheppler.org/2026/03/09/vibing-digital-history/">&#8220;Vibing Digital History&#8221;</a>, the technical debt critique of vibe coding that applies to large software systems doesn&#8217;t really hold for small, one-off visualizations like this one.<a href="https://cblevins.github.io/posts/bottleneck-side-quests/#d-footnote-5">[&#8617;]</a></p></li><li><p>I agree with Lincoln Mullen, who notes in <a href="https://lincolnmullen.com/blog/behind-ahead/">&#8220;Behind, ahead&#8221;</a> that technical fluency is still a major advantage when it comes to using these tools. There were several times when my experience coding and working with data helped me write better prompts. However, the gap between a &#8220;digital historian&#8221; and a &#8220;non-digital historian&#8221; has closed substantially and will only continue to do so as these tools get better and better.<a href="https://cblevins.github.io/posts/bottleneck-side-quests/#d-footnote-6">[&#8617;]</a></p></li><li><p>Have at it! The raw data is available at: <a href="https://github.com/cblevins/mail-time/blob/main/data/transit-times-merged.csv">https://github.com/cblevins/mail-time/blob/main/data/transit-times-merged.csv</a>.<a href="https://cblevins.github.io/posts/bottleneck-side-quests/#d-footnote-7">[&#8617;]</a></p></li><li><p>I didn&#8217;t actually pay this cost; I was using Google&#8217;s experimental AI Studio, but it helpfully offers a token usage and cost estimate for prompts if you had been using their API. Transcribing each transit table took between 32,000 - 113,000 tokens, with a range of $0.32 on the low end to $1.07 on the high end.<a href="https://cblevins.github.io/posts/bottleneck-side-quests/#d-footnote-8">[&#8617;]</a></p></li></ol>]]></content:encoded></item><item><title><![CDATA[The Lost Promise of Digital History]]></title><description><![CDATA[And why computational history is now needed more than ever]]></description><link>https://computationalhistory.substack.com/p/the-lost-promise-of-digital-history</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/the-lost-promise-of-digital-history</guid><dc:creator><![CDATA[Jo Guldi]]></dc:creator><pubDate>Mon, 30 Mar 2026 13:29:09 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!RP8M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa70459c3-cb58-45c0-82fa-5613b87c4b9c_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>The Lost Promise of Digital History</strong></p><p>Digital history, by most conventional measures, has failed&#8212;just as the university enters an era defined by artificial intelligence and large-scale data. At the very moment when the transformation of university research seems not only possible but inevitable, the humanities appear, from the outside, to be receding. History in particular risks being seen as increasingly marginal: slow, archival, resistant to scale.</p><p>To understand why, it is worth briefly revisiting what digital history has been&#8212;and why it has taken the form that it has. Digital history, by most conventional measures, has failed. But in its ruins, it holds expertise that may yet prove decisive.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RP8M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa70459c3-cb58-45c0-82fa-5613b87c4b9c_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RP8M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa70459c3-cb58-45c0-82fa-5613b87c4b9c_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!RP8M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa70459c3-cb58-45c0-82fa-5613b87c4b9c_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!RP8M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa70459c3-cb58-45c0-82fa-5613b87c4b9c_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!RP8M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa70459c3-cb58-45c0-82fa-5613b87c4b9c_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RP8M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa70459c3-cb58-45c0-82fa-5613b87c4b9c_1536x1024.png" width="484" height="322.7774725274725" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a70459c3-cb58-45c0-82fa-5613b87c4b9c_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:484,&quot;bytes&quot;:2925108,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/192504611?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa70459c3-cb58-45c0-82fa-5613b87c4b9c_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RP8M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa70459c3-cb58-45c0-82fa-5613b87c4b9c_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!RP8M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa70459c3-cb58-45c0-82fa-5613b87c4b9c_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!RP8M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa70459c3-cb58-45c0-82fa-5613b87c4b9c_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!RP8M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa70459c3-cb58-45c0-82fa-5613b87c4b9c_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p><em>A Short History of Digital History</em></p><p>For at least half a century, historians have been talking about &#8220;digital history&#8221; as part of the &#8220;digital humanities,&#8221; a suite of approaches to thinking about data, the internet, and analysis. Twenty to twenty-five years ago, this produced DH centers at many North American universities. These centers were funded to run short, intensive workshops, to expose faculty and students to computational methods, and to build tools and digital collections.</p><p>Many of the digital humanists who participated in these initiatives built museum-style collections of objects online. These collections were often keyword searchable and publicly accessible, but they were not consistently analyzed in ways that produced new forms of historical argument. They were collections&#8212;valuable, often transformative in terms of access&#8212;but still collections rather than cumulative analytical programs.</p><p>Over time, American digital history became closely associated with this model of public-facing collection. Its audience was often K&#8211;12 classrooms, and its outputs were frequently websites, teaching materials, and conference presentations rather than new methods articles or large-scale synthetic works. These interventions mattered. They helped normalize longer temporal frames and more inclusive archives within the discipline.</p><p>The orientation had real value. It expanded access to archives, diversified the materials available for teaching, and introduced new audiences to historical sources. But it also helps explain why digital history, as a research program, did not scale. And that failure (or apparent failure) now matters in a new way. For the questions that digital history did not fully answer are precisely the questions that define the current moment in artificial intelligence.</p><p>As a research program across the university, digital history did not clearly achieve cumulative transformation. It did not visibly reorganize the discipline of History, and one funded project did not consistently lead to another in a compounding way. There was no sustained chain of funding&#8212;no first million leading to a second or third grant, no steady expansion of infrastructure, and no clear institutional scaling of the field.</p><p>Part of this outcome reflects the structure of the humanities, where pedagogy continues to prioritize small seminars and one-to-one mentorship. At the same time, many of the most innovative investments were made at non-elite but highly innovative institutions (including George Mason, SMU, Emory, Nebraska, and Clemson), where there was comparatively limited access to large graduate training pipelines and rarely (if ever) a pool of university-wide postdocs or funding for visiting researchers of the kind that characterizes elite institutions.</p><p>These programs trained a number of outstanding students, many of whom went on to build creative and impactful research agendas. However, those students were not consistently placed into the most influential positions in the discipline, and the methods they developed did not consistently propagate into its institutional core.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/subscribe?"><span>Subscribe now</span></a></p><p></p><p><em>The Underdevelopment of History Departments</em></p><p>Meanwhile, in a reinforcing dynamic, the most prestigious departments&#8212;those responsible for a large share of hiring and agenda-setting&#8212;continued to underinvest in methods. Methods training was often confined to occasional seminars, adjunct instruction, or co-taught courses rather than embedded into a structured curriculum with introductory sequences, advanced coursework, and graduate training pathways. In many cases, departments treated methodological investment as a trade-off against geographic or thematic coverage, rather than as an expansion of the field&#8217;s capacity.</p><p>Funding structures reinforced this pattern. The National Science Foundation did not develop a sustained funding stream for historical computation, and the National Endowment for the Humanities primarily offered shorter-term grants that were not designed to build large-scale, durable infrastructure. There were few fellowships or prizes dedicated to supporting scholars developing new computational methods in history.</p><p>As a result, a number of pioneering works&#8212;such as the first monographs to use GIS or large-scale text mining in historical research&#8212;were recognized but not institutionally consolidated. Their authors often remained in precarious roles, and their contributions did not always translate into stable research programs or widely adopted methods.</p><p>Individual scholars did tremendous work with digital methods to reframe how we see the geography and language of the Enlightenment, to refine our tools for studying place-based relations and journeys like the Grand Tour, and to show how network analysis might give a scholar new glimpses into the dynamics of a community.</p><p>Yet by and large, these advances did not fundamentally restructure the discipline as a whole. Many leading universities continued to treat digital methods as peripheral rather than central, even as broader institutional priorities shifted toward data science, environmental research, and artificial intelligence.</p><p><em>Europe as a Counterexample</em></p><p>In contrast, Europe pursued a different model. At the Luxembourg Centre for Contemporary and Digital History, more than 100 faculty and staff support large-scale research programs, including embedded developers and extensive digitization infrastructure. Projects such as <a href="https://impresso-project.ch/">Impresso</a> enable cross-lingual analysis of newspapers at scale. Institutes like Huygens Institute employ computational specialists to support digital editions and natural language processing pipelines, while networks such as DARIAH and CLARIN establish shared standards across countries.</p><p>These investments have produced concrete outcomes. Scholars such as Ruben Ros, trained in part within the lab of Pasi Ihalainen, have gone on to produce major dissertations&#8212;such as his work at Leiden on parliamentary speech and the history of bureaucracy&#8212;that demonstrate how large-scale textual comparison can generate new historical arguments. Researchers trained by <a href="https://livingwithmachines.ac.uk/">Living With Machines</a> similarly moved into faculty positions, extending these methods into new institutional contexts.</p><p>Indeed, North America also has expertise to offer. However, it is dispersed across institutions such as Saskatchewan, SMU, George Mason, Clemson, and Waterloo, rather than integrated into the central infrastructures of major research universities.</p><p>The difference is not simply scale but continuity. There is digital history expertise in North America, just as there is DH expertise in Europe. But in North America, the necessary steps to institutionalize DH expertise, to ensure the training of the next generation of scholars, to support an ongoing dialogue with Computer Science, have never been taken. The foundations have not been laid; the training programs are generally missing or absorbed into a wider &#8220;digital humanities&#8221; program that neglects the specific skills of history majors. There are simply no North American programs where the scholar can be sent for a specifically historical training in digital methods with the intent of entering the profession or moving into industry to share historical expertise. We need systems train scholars, produce methods, and sustain research programs across multiple funding cycles.</p><p><strong>The time has come for a comparable shift. Not because digital history failed on its own terms, but because the problems it left unresolved&#8212;how to evaluate documents at scale, how to build representative archives, how to move between collection and analysis&#8212;have now become central to the most powerful technologies of our time.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/the-lost-promise-of-digital-history?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/the-lost-promise-of-digital-history?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><p>Check out some of the work of Living With Machines: </p><div id="youtube2-F7QT8QVs3Ic" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;F7QT8QVs3Ic&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/F7QT8QVs3Ic?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div>]]></content:encoded></item><item><title><![CDATA[On the Delegation of Tasks]]></title><description><![CDATA[Or, how to automate pulling documents and data from university and government databases]]></description><link>https://computationalhistory.substack.com/p/on-the-delegation-of-tasks</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/on-the-delegation-of-tasks</guid><dc:creator><![CDATA[Louis Hyman]]></dc:creator><pubDate>Thu, 26 Mar 2026 13:08:43 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!6BGH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3681aa52-8c51-4c62-af75-9bba4d93f8dc_3176x1730.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Like many of you, I have a long-overdue manuscript. We don&#8217;t need to discuss the ins and outs of why I have burned through many self-imposed and externally-imposed deadlines, that is beside the point. Like you, I have my reasons. </p><p>This morning, I set myself the task of using Codex as a research assistant to push me toward completion.</p><p>I have nearly 200 pages of semi-finished manuscript on the history of e-commerce, yet within those pages are numerous gaps in evidence.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!To4G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ee2516d-55ae-4619-884e-9ad07cee94b2_2840x592.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!To4G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ee2516d-55ae-4619-884e-9ad07cee94b2_2840x592.png 424w, https://substackcdn.com/image/fetch/$s_!To4G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ee2516d-55ae-4619-884e-9ad07cee94b2_2840x592.png 848w, https://substackcdn.com/image/fetch/$s_!To4G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ee2516d-55ae-4619-884e-9ad07cee94b2_2840x592.png 1272w, https://substackcdn.com/image/fetch/$s_!To4G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ee2516d-55ae-4619-884e-9ad07cee94b2_2840x592.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!To4G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ee2516d-55ae-4619-884e-9ad07cee94b2_2840x592.png" width="1456" height="304" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8ee2516d-55ae-4619-884e-9ad07cee94b2_2840x592.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:304,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:218916,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/192120113?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ee2516d-55ae-4619-884e-9ad07cee94b2_2840x592.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!To4G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ee2516d-55ae-4619-884e-9ad07cee94b2_2840x592.png 424w, https://substackcdn.com/image/fetch/$s_!To4G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ee2516d-55ae-4619-884e-9ad07cee94b2_2840x592.png 848w, https://substackcdn.com/image/fetch/$s_!To4G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ee2516d-55ae-4619-884e-9ad07cee94b2_2840x592.png 1272w, https://substackcdn.com/image/fetch/$s_!To4G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ee2516d-55ae-4619-884e-9ad07cee94b2_2840x592.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p></p><p>The manuscript it littered with notes like &#8220;find something here about the Teamsters strike of 1997 and the contract that came out of it.&#8221; I am sure you know what I mean. Some of these gaps are relatively straightforward archival problems that can be solved with a database search. Less obvious are the places where I am making an argument and I should have supporting numbers but don&#8217;t&#8212;like the number of Teamster members in 1997 relative to other union members and the total workforce. Yet whether these searches are easy or hard, they still take time. For assistance, I turned to AI.</p><p>I asked the AI to read my manuscript for both evidence I knew was missing, like the Teamsters contract, and evidence I did not know was missing, like the number of Teamsters. I asked it to produce a table of such gaps in markdown, a format that is easy for both computers and humans to read.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;markdown&quot;,&quot;nodeId&quot;:&quot;9d9e7272-1478-451e-8273-72573fda2fbd&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-markdown">
Read my manuscript and produce a chapter-by-chapter gap analysis in Markdown.

Requirements
- First, read the full manuscript and identify each chapter&#8217;s core argument before listing any gaps.
- Do not edit existing files. Create new files only.
- Save output in: &lt;absolute output folder&gt;
- File name: &lt;YYYY-MM-DD-manuscript-gaps.md&gt;

Inputs
- Manuscript path: &lt;absolute path to manuscript&gt;
- Optional chapter map/TOC path: &lt;absolute path&gt;
- Optional existing evidence index: &lt;absolute path&gt;

Output format (required)
For each chapter:
1. Chapter title
2. Argument summary (3-6 bullets)
3. Evidentiary gaps (numbered, specific claims needing support)
4. Gap IDs in format C&lt;chapter&gt;-G&lt;n&gt;
5. Why each gap matters (causal/quantitative/comparative/historical)
6. Evidence needed (data type, geography, timeframe, source type)
7. Priority (High/Medium/Low)

At the end include:
- Master gap table (gap_id, chapter, claim, evidence needed, priority)
- Cross-chapter themes
- &#8220;Most critical 10 gaps&#8221; section

Quality bar
- Be concrete: identify missing numbers, missing citations, weak causal links, overgeneralizations, and untested counterarguments.
- Distinguish &#8220;no evidence&#8221; vs &#8220;insufficient evidence&#8221; vs &#8220;outdated evidence.&#8221;
- Use manuscript language when possible so I can map gaps back to text quickly.
</code></pre></div><p>Then came the tricky part.</p><p>For the last month, I have been experimenting with AI agents. Agents carry out tasks, like moving files around, running Python code, or anything else computers do. You can give them a task and they can use other programs to accomplish their goals.  Their key feature: they run autonomously until they are &#8220;finished&#8221; (e.g., completing a task, running for ten hours, looping 37 times, or however you define finished). Think of them as henchmen. </p><p>I use ChatGPT&#8217;s Codex, but you could just as easily use Claude&#8217;s Cowork. These agents run on your computer, which allows them to read and write files directly. Scary? To be sure. You should <em>absolutely</em> use a backup like Google Drive or GitHub.</p><p>While I have found many uses for these agents, I wanted to see if I could use them to access my university&#8217;s database system. I hate sitting there for hours running searches, clicking through multiple screens, and downloading PDFs. It takes time that would be better spent reading those PDFs, or actually writing my book.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/subscribe?"><span>Subscribe now</span></a></p><p>AI agents can do this for you. And you can run many, many of them at a time.</p><p>I will give an example using Codex for ChatGPT, which I prefer because it does not run out of tokens nearly as quickly as Claude.</p><p>First, download and install Codex. Then start a new project. ChatGPT will walk you through the setup if needed. You will need to grant it full access, which again feels a bit unsettling, but it is still restricted to your project folder. If you want, you can create a custom set of rules. Just ask Codex to explain how to do that. </p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SkGi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2eebf1-1a7a-41ac-9d0c-059954673238_936x182.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SkGi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2eebf1-1a7a-41ac-9d0c-059954673238_936x182.png 424w, https://substackcdn.com/image/fetch/$s_!SkGi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2eebf1-1a7a-41ac-9d0c-059954673238_936x182.png 848w, https://substackcdn.com/image/fetch/$s_!SkGi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2eebf1-1a7a-41ac-9d0c-059954673238_936x182.png 1272w, https://substackcdn.com/image/fetch/$s_!SkGi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2eebf1-1a7a-41ac-9d0c-059954673238_936x182.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SkGi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2eebf1-1a7a-41ac-9d0c-059954673238_936x182.png" width="936" height="182" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dd2eebf1-1a7a-41ac-9d0c-059954673238_936x182.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:182,&quot;width&quot;:936,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22513,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/192120113?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2eebf1-1a7a-41ac-9d0c-059954673238_936x182.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SkGi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2eebf1-1a7a-41ac-9d0c-059954673238_936x182.png 424w, https://substackcdn.com/image/fetch/$s_!SkGi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2eebf1-1a7a-41ac-9d0c-059954673238_936x182.png 848w, https://substackcdn.com/image/fetch/$s_!SkGi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2eebf1-1a7a-41ac-9d0c-059954673238_936x182.png 1272w, https://substackcdn.com/image/fetch/$s_!SkGi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2eebf1-1a7a-41ac-9d0c-059954673238_936x182.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p></p><p>You will also need to install a &#8220;skill&#8221; that allows Codex to use your browser. The one you want is called Playwright Interactive.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3qqC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6ff2e76-7c20-4575-8910-24b5520cfbbd_592x286.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3qqC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6ff2e76-7c20-4575-8910-24b5520cfbbd_592x286.png 424w, https://substackcdn.com/image/fetch/$s_!3qqC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6ff2e76-7c20-4575-8910-24b5520cfbbd_592x286.png 848w, https://substackcdn.com/image/fetch/$s_!3qqC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6ff2e76-7c20-4575-8910-24b5520cfbbd_592x286.png 1272w, https://substackcdn.com/image/fetch/$s_!3qqC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6ff2e76-7c20-4575-8910-24b5520cfbbd_592x286.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3qqC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6ff2e76-7c20-4575-8910-24b5520cfbbd_592x286.png" width="248" height="119.8108108108108" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f6ff2e76-7c20-4575-8910-24b5520cfbbd_592x286.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:286,&quot;width&quot;:592,&quot;resizeWidth&quot;:248,&quot;bytes&quot;:19711,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/192120113?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6ff2e76-7c20-4575-8910-24b5520cfbbd_592x286.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3qqC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6ff2e76-7c20-4575-8910-24b5520cfbbd_592x286.png 424w, https://substackcdn.com/image/fetch/$s_!3qqC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6ff2e76-7c20-4575-8910-24b5520cfbbd_592x286.png 848w, https://substackcdn.com/image/fetch/$s_!3qqC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6ff2e76-7c20-4575-8910-24b5520cfbbd_592x286.png 1272w, https://substackcdn.com/image/fetch/$s_!3qqC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6ff2e76-7c20-4575-8910-24b5520cfbbd_592x286.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PWuS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07ed6959-5dbc-4a42-97c9-7f6290ecfb08_936x104.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PWuS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07ed6959-5dbc-4a42-97c9-7f6290ecfb08_936x104.png 424w, https://substackcdn.com/image/fetch/$s_!PWuS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07ed6959-5dbc-4a42-97c9-7f6290ecfb08_936x104.png 848w, https://substackcdn.com/image/fetch/$s_!PWuS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07ed6959-5dbc-4a42-97c9-7f6290ecfb08_936x104.png 1272w, https://substackcdn.com/image/fetch/$s_!PWuS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07ed6959-5dbc-4a42-97c9-7f6290ecfb08_936x104.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PWuS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07ed6959-5dbc-4a42-97c9-7f6290ecfb08_936x104.png" width="618" height="68.66666666666667" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/07ed6959-5dbc-4a42-97c9-7f6290ecfb08_936x104.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:104,&quot;width&quot;:936,&quot;resizeWidth&quot;:618,&quot;bytes&quot;:20633,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/192120113?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07ed6959-5dbc-4a42-97c9-7f6290ecfb08_936x104.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PWuS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07ed6959-5dbc-4a42-97c9-7f6290ecfb08_936x104.png 424w, https://substackcdn.com/image/fetch/$s_!PWuS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07ed6959-5dbc-4a42-97c9-7f6290ecfb08_936x104.png 848w, https://substackcdn.com/image/fetch/$s_!PWuS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07ed6959-5dbc-4a42-97c9-7f6290ecfb08_936x104.png 1272w, https://substackcdn.com/image/fetch/$s_!PWuS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07ed6959-5dbc-4a42-97c9-7f6290ecfb08_936x104.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p></p><p>Once you have done that, you are ready to go. </p><p>First tell Codex which database you want to use. In my case, I used EBSCOhost, which has basically everything. I instructed Codex to read the markdown file of gaps in the manuscript that needed more primary documents. I had to log in to the database, but after that I let it run. While I worked on other things, it pulled and organized hundreds of sources. It created a table that linked each gap to a relevant PDF. Was it perfect? No. Was it working while I did something else? Yes. Did it perform as well as an undergraduate research assistant? Definitely.</p><p>For data gaps, I followed a similar process but did not need a university database. Many government websites provide relevant economic data, often accessible through <a href="https://en.wikipedia.org/wiki/API?wprov=sfla1">APIs</a> that allow direct download. You may need to <a href="https://apps.bea.gov/api/signup/">register for an API key</a>, but for most government sites this is easy and free. I asked Codex to download the relevant time series based on the gaps in the manuscript. There was some back and forth, but the prompt I ended up with worked reliably.  You will need to adapt it to your own project. </p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;79657aa1-1878-4a5d-8122-c7106ef2caaf&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">Read /Users/louishyman/Library/CloudStorage/GoogleDrive-lhyman@gmail.com/My Drive/2024-2025/E-Commerce/codex/2026-manuscript-national-metrics-insertion-map.md and pull all labor-related gap data into the existing national metrics pipeline.

Requirements:
1) Focus on labor gaps across Introduction, Chapter 1, Chapter 4, Chapter 4b, Chapter 5, Chapter 9.
2) For every labor time series, include explicit context comparators (denominator, peer sector, or cross-country comparator), not standalone numbers.
3) Prioritize free/public APIs first: BLS v2 (registered), FRED, BEA, Census MRTS, ILOSTAT, OLMS, World Bank, USITC where relevant.
4) Extend historical coverage to 1970 whenever available; otherwise use earliest available and record coverage limits.
5) Reuse stored data and avoid duplicate pulls by default. Only pull missing labor series not already in the store.
6) Update outputs:
   - codex/national_metrics_pipeline/national_metrics_api_pull.csv
   - codex/national_metrics_pipeline/metric_coverage_summary.csv
   - codex/national_metrics_pipeline/context_comparison_audit.csv
   - codex/national_metrics_pipeline/national_metrics_graph_report.md
   - codex/national_metrics_pipeline/charts/*
7) Add/refresh charts specifically for:
   - sectoral wages and employment shares
   - sectoral GDP shares for labor context
   - Teamsters + union/non-union denominators
   - productivity vs compensation
8) At the end, report:
   - which labor gaps are now covered,
   - remaining unresolved labor gaps,
   - exact series IDs/endpoints used per gap.
9) Do not ask me questions; run autonomously and make reasonable assumptions.
</code></pre></div><p>I even had Codex create rough and ready graphs for each data set along the way. These graphs are not production ready (or even classroom ready), but certainly better than nothing. As always, I will double check all these numbers before I make any meaningful inferences from them.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6BGH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3681aa52-8c51-4c62-af75-9bba4d93f8dc_3176x1730.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6BGH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3681aa52-8c51-4c62-af75-9bba4d93f8dc_3176x1730.png 424w, https://substackcdn.com/image/fetch/$s_!6BGH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3681aa52-8c51-4c62-af75-9bba4d93f8dc_3176x1730.png 848w, https://substackcdn.com/image/fetch/$s_!6BGH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3681aa52-8c51-4c62-af75-9bba4d93f8dc_3176x1730.png 1272w, https://substackcdn.com/image/fetch/$s_!6BGH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3681aa52-8c51-4c62-af75-9bba4d93f8dc_3176x1730.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6BGH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3681aa52-8c51-4c62-af75-9bba4d93f8dc_3176x1730.png" width="1456" height="793" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3681aa52-8c51-4c62-af75-9bba4d93f8dc_3176x1730.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:793,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:351092,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/192120113?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3681aa52-8c51-4c62-af75-9bba4d93f8dc_3176x1730.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6BGH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3681aa52-8c51-4c62-af75-9bba4d93f8dc_3176x1730.png 424w, https://substackcdn.com/image/fetch/$s_!6BGH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3681aa52-8c51-4c62-af75-9bba4d93f8dc_3176x1730.png 848w, https://substackcdn.com/image/fetch/$s_!6BGH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3681aa52-8c51-4c62-af75-9bba4d93f8dc_3176x1730.png 1272w, https://substackcdn.com/image/fetch/$s_!6BGH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3681aa52-8c51-4c62-af75-9bba4d93f8dc_3176x1730.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>The easy work is now done, so I can spend my time on the hard work that AI can't do. Between these two approaches, I have assembled a much larger body of documents and data for finishing the book. I still need to find the harder-to-locate primary sources, but now I can spend my time on the hard research problems, not the easy ones. Hours I might have spent pulling PDFs and CSV files I can now spend on thinking and writing, which is much more rewarding than clicking buttons on websites.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/subscribe?"><span>Subscribe now</span></a></p><p></p><div><hr></div><p></p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/on-the-delegation-of-tasks?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading Computational History! This post is public so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/p/on-the-delegation-of-tasks?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/p/on-the-delegation-of-tasks?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><p></p>]]></content:encoded></item><item><title><![CDATA[Reading for Structure]]></title><description><![CDATA[It handled the reading; we handled the thinking.]]></description><link>https://computationalhistory.substack.com/p/reading-for-structure</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/reading-for-structure</guid><dc:creator><![CDATA[Loren Moulds]]></dc:creator><pubDate>Mon, 23 Mar 2026 13:15:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!m1In!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8b84db-aeb8-4ae6-9d5d-c86bacc5fdc6_3564x5275.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you wanted to transcribe a handwritten register, you had to transcribe it. Not skim it or glance through it, but sit with the document and work through it line by line. When the handwriting was difficult, you learned the hand. When the spelling was inconsistent, you learned the habits of the writer. If the document ran to hundreds of pages, you prepared yourself for weeks of slow work.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!m1In!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8b84db-aeb8-4ae6-9d5d-c86bacc5fdc6_3564x5275.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!m1In!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8b84db-aeb8-4ae6-9d5d-c86bacc5fdc6_3564x5275.jpeg 424w, https://substackcdn.com/image/fetch/$s_!m1In!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8b84db-aeb8-4ae6-9d5d-c86bacc5fdc6_3564x5275.jpeg 848w, https://substackcdn.com/image/fetch/$s_!m1In!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8b84db-aeb8-4ae6-9d5d-c86bacc5fdc6_3564x5275.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!m1In!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8b84db-aeb8-4ae6-9d5d-c86bacc5fdc6_3564x5275.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!m1In!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8b84db-aeb8-4ae6-9d5d-c86bacc5fdc6_3564x5275.jpeg" width="1456" height="2155" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1e8b84db-aeb8-4ae6-9d5d-c86bacc5fdc6_3564x5275.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2155,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:11291516,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/191189852?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8b84db-aeb8-4ae6-9d5d-c86bacc5fdc6_3564x5275.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!m1In!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8b84db-aeb8-4ae6-9d5d-c86bacc5fdc6_3564x5275.jpeg 424w, https://substackcdn.com/image/fetch/$s_!m1In!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8b84db-aeb8-4ae6-9d5d-c86bacc5fdc6_3564x5275.jpeg 848w, https://substackcdn.com/image/fetch/$s_!m1In!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8b84db-aeb8-4ae6-9d5d-c86bacc5fdc6_3564x5275.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!m1In!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e8b84db-aeb8-4ae6-9d5d-c86bacc5fdc6_3564x5275.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Computational History! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Historians have always done this kind of labor, and so have graduate students, research assistants, and archivists. The work can be tedious, but it produces searchable texts, datasets, and indices that allow us to see patterns that would otherwise remain hidden. At the same time, this labor quietly shapes the boundaries of historical inquiry. The sheer effort required to process a document often determines which sources historians pursue and which they leave aside.</p><p>I encountered a document that illustrates this problem while working with historian Philip Herrington. During research at the Henry Ford Archives, Philip came across a guest register from the Hermitage Plantation, just outside Savannah, Georgia. The book records visitors between 1933 and 1935 and runs to 159 pages. Each line contains a date, a visitor&#8217;s name, and a hometown, producing thousands of entries written by hundreds of different hands.</p><p>The register is historically valuable because it offers a rare window into plantation tourism in the early twentieth century. Visitors arrived from across the United States, particularly from northern cities, and their entries reveal patterns of travel and mobility in the Jim Crow era, when plantation landscapes were increasingly repurposed as heritage sites. But the document also presents the kind of practical difficulty historians know well. The handwriting varies widely, some entries are careful while others are hurried, and ditto marks appear where locations repeat. Ink colors change across the pages, and occasionally several people sign on the same line.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QwUy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8e730c9-9d27-490c-af7c-77589c7093de_3907x2685.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QwUy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8e730c9-9d27-490c-af7c-77589c7093de_3907x2685.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QwUy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8e730c9-9d27-490c-af7c-77589c7093de_3907x2685.jpeg 848w, https://substackcdn.com/image/fetch/$s_!QwUy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8e730c9-9d27-490c-af7c-77589c7093de_3907x2685.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QwUy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8e730c9-9d27-490c-af7c-77589c7093de_3907x2685.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QwUy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8e730c9-9d27-490c-af7c-77589c7093de_3907x2685.jpeg" width="1456" height="1001" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e8e730c9-9d27-490c-af7c-77589c7093de_3907x2685.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1001,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1890712,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/191189852?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8e730c9-9d27-490c-af7c-77589c7093de_3907x2685.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QwUy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8e730c9-9d27-490c-af7c-77589c7093de_3907x2685.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QwUy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8e730c9-9d27-490c-af7c-77589c7093de_3907x2685.jpeg 848w, https://substackcdn.com/image/fetch/$s_!QwUy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8e730c9-9d27-490c-af7c-77589c7093de_3907x2685.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QwUy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8e730c9-9d27-490c-af7c-77589c7093de_3907x2685.jpeg 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>To answer even simple questions about the register&#8212;where visitors were coming from, whether particular cities appeared repeatedly, how visitation changed over time&#8212;the information had to be structured as data. The traditional approach would be clear enough: transcribe the book, enter each line into a spreadsheet, verify the entries, and standardize the locations. Many historians have undertaken precisely this kind of work. Even at a steady pace, though, the process could easily consume weeks or months of labor, which in practice means that many such documents are never fully transcribed nor eventually made accessible to others.</p><p>Artificial intelligence tools offer a different approach, though not the one that gets the most attention. Writers like to talk about AI &#8220;transforming&#8221; historical research, usually meaning something dramatic about automated interpretation. That overstates the case; an LLM does not interpret the past. What it can do is assist with the mechanical work of recognizing and structuring text&#8212;provided the historian teaches it how to behave.</p><p>LLMs are not designed to read handwritten guest registers. When I first uploaded pages from the Hermitage register without detailed instructions, the model invented names that weren&#8217;t on the page and merged separate entries into single lines. To obtain reliable results, the historian has to provide something the model lacks: instructions grounded in how we actually read these documents.</p><p>In practice this means defining the task with the kind of specificity we would use when training a new research assistant. Each page of the register was scanned and uploaded to a model capable of processing images. Rather than asking the system simply to &#8220;transcribe the page,&#8221; the prompt described the document and its structure in detail. The model was told that it was working with a handwritten guest register and that each row contained a date, a visitor name, and a location. It was instructed not to guess when handwriting was unclear and to flag uncertain entries instead. The output format was specified in advance: a table with the columns Date, Visitor, and Location.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qxRU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a35e519-4763-4d1c-91b5-bbd34ee372ae_4872x4967.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qxRU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a35e519-4763-4d1c-91b5-bbd34ee372ae_4872x4967.png 424w, https://substackcdn.com/image/fetch/$s_!qxRU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a35e519-4763-4d1c-91b5-bbd34ee372ae_4872x4967.png 848w, https://substackcdn.com/image/fetch/$s_!qxRU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a35e519-4763-4d1c-91b5-bbd34ee372ae_4872x4967.png 1272w, https://substackcdn.com/image/fetch/$s_!qxRU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a35e519-4763-4d1c-91b5-bbd34ee372ae_4872x4967.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qxRU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a35e519-4763-4d1c-91b5-bbd34ee372ae_4872x4967.png" width="1456" height="1484" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1a35e519-4763-4d1c-91b5-bbd34ee372ae_4872x4967.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1484,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5535003,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/191189852?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a35e519-4763-4d1c-91b5-bbd34ee372ae_4872x4967.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qxRU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a35e519-4763-4d1c-91b5-bbd34ee372ae_4872x4967.png 424w, https://substackcdn.com/image/fetch/$s_!qxRU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a35e519-4763-4d1c-91b5-bbd34ee372ae_4872x4967.png 848w, https://substackcdn.com/image/fetch/$s_!qxRU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a35e519-4763-4d1c-91b5-bbd34ee372ae_4872x4967.png 1272w, https://substackcdn.com/image/fetch/$s_!qxRU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a35e519-4763-4d1c-91b5-bbd34ee372ae_4872x4967.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>Historians will recognize the logic immediately. We explain what kind of document someone is looking at, identify patterns they should expect to see, warn them about ambiguities, and show them the format in which the results should be recorded. The difference is that the &#8220;someone&#8221; here processes a page in seconds rather than minutes.</p><p>Once those instructions were refined, the model began producing usable output. Each page returned tab-delimited text that could be copied into a transcription platform and reviewed by a human editor. The transcription was not perfect: names were occasionally misread and locations sometimes required correction. But starting from a rough transcription rather than a blank spreadsheet changed the nature of the work. Instead of entering thousands of lines manually, we reviewed and corrected what was already there. The process had shifted from creation to verification, and what might have taken months was compressed considerably.</p><p>This leads to a methodological point that historians may initially find uncomfortable. For many research questions, &#8220;good enough&#8221; data is enough. Historians are trained to value fidelity in transcription, and rightly so. Yet the level of precision required always depends on the question being asked. If the goal is a diplomatic transcription of a manuscript, every mark matters. But if the goal is to analyze visitor origins, travel frequency, and geographic networks, then a structured transcription that is mostly correct can be sufficient, especially when the alternative is no transcription at all.</p><p>What keeps this honest is that the historian does the interpretive work. Once the Hermitage register had been transcribed and cleaned, the results became immediately useful. Entries could be sorted by city, mapped geographically, and examined for clusters of visitors from particular regions. Patterns that had been buried in the handwriting of the original ledger emerged clearly once the data had been structured. None of that analysis was performed by the model. It handled the reading; we handled the thinking.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!axa0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa47d6d59-54ed-4c2e-be28-0d3bf0c8f300_6000x4247.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!axa0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa47d6d59-54ed-4c2e-be28-0d3bf0c8f300_6000x4247.png 424w, https://substackcdn.com/image/fetch/$s_!axa0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa47d6d59-54ed-4c2e-be28-0d3bf0c8f300_6000x4247.png 848w, https://substackcdn.com/image/fetch/$s_!axa0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa47d6d59-54ed-4c2e-be28-0d3bf0c8f300_6000x4247.png 1272w, https://substackcdn.com/image/fetch/$s_!axa0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa47d6d59-54ed-4c2e-be28-0d3bf0c8f300_6000x4247.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!axa0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa47d6d59-54ed-4c2e-be28-0d3bf0c8f300_6000x4247.png" width="1456" height="1031" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a47d6d59-54ed-4c2e-be28-0d3bf0c8f300_6000x4247.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1031,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3948505,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/191189852?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa47d6d59-54ed-4c2e-be28-0d3bf0c8f300_6000x4247.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!axa0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa47d6d59-54ed-4c2e-be28-0d3bf0c8f300_6000x4247.png 424w, https://substackcdn.com/image/fetch/$s_!axa0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa47d6d59-54ed-4c2e-be28-0d3bf0c8f300_6000x4247.png 848w, https://substackcdn.com/image/fetch/$s_!axa0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa47d6d59-54ed-4c2e-be28-0d3bf0c8f300_6000x4247.png 1272w, https://substackcdn.com/image/fetch/$s_!axa0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa47d6d59-54ed-4c2e-be28-0d3bf0c8f300_6000x4247.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>Seen in this light, tools like these do not promise automated history. They promise a reallocation of effort. Tasks that once required weeks of mechanical transcription can now be accelerated, leaving more room for the questions that brought us to the archive in the first place. For the Hermitage register, those questions are just beginning&#8212;about who visited, why they came, and what a guest book from a plantation-turned-tourist-site can tell us about how Americans in the 1930s chose to encounter the history of slavery.</p><p>Related article: <a href="%22">https://doi.org/10.1080/15420353.2026.2614780</a></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Computational History! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[On the Value of Shallow Reading]]></title><description><![CDATA[Or, the AI as Intern]]></description><link>https://computationalhistory.substack.com/p/on-the-value-of-shallow-reading</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/on-the-value-of-shallow-reading</guid><dc:creator><![CDATA[Louis Hyman]]></dc:creator><pubDate>Thu, 19 Mar 2026 13:15:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!QNV4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd28b428b-1b6e-4c0d-a277-a8a0a33a5517_1586x2129.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The hardest part of using AI is knowing what is possible. With agentic coding, the &#8220;technical&#8221; stuff is not as crucial as it used to be. The hardest part is imagining what can be done. For humanists with less experience with computers, knowing what can and can&#8217;t be done is the greatest barrier. We can now make our own tools to solve our own problems.</p><p>One task that the AI can do is shallow reading. Think of the AI as an enthusiastic, but somewhat unlearned, intern. The intern can read the text but not the subtext, so meaning-making is not possible in the way it is for a fully-trained historian. The key point is that AI is not omniscient, but it is fast. And it is cheap. It will not be perfect, nor insightful, but it can still be extremely helpful.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QNV4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd28b428b-1b6e-4c0d-a277-a8a0a33a5517_1586x2129.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QNV4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd28b428b-1b6e-4c0d-a277-a8a0a33a5517_1586x2129.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QNV4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd28b428b-1b6e-4c0d-a277-a8a0a33a5517_1586x2129.jpeg 848w, https://substackcdn.com/image/fetch/$s_!QNV4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd28b428b-1b6e-4c0d-a277-a8a0a33a5517_1586x2129.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QNV4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd28b428b-1b6e-4c0d-a277-a8a0a33a5517_1586x2129.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QNV4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd28b428b-1b6e-4c0d-a277-a8a0a33a5517_1586x2129.jpeg" width="393" height="527.4189560439561" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d28b428b-1b6e-4c0d-a277-a8a0a33a5517_1586x2129.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1954,&quot;width&quot;:1456,&quot;resizeWidth&quot;:393,&quot;bytes&quot;:422372,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/191289392?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd28b428b-1b6e-4c0d-a277-a8a0a33a5517_1586x2129.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QNV4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd28b428b-1b6e-4c0d-a277-a8a0a33a5517_1586x2129.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QNV4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd28b428b-1b6e-4c0d-a277-a8a0a33a5517_1586x2129.jpeg 848w, https://substackcdn.com/image/fetch/$s_!QNV4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd28b428b-1b6e-4c0d-a277-a8a0a33a5517_1586x2129.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QNV4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd28b428b-1b6e-4c0d-a277-a8a0a33a5517_1586x2129.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.archives.gov/research/native-americans/central-classified-files-index">Excerpt from File 45365-1928-810-General Services (National Archives Identifier 300324)</a></figcaption></figure></div><p>First, the intern can transcribe nearly any text. Yes, even handwriting. Yes, even some bad handwriting. Yes, even Latin in Carolingian script. Most importantly, the intern can usefully categorize the text. Even if the intern doesn&#8217;t know who the Marquis de Fancyhat is, it can recognize that this is a proper name, likely a person, possibly French. In computer-speak, this is called &#8220;Named Entity Recognition,&#8221; which identifies names, places, numbers, organizations, and so on.</p><p>Second, the intern can reasonably categorize information on a page. If you ask it to identify property and track property values and addresses, it can do that. If you ask it to geolocate an address, it can do that, though it may be confused by historical differences. Crucially, you can simple say, &#8220;Put the key information on this page into reasonable categories.&#8221; If it encounters a medical injury, it can place it in a &#8220;medical&#8221; bucket. If it encounters a legal case, it can place it in a &#8220;law&#8221; bucket. Will these categories be perfect? No. Will they be approximately as good as a sophomore reading random documents? Yes. Is that better than nothing? Definitely.</p><p>The secret sauce in AI-assisted reading is a file format called JSON (pronounced &#8220;JASON,&#8221; like the Argonauts).</p><p>JSON is a &#8220;human-readable data format,&#8221; which means that you can look at it and understand what is going on. Here is an example based on the letter above.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;json&quot;,&quot;nodeId&quot;:&quot;a51705db-c8d1-4012-8a51-188f2bcfecdb&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-json">{
  "file_summary": [
    "Typed letter to the Commissioner of Education",
    "about reading programs in Indian schools",
    "and a request for help preparing a study outline."
  ],

  "transcription": [
    "Dr. William J. Cooper,",
    "  Commissioner of Education,",
    "  Washington, D.C.",
    "",
    "Dear Dr. Cooper:",
    "",
    "Each year the teachers and other instructors in the Indian schools",
    "make a study of some subject of special importance in connection with",
    "Indian education. Usually one or two of the best books available",
    "which deal with the subject chosen are selected to be read by all",
    "employees and to be discussed at Reading Circle meetings which are held",
    "about once each month during the school year. Teachers are encouraged",
    "to read additional books, reports and articles, the subject matter of",
    "which is related to the topic being studied.",
    "",
    "In some instances, there are those who desire to do the reading",
    "under the direction of college or university representatives in order",
    "that degree credits may be earned.",
    "",
    "It is known that much consideration has been given to this subject",
    "during recent years and that individuals and groups have developed a",
    "great fund of literature which may be of value to those who wish to",
    "study the subject. The difficulty is in the selection of the best in",
    "such form as may be comprehensive and yet not too voluminous for the",
    "year&#8217;s study. To this end, the cooperation of your Bureau is solicited.",
    "It will be necessary to have the outline completed and ready to send to",
    "the field employees not later than May first.",
    "",
    "Sincerely yours,",
    "",
    "(Signed) Chas. H. Burle[?]",
    "",
    "Commissioner.",
    "",
    "2 EP 26",
    "",
    "Carbon for Indian Office.",
    "",
    "[Right margin handwriting/markings: unclear]",
    "[Stamp: FEB 28 19[??] (year unclear)]"
  ],

  "named_entities": {
    "people": ["Dr. William J. Cooper", "Chas. H. Burle[?]"],
    "place": "Washington, D.C.",
    "organizations": ["Indian Office", "Reading Circle"],
    "date": "FEB 28 19[??]"
  },

  "categories": ["government", "education"],

  "document_type": "Typed letter (carbon copy)"
}</code></pre></div><p>Unprocessed text has value, but so does structured text, especially when you are a busy historian sorting through thousands or millions of documents.</p><p>At the simplest level, this structure allows the historian to perform keyword searches by categories of information, such as &#8220;medical,&#8221; rather than searching for &#8220;ankle injuries,&#8221; &#8220;broken leg,&#8221; or &#8220;fever.&#8221; At a slightly more advanced level, it allows the historian to take this structured data and feed it into a spreadsheet (with AI help) to sort and identify documents or patterns. At an even more sophisticated level, though still accessible with AI assistance, JSON files can be stored in a database for more complex analysis. I will write about this in a later post, but you can also ask an AI to teach you how to do it right now.</p><p>For this particular letter, you can see how AI is both strong and weak, but always useful. It correctly captures names, places, and organizations, and does a solid job of transcription, even marking where the text is illegible. If this document were important, you would want to read it yourself. At the same time, it misses the date, which is obscured under a mess of stamps. Even more crucially, it misses how this document fits within the long, complex history of Indian education as a mechanism of control and assimilation. AI excels at transcribing bureaucratic records, but is not particularly adept at understanding them.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Computational History! Subscribe for free to receive new posts.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p><p>The larger point is this: you used to need an army of researchers to index and summarize documents. Now you don&#8217;t. You used to need a large grant. Now you don&#8217;t. And this kind of categorization is extremely helpful in making sense of the past.</p><p>AI may be a mediocre reader. But it is fast and cheap, and that makes it powerful.</p><div><hr></div><p>If you would like to try this for yourself, <a href="https://github.com/proflouishyman/computational_history/blob/main/posts/04-ai-as-intern/code/shallow_reading_local.py">the code is linked.</a> Just download, <a href="https://developers.openai.com/api/docs/quickstart">get an API key</a>, and structure away!</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share Computational History&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share Computational History</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Why Historians Need Statistical Thinking]]></title><description><![CDATA[Historical interpretation is always a claim about evidence. Statistical thinking makes those claims more precise.]]></description><link>https://computationalhistory.substack.com/p/why-historians-need-statistical-thinking</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/why-historians-need-statistical-thinking</guid><dc:creator><![CDATA[Christopher Phillips]]></dc:creator><pubDate>Mon, 16 Mar 2026 14:24:55 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!pQ0k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc605bec0-8f81-47f1-ae47-f617a8a00ee0_960x639.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>History has long straddled the divide between the social sciences and humanities. While for much of the past 50 years, particularly with the turn to social and cultural history, the emphasis has been firmly on the interpretive side of the ledger, the increasing amount and kinds of historical data has resurfaced discussion of the field&#8217;s social scientific side. Like all social scientists, historians make interpretations about the actions, intentions, causes, effects, and meaning of people and societies on the basis of data, usually a mix of texts, images, and audio recordings across time and space.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pQ0k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc605bec0-8f81-47f1-ae47-f617a8a00ee0_960x639.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pQ0k!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc605bec0-8f81-47f1-ae47-f617a8a00ee0_960x639.jpeg 424w, https://substackcdn.com/image/fetch/$s_!pQ0k!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc605bec0-8f81-47f1-ae47-f617a8a00ee0_960x639.jpeg 848w, https://substackcdn.com/image/fetch/$s_!pQ0k!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc605bec0-8f81-47f1-ae47-f617a8a00ee0_960x639.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!pQ0k!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc605bec0-8f81-47f1-ae47-f617a8a00ee0_960x639.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pQ0k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc605bec0-8f81-47f1-ae47-f617a8a00ee0_960x639.jpeg" width="960" height="639" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c605bec0-8f81-47f1-ae47-f617a8a00ee0_960x639.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:639,&quot;width&quot;:960,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:125842,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/190203651?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc605bec0-8f81-47f1-ae47-f617a8a00ee0_960x639.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pQ0k!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc605bec0-8f81-47f1-ae47-f617a8a00ee0_960x639.jpeg 424w, https://substackcdn.com/image/fetch/$s_!pQ0k!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc605bec0-8f81-47f1-ae47-f617a8a00ee0_960x639.jpeg 848w, https://substackcdn.com/image/fetch/$s_!pQ0k!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc605bec0-8f81-47f1-ae47-f617a8a00ee0_960x639.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!pQ0k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc605bec0-8f81-47f1-ae47-f617a8a00ee0_960x639.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>Historians&#8217; data is different from that of sociologists or economists in two key ways, however. First, it is much, much larger&#8212;it encompasses the past, full stop. Historians often don&#8217;t think of their evidence this way, but the entirety of human history is in a meaningful sense fair game for making historical claims. It is entirely common to criticize colleagues for not thinking carefully about excluded evidence&#8212;making a claim about the history of the modern Middle East, for example, by drawing on English-language but not Arabic-language sources, or asserting a truth about early modern politics without taking into account the limitations of state archives, or arguing about the causes of a conflict only using evidence produced by the victors.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption"></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Second, we draw on observational evidence, but we cannot run our study twice. That is, unlike an experimental setup in psychology, history is an n-of-1 experiment; we might muse about counterfactuals or what-ifs, but ultimately the Reformation happened once, so we cannot test a theory about what might have been different had a particular event not happened or particular person not been born. Likewise, we only have the evidence that remains to assert what it felt like to live on December 1, 1045 or even December 1, 2025. We cannot simply run the study again with new questions or new variables.</p><p>Historians navigate these limitations on their evidence in part by making assertions about the representativeness and persuasiveness of the evidence they use&#8212;whether a particular statement is typical or unusual; whether an event was unexpected or expected; whether a purported cause is plausible. This is, after all, the normal way academic historians begin our articles, asserting the relevant scope of the problem, laying out what we know already, and explaining how the evidence we draw on will enable us to say something new.</p><p></p><p></p><p>This background is why historians need to engage more directly with statistical thinking. After all, statisticians have spent decades now thinking about what it means to have only a &#8220;sample&#8221; of evidence from a &#8220;population,&#8221; or how observational data differs from experimental data, or how to be precise about what is &#8220;typical&#8221; or &#8220;rare&#8221; or &#8220;unexpected&#8221;. We normally associate statistical processing of historical data with the digital or computational humanities, niche fields for colleagues particularly skilled in computers and quantification. Or approaches that have been relegated to economic history after 1970s-era debates over cliometrics and <em>Time on the Cross</em>. But that&#8217;s wrong. Every claim we make, about the importance of a particular book or event or person; about the causes of change or resistance to change; about the meaning of an abstract concept in the lives of everyday people, requires us to make assertions about the nature of our evidence as a sample of the population of evidence we could have used. And knowledge of statistical ideas helps to make those assertions more precise.</p><p>Take a few basic ideas. Samples, in statistical terms, can be large or small, can be randomly chosen or carefully selected. These differences can, in turn, make big differences in what statisticians can infer about the sample. If you randomly select a sample, then the presumption is that every other sample is comparable in a more meaningful sense than if you hand select a sample. Likewise, statisticians distinguish between different ways of &#8220;reducing&#8221; evidence to a representative example, for example, the mode of a data set (the data point that occurs the most) compared to the median (the middle data point); and have measures of &#8220;spread&#8221; to give a sense of how close the median piece of evidence is, for example, to the most dissimilar pieces of evidence.</p><p>Historians, implicitly, make similar arguments when they assert that a particular passage is typical for a written source, and then describe how different the rest of the source really is. Or when a historian carefully dissects three court cases to show how a particular legal system works but then asserts that the three cases can stand in for all of the rest. Historians may not quantify such assertions, but we do distinguish between an idea that appears a lot and one that is, for example, &#8220;middle-of-the-road,&#8221; or between a treatise with a wide distribution of ideas and one that is narrowly focused.</p><p>Statisticians have also formalized how to interpret &#8220;time series&#8221; &#8211; showing change in some measure over time&#8212;or &#8220;natural experiments,&#8221; when one can compare data before and after some event and make inferences about the effects of the event on that basis. Historians, of course, make arguments along these lines as well. We may argue that the way something is talked about changes gradually over time, for example, or perhaps that the meaning of an idea changes quite quickly and dramatically after some important rupture, like a political revolution. This, implicitly, requires being able to specify and measure (even if qualitatively) what changes and what does not change over time.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aqnJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d52222e-0733-4324-8b5e-103be69514e4_960x640.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aqnJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d52222e-0733-4324-8b5e-103be69514e4_960x640.jpeg 424w, https://substackcdn.com/image/fetch/$s_!aqnJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d52222e-0733-4324-8b5e-103be69514e4_960x640.jpeg 848w, https://substackcdn.com/image/fetch/$s_!aqnJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d52222e-0733-4324-8b5e-103be69514e4_960x640.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!aqnJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d52222e-0733-4324-8b5e-103be69514e4_960x640.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aqnJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d52222e-0733-4324-8b5e-103be69514e4_960x640.jpeg" width="960" height="640" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6d52222e-0733-4324-8b5e-103be69514e4_960x640.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:640,&quot;width&quot;:960,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:210017,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/190203651?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d52222e-0733-4324-8b5e-103be69514e4_960x640.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aqnJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d52222e-0733-4324-8b5e-103be69514e4_960x640.jpeg 424w, https://substackcdn.com/image/fetch/$s_!aqnJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d52222e-0733-4324-8b5e-103be69514e4_960x640.jpeg 848w, https://substackcdn.com/image/fetch/$s_!aqnJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d52222e-0733-4324-8b5e-103be69514e4_960x640.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!aqnJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d52222e-0733-4324-8b5e-103be69514e4_960x640.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Similarly, statisticians have been working for well over a century to formalize when two associated things might be said to have a causal relationship. Epidemiologists, in particular, have carefully formalized how associations must be strong, specific, and consistent with other evidence of cause and effect to be deemed causal. It is crucial to rule out alternative hypotheses and confounding variables that may complicate a supposed causal relationship. Sometimes for statisticians these arguments are made using a model, like that of a regression equation, which makes mathematical assumptions about variables in a formalized effort to combine them and make predictions. Part of this process is to think carefully about when it is ok to eliminate a variable from the equation, what kinds of effects are significant and which are insignificant.</p><p>Historians, too, talk about influences, about necessary or sufficient cause, but often do so without any formalization. We talk about contingent and structural causes, but often without thinking carefully about the implicit model of causation we have in mind. We wouldn&#8217;t, of course, expect many historical arguments to be expressible as a mathematical equation, but we do ask why one causal claim is more convincing than others.</p><p>Most of us remember with bleary eyes, or perhaps some degree of recollected dread or boredom, the weeks of Statistics 101 when ideas like medians or regressions were presented. It is not that we need to use the vocabulary of statistics in writing our papers, or that we would expect to formulate a model when making claims about causation. But we are as historians making claims about evidence. Even if our historical argument involves closely reading one source as standing in for a particular understanding of a movement&#8217;s ideas at one point of time&#8212;i.e., if we rely on singularities&#8212;we implicitly are claiming that everything else is irrelevant or of less importance, and statistical ideas would help us think through how to make these claims more precise.</p><p>And for those of us who are wading through thousands of documents, or hundreds of interconnected people, formal mathematical ideas may of course be especially relevant. But that is, and is likely to remain, the exception. The point, for most of us, is that history remains at the intersection of humanistic and social scientific modes of inquiry and even if you find your own way of arguing firmly on the humanities&#8217; side of things, we benefit from knowing enough of the social scientific side to make clear why our arguments are precise and persuasive.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Computational History! Subscribe for free to receive new posts</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Cheap Code]]></title><description><![CDATA[Reproducible, Not Reusable]]></description><link>https://computationalhistory.substack.com/p/cheap-code</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/cheap-code</guid><dc:creator><![CDATA[Louis Hyman]]></dc:creator><pubDate>Sat, 07 Mar 2026 15:13:21 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!YgFG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ed6498-4748-4835-9276-5f0eaab30eed_2649x4254.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Time was, a historian who wanted to see what was in a probate had to read the probate.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YgFG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ed6498-4748-4835-9276-5f0eaab30eed_2649x4254.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YgFG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ed6498-4748-4835-9276-5f0eaab30eed_2649x4254.jpeg 424w, https://substackcdn.com/image/fetch/$s_!YgFG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ed6498-4748-4835-9276-5f0eaab30eed_2649x4254.jpeg 848w, https://substackcdn.com/image/fetch/$s_!YgFG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ed6498-4748-4835-9276-5f0eaab30eed_2649x4254.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!YgFG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ed6498-4748-4835-9276-5f0eaab30eed_2649x4254.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YgFG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ed6498-4748-4835-9276-5f0eaab30eed_2649x4254.jpeg" width="313" height="502.6057692307692" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/00ed6498-4748-4835-9276-5f0eaab30eed_2649x4254.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:2338,&quot;width&quot;:1456,&quot;resizeWidth&quot;:313,&quot;bytes&quot;:653368,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/190201168?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ed6498-4748-4835-9276-5f0eaab30eed_2649x4254.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YgFG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ed6498-4748-4835-9276-5f0eaab30eed_2649x4254.jpeg 424w, https://substackcdn.com/image/fetch/$s_!YgFG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ed6498-4748-4835-9276-5f0eaab30eed_2649x4254.jpeg 848w, https://substackcdn.com/image/fetch/$s_!YgFG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ed6498-4748-4835-9276-5f0eaab30eed_2649x4254.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!YgFG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ed6498-4748-4835-9276-5f0eaab30eed_2649x4254.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Recently I was working with a 900-page probate of John McDonogh, an enslaver and landlord in New Orleans who died in 1850. The probate was expensive to produce. Every page was cleanly laid out in a grid, written in an impressive secretary hand. And though I can read such old-fashioned script, doing so is not the same as reading print. Even if it were in print, keeping track of 900 pages of entries and calculations would be taxing. Simply reading the document would take weeks. Copying out the entries for verification in a spreadsheet would be herculean. Even with an ample research budget, subcontracting the work to undergraduate RAs would consume months.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Computational History! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>I had a very basic, but important question: how much of his wealth was in enslaved people compared to his other property?</p><p>To get that answer would have taken months of work. So much work, in fact, that I probably would never have done it.</p><p>In the era of AI, a very different approach is possible. You can write cheap code. You can get the answers that you want.</p><p>By cheap code, I mean hyper-specific, one-off code written for a singular purpose&#8212;like making sense of a 900-page probate. Traditional digital humanities tools required either general-purpose software too rigid for the variation in historical documents, or years of custom development. Only human brains could handle the idiosyncrasies. But today, with AI, you don&#8217;t need general-purpose software for reading probates. You can write software to read <em>just this document.</em></p><p>Here is what I actually did. First I scanned the probate with a CZUR scanner&#8212;about an hour&#8217;s work. Then I used the ChatGPT API to perform OCR and extract the well-structured pages into well-structured JSON. This step required real attention to prompt engineering: I wanted both raw and corrected OCR, French originals alongside English translations, and special attention to the categories I cared about&#8212;land, buildings, enslaved people, financial assets. For each entry I asked for names, value, location, and a best-guess latitude and longitude. I used the batch API to keep costs down. The whole run cost about what you&#8217;d pay for lunch.</p><p>Once I had the JSON, I turned to Claude. I showed it sample files and asked it to write a script that extracted everything into a spreadsheet. That alone was extraordinarily useful for getting a handle on thousands of entries. Next I asked it to verify the arithmetic: on every page, a running total recorded the estate&#8217;s cumulative value, and I wanted to check every page&#8217;s actual summed entries against that figure. There were minor discrepancies but nothing significant. (And I did this same exercise in Notebook LLM with some inaccurate, non-reproducible results. When counting, always use offline code!)</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aCRN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe851568e-6255-4f9d-96eb-baa1ef07220d_1846x900.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aCRN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe851568e-6255-4f9d-96eb-baa1ef07220d_1846x900.png 424w, https://substackcdn.com/image/fetch/$s_!aCRN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe851568e-6255-4f9d-96eb-baa1ef07220d_1846x900.png 848w, https://substackcdn.com/image/fetch/$s_!aCRN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe851568e-6255-4f9d-96eb-baa1ef07220d_1846x900.png 1272w, https://substackcdn.com/image/fetch/$s_!aCRN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe851568e-6255-4f9d-96eb-baa1ef07220d_1846x900.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aCRN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe851568e-6255-4f9d-96eb-baa1ef07220d_1846x900.png" width="1456" height="710" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e851568e-6255-4f9d-96eb-baa1ef07220d_1846x900.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:710,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:280456,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/190201168?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe851568e-6255-4f9d-96eb-baa1ef07220d_1846x900.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aCRN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe851568e-6255-4f9d-96eb-baa1ef07220d_1846x900.png 424w, https://substackcdn.com/image/fetch/$s_!aCRN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe851568e-6255-4f9d-96eb-baa1ef07220d_1846x900.png 848w, https://substackcdn.com/image/fetch/$s_!aCRN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe851568e-6255-4f9d-96eb-baa1ef07220d_1846x900.png 1272w, https://substackcdn.com/image/fetch/$s_!aCRN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe851568e-6255-4f9d-96eb-baa1ef07220d_1846x900.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>From there it was straightforward to generate summary statistics&#8212;mean, median, range&#8212;and visualizations of how the estate&#8217;s value was distributed across asset classes. I could see at a glance exactly how much of McDonogh&#8217;s estate was held in financial instruments, in land, in buildings, and crucially, in people.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g7Z_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a705e41-6a51-4a2a-a44e-46d00339c074_1788x430.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g7Z_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a705e41-6a51-4a2a-a44e-46d00339c074_1788x430.png 424w, https://substackcdn.com/image/fetch/$s_!g7Z_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a705e41-6a51-4a2a-a44e-46d00339c074_1788x430.png 848w, https://substackcdn.com/image/fetch/$s_!g7Z_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a705e41-6a51-4a2a-a44e-46d00339c074_1788x430.png 1272w, https://substackcdn.com/image/fetch/$s_!g7Z_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a705e41-6a51-4a2a-a44e-46d00339c074_1788x430.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g7Z_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a705e41-6a51-4a2a-a44e-46d00339c074_1788x430.png" width="1456" height="350" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2a705e41-6a51-4a2a-a44e-46d00339c074_1788x430.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:350,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:114423,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://computationalhistory.substack.com/i/190201168?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a705e41-6a51-4a2a-a44e-46d00339c074_1788x430.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g7Z_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a705e41-6a51-4a2a-a44e-46d00339c074_1788x430.png 424w, https://substackcdn.com/image/fetch/$s_!g7Z_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a705e41-6a51-4a2a-a44e-46d00339c074_1788x430.png 848w, https://substackcdn.com/image/fetch/$s_!g7Z_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a705e41-6a51-4a2a-a44e-46d00339c074_1788x430.png 1272w, https://substackcdn.com/image/fetch/$s_!g7Z_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a705e41-6a51-4a2a-a44e-46d00339c074_1788x430.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The people enslaved by John McDonogh emerged from the record with clarity. Ages. Names. Occupations. Locations. Every individual could be found and examined. New stories could be told of their lives. This is, of course, the moral center of the project: a slaveowner&#8217;s meticulous accounting of human property makes it possible to recover some part of their humanity. At the very least, their names.</p><p>Writing this software took about thirty minutes. Iterating on what I wanted&#8212;a graph of age vs value vs gender &#8212;took another half hour. The code was not elegant, not generalizable, not meant to be shared or preserved. It was cheap code, and that was precisely the point. You could run it and get reproducible results, which is fantastic, but you could not use with another probate.</p><p>The deeper methodological point deserves to be stated clearly. This approach is reproducible but not reuseable. In the pre-AI days, software code was both reproducible and reuseable, and had high costs. For many documents, especially bureaucratic documents, cheap code is not a compromise&#8212;it is the only realistic path to analysis.</p><p>For the first time, I could calculate the complete asset mix of a prominent New Orleans slaveowner: what he held in land, in buildings, in financial instruments, and in human beings. I could do it in an afternoon. The 900-page probate that would once have demanded so many months of my time that I would never do it, gave me the answer in a couple hours. </p><p>The question is not whether to write the scripts, it is which documents we should analyze first.</p><div><hr></div><p>Code: <a href="https://github.com/proflouishyman/computational_history/tree/main/posts/cheap-code/code">GitHub folder</a><br>Method + audit logic: <a href="https://github.com/proflouishyman/computational_history/blob/main/posts/cheap-code/method/AUDIT_LOGIC.md">AUDIT_LOGIC.md</a><br>Repository: <a href="https://github.com/proflouishyman/computational_history">computational_history</a></p><p></p><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Computational History! Subscribe for free to receive new posts </p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Coming soon]]></title><description><![CDATA[This is Computational History.]]></description><link>https://computationalhistory.substack.com/p/coming-soon</link><guid isPermaLink="false">https://computationalhistory.substack.com/p/coming-soon</guid><dc:creator><![CDATA[Louis Hyman]]></dc:creator><pubDate>Sat, 08 Nov 2025 12:58:25 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Qigr!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187b411b-680b-41a2-b0f7-e0ea6f262a55_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This is Computational History.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://computationalhistory.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://computationalhistory.substack.com/subscribe?"><span>Subscribe now</span></a></p>]]></content:encoded></item></channel></rss>