Discover how AI decoded in hours what took scholars 50 years—and why some ancient mysteries still resist even our most powerful algorithms.
Chapter One: The Night Alice Kober Changed Everything
Picture this: Brooklyn, 1945. A woman sits at her dining room table, cigarette smoke curling toward the ceiling, surrounded by 180,000 handmade index cards. Each one was meticulously cut from examination blue books, old greeting cards, and (let’s be honest) more than a few discreetly “borrowed” library checkout slips. Paper was precious in wartime America, but Alice Kober’s obsession with an ancient mystery was worth more than any commodity.
She was hunting something that had eluded the world’s greatest minds for nearly half a century: the meaning behind Linear B, a Bronze Age script discovered on clay tablets among the ruins of Knossos in Crete. These weren’t just any tablets—they were whispers from a civilization that had thrived over 3,000 years ago, speaking in symbols no living person could understand.
The stakes? Nothing less than rewriting our understanding of ancient Greece, pushing back the earliest known examples of written Greek by centuries, and proving that the legendary Mycenaean civilization wasn’t merely a poetic invention of Homer’s imagination.
But here’s where our story takes a distinctly modern twist: What took human geniuses like Kober decades of painstaking manual labor—sorting symbols, identifying patterns, creating grids of phonetic relationships—can now be accomplished by artificial intelligence algorithms in mere hours. And that raises a question that would have seemed like pure fantasy to Kober as she hand-cut her 180,000th index card: Has the age of human-led linguistic archaeology come to an end?
Chapter Two: When the Past Speaks in Riddles
Before we dive into the silicon revolution, we need to appreciate just how extraordinary the human achievement of deciphering ancient languages actually was. Linear B wasn’t just written in an unknown script—it represented an unknown language. Imagine trying to solve a crossword puzzle where you don’t know the language, the alphabet changes randomly, there are no spaces between words, and oh yes, you’re also not entirely sure it’s actually a language at all and not just pretty doodles.
Michael Ventris, the architect who finally cracked Linear B in 1952, described his breakthrough in wonderfully understated British fashion during a BBC radio broadcast: “During the last few weeks, I have come to the conclusion that the Knossos and Pylos tablets must, after all, be written in Greek—a difficult and archaic Greek, seeing that it is 500 years older than Homer and written in a rather abbreviated form, but Greek nevertheless” (as cited in Cambridge Faculty of Classics, n.d.).
What Ventris didn’t mention in that moment of triumph was that his success stood firmly on the shoulders of Alice Kober’s groundbreaking work. Kober had demonstrated that Linear B showed inflection—grammatical endings that changed depending on a word’s function in a sentence—and created the phonetic grid system that Ventris would adopt and expand upon. As Adelaide Hahn wrote in Kober’s 1950 obituary, “if and when this decipherment is ultimately achieved, surely her careful and faithful spade-work will be found to have played a part therein” (as cited in Wikipedia, 2025). Tragically, Kober died at age 43, just two years before the breakthrough, never knowing that her “extremely tentative” work had laid the essential foundation for one of archaeology’s greatest triumphs.
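Kober's inflection argument can be illustrated with a small sketch. The Python below groups sign sequences that share a stem but end in different signs, which is the kind of pattern that hints at grammatical endings. The sign numbers are invented placeholders rather than real Linear B data, and the function name is hypothetical; this is a toy illustration of the idea, not a reconstruction of Kober's actual procedure.

```python
from collections import defaultdict

def find_inflection_candidates(words, min_stem=2, min_variants=2):
    """Group words that share a stem (every sign but the last) yet end differently."""
    endings_by_stem = defaultdict(set)
    for word in words:
        # Skip words too short to leave a stem of at least min_stem signs.
        if len(word) > min_stem:
            stem, ending = word[:-1], word[-1]
            endings_by_stem[stem].add(ending)
    # Keep only stems that occur with several distinct endings.
    return {
        stem: sorted(endings)
        for stem, endings in endings_by_stem.items()
        if len(endings) >= min_variants
    }

# Invented sign sequences; each integer stands for one syllabic sign.
corpus = [
    (12, 7, 31, 4),
    (12, 7, 31, 9),
    (12, 7, 31, 22),
    (5, 18, 4),
    (5, 18, 9),
    (40, 2, 16),
]

candidates = find_inflection_candidates(corpus)
for stem, endings in candidates.items():
    print(stem, "->", endings)
```

In this toy corpus, two stems appear with more than one ending, and both share the endings 4 and 9. Recurring endings across unrelated stems are what suggest a grammatical system rather than coincidence.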
The human decipherment of Linear B required an almost superhuman combination of rigorous statistical analysis, creative leaps of intuition, knowledge of comparative linguistics, and what we might call educated guessing about Cretan place names. It took from 1900—when archaeologist Arthur Evans first discovered the tablets—until 1952 for the code to be cracked. Fifty-two years of brilliant minds battering themselves against an ancient riddle.
Chapter Three: Enter the Machines
Fast forward to 2019. A team at MIT’s Computer Science and Artificial Intelligence Laboratory, led by PhD student Jiaming Luo and Professor Regina Barzilay, did something remarkable. They fed an algorithm words from Linear B and its related language, ancient Greek. The algorithm’s job? Align words from the lost language with their counterparts in the known language.
The result? The algorithm correctly mapped 67.3% of Linear B’s cognate words to their Greek equivalents in just two to three hours (Luo et al., 2019). Let that sink in. Evans, Kober, Ventris, and countless other scholars needed over half a century to crack the script; an algorithm reproduced a large share of those word correspondences in the time it takes to watch a movie.
But before we start mourning the death of human linguistic archaeology, let’s understand what’s actually happening here. This wasn’t artificial general intelligence achieving consciousness and spontaneously deciding to learn ancient Greek. The MIT team had designed their algorithm based on patterns in how languages evolve over time—insights gleaned from decades of human linguistic research.
As Barzilay explains, “The main challenge in decipherment is the lack of parallel data (such as aligned translations in English and French). Modern machine learning methods utilize millions of such parallel sentences to learn correspondence between vocabularies of the two languages and their grammatical constructions” (ACM, 2018). Traditional translation tools like Google Translate are useless here because they require massive amounts of parallel text—the very thing that doesn’t exist for dead languages.
Instead, the MIT algorithm incorporates linguistic constraints that reflect universal patterns of language change. Languages rarely add or omit entire sounds randomly. Certain sound substitutions are predictable: words with a “p” sound in a parent language might evolve a “b” sound in offspring languages, but are less likely to become a “k” sound due to the significant pronunciation gap (Nexus Newsfeed, n.d.). The algorithm embeds language sounds into multidimensional space where differences in pronunciation are reflected in the distance between corresponding vectors—essentially creating a mathematical map of how languages drift across time.
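To make the idea of "distance between sounds" concrete, here is a minimal sketch. It places six consonants in a two-dimensional feature space and uses the distance between them as the substitution cost in an edit distance, with a relatively high cost for adding or dropping a sound. The coordinates, the `indel_cost` value, and the function names are illustrative assumptions; the MIT model learns its sound embeddings, and this toy version does not reproduce it.

```python
import math

# Toy articulatory features: (place of articulation, voicing).
# Place: 0 = lips, 1 = tongue tip, 2 = back of the mouth.
# Voicing: 0 = voiceless, 1 = voiced.
# Hand-picked values for illustration, not learned embeddings.
SOUND_VECTORS = {
    "p": (0.0, 0.0),
    "b": (0.0, 1.0),
    "t": (1.0, 0.0),
    "d": (1.0, 1.0),
    "k": (2.0, 0.0),
    "g": (2.0, 1.0),
}

def sound_distance(a, b):
    """Euclidean distance between two sounds in the toy feature space."""
    return math.dist(SOUND_VECTORS[a], SOUND_VECTORS[b])

def weighted_edit_distance(source, target, indel_cost=1.5):
    """Edit distance whose substitution cost is the phonetic distance."""
    rows, cols = len(source) + 1, len(target) + 1
    dp = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = i * indel_cost
    for j in range(1, cols):
        dp[0][j] = j * indel_cost
    for i in range(1, rows):
        for j in range(1, cols):
            substitute = dp[i - 1][j - 1] + sound_distance(source[i - 1], target[j - 1])
            delete = dp[i - 1][j] + indel_cost
            insert = dp[i][j - 1] + indel_cost
            dp[i][j] = min(substitute, delete, insert)
    return dp[-1][-1]

print(sound_distance("p", "b"))  # 1.0: same place, voicing differs
print(sound_distance("p", "k"))  # 2.0: far apart in place of articulation

# Toy "words" built only from the six consonants above.
print(weighted_edit_distance("ptk", "btk"))
print(weighted_edit_distance("ptk", "ktk"))
```

Under these assumptions, a word that shifts "p" to "b" stays closer to its ancestor than one that shifts "p" to "k", which mirrors the constraint described above.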
The team tested their approach on two ancient scripts: Ugaritic (related to Hebrew) and Linear B (related to ancient Greek). For Ugaritic, they achieved a 5.5% improvement over previous state-of-the-art results (Luo et al., 2019). These were languages that humans had already deciphered, which meant the researchers could verify their algorithm’s accuracy. The success suggested a tantalizing possibility: Could machine learning crack codes that have so far resisted all human attempts?
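Because the answers for an already deciphered script are known, the alignment step can be scored. The sketch below pairs each "lost" word with exactly one "known" word so that the total edit distance is as small as possible, then checks the result against a gold mapping. It uses exhaustive search over permutations as a simple stand-in for the minimum-cost flow formulation named in the paper's title, and plain edit distance in place of a learned model. The word lists are simplified transliterations of four well-known place names, chosen for illustration only.

```python
from itertools import permutations

def edit_distance(a, b):
    """Classic Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete from a
                            curr[j - 1] + 1,            # insert into a
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]

def best_one_to_one_matching(lost_words, known_words):
    """Exhaustively find the one-to-one pairing with the lowest total distance."""
    best_cost, best_pairs = None, None
    for perm in permutations(known_words, len(lost_words)):
        cost = sum(edit_distance(l, k) for l, k in zip(lost_words, perm))
        if best_cost is None or cost < best_cost:
            best_cost, best_pairs = cost, dict(zip(lost_words, perm))
    return best_pairs

# Simplified transliterations (illustrative): syllabic spellings vs. Greek names.
lost = ["konoso", "aminiso", "turiso", "puro"]
known = ["pulos", "tulisos", "knossos", "amnisos"]
gold = {"konoso": "knossos", "aminiso": "amnisos",
        "turiso": "tulisos", "puro": "pulos"}

matching = best_one_to_one_matching(lost, known)
accuracy = sum(matching[w] == gold[w] for w in lost) / len(lost)
print(matching)
print(f"accuracy: {accuracy:.1%}")
```

Exhaustive search only works for a handful of words, since the number of pairings grows factorially. That is one reason a flow-based formulation is attractive at realistic vocabulary sizes.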
Chapter Four: The Ghosts Still Haunting Us
Walk through the museums of the world and you’ll encounter them: mysterious inscriptions that whisper secrets we still cannot hear. Linear A, the older sibling of Linear B, remains stubbornly undeciphered despite being discovered alongside its younger, more cooperative relative. The Indus Valley script, with its 400+ unique symbols scattered across seals from one of humanity’s earliest urban civilizations, continues to baffle researchers more than a century after its discovery. Rongorongo, the elaborate script from Easter Island, might represent the only independent invention of writing in Oceania—if only we could read it.
These aren’t merely academic curiosities. Dr. Alice Stevenson, a leading archaeologist at University College London, notes: “The absence of a linguistic key translates such scripts akin to solving a complex puzzle without all the pieces” (Nspirement, 2024). Each undeciphered script represents a closed door to an entire culture’s knowledge, beliefs, and daily life.
The computational challenges these scripts present differ dramatically from Linear B. With Linear B, scholars knew (or eventually discovered) that it encoded ancient Greek. This gave them a “known related language” to compare against—the computational equivalent of a Rosetta Stone. But for Linear A? We don’t even know what language family it might belong to. Attempts to decipher it as ancient Greek have all failed.
This is where AI’s limitations become starkly apparent. As Luo’s team discovered when they tried to tackle other mysterious scripts, the cognate-based decipherment approach only works when you can identify a related language. When they tested their algorithm on Iberian, a language historically thought to be related to Basque, the algorithm itself suggested that the two languages were too different to be related, corroborating recent scholarship (MIT CSAIL, n.d.). Scale compounds the difficulty: Iberian has at least 80 unique symbols, while the Indus script has over 400, which leaves far more possible sign values to sort through.
The MIT researchers developed a follow-up algorithm aimed at identifying possible related languages for undeciphered writing systems—essentially teaching the machine to play the matching game at a higher level (Rest of World, 2023). But even this sophisticated approach hits walls. For the Indus script, we don’t just lack a known related language; we’re not even certain whether the inscriptions represent a full language at all or merely symbolic markers for trade and ritual.
As noted by researchers at Long Now Foundation, “The problem with mystery scripts like Linear A, Cypro-Minoan, Rongorongo, and Harappan is that the total number of known inscriptions can be counted in the thousands, and sometimes in the hundreds. Not only that, in most cases we have no idea what spoken language they’re meant to encode” (Long Now, 2025). Modern AI models learn from vast datasets—the ancient Greek corpus contains tens of thousands of inscriptions, cuneiform tablets number in the hundreds of thousands. When your entire dataset consists of a few hundred short inscriptions with no bilingual texts and no known related language, even the most sophisticated neural networks struggle to find purchase.
Chapter Five: The Ethical Minefield
Here’s where things get philosophically thorny. Let’s say an AI does crack Linear A tomorrow. Whose achievement is it? Who owns the translation? And perhaps most troublingly—how do we know the translation is correct?
These aren’t merely academic questions. They strike at the heart of cultural heritage preservation and intellectual property. Many ancient scripts are deeply tied to specific cultures, and their meanings carry profound implications for descendant communities. When we’re dealing with sacred texts, historical records, or linguistic heritage, the stakes of getting the translation wrong—or right—are immense.
Dr. Timnit Gebru, an AI ethics researcher, cautions that “Artificial Intelligence can transform cultural heritage preservation by providing tools to analyze, restore, and safeguard artifacts and traditions essential to civilizations. However, beneath their digital surface, AI systems raise complex ethical issues directly affecting the heritage they intend to protect” (Nspirement, 2024).
Consider the case of endangered Indigenous languages. Recent projects have used AI to help preserve and translate languages like Owens Valley Paiute. While technologically impressive, these efforts raise profound questions about cultural authenticity and Indigenous rights. As Professor Natalie Stoianoff warns, “What has been happening for quite some time is the misappropriation and exploitation by third parties of Indigenous knowledge and culture. The impact of this has resulted in spiritual, cultural, and economic losses for Indigenous communities worldwide” (Viterbi Conversations in Ethics, 2025).
The risk isn’t just misappropriation—it’s misinterpretation. AI systems excel at finding statistical patterns, but they fundamentally lack the cultural understanding that human translators bring. Subtle nuances, metaphors, and cultural references can be lost when we rely too heavily on algorithmic outputs. An AI might correctly identify that a symbol means “water,” but miss that in context it’s actually referring to a spiritual concept of purification central to an entire belief system.
There’s also the thorny issue of verification. When Ventris deciphered Linear B, other scholars could verify his work by testing his translations against the known phonetic values and comparing results. But if an AI deciphers a completely unknown script, how do we know it hasn’t found a statistically plausible but ultimately incorrect pattern? As one researcher notes, “An AI is only as good as the garbage you feed it. If researchers give it the start from a wrong hypothesis (for example, if they claim that Indus script relates to Sumerian when there is no similarity), the AI will produce a decipherment that looks good, but is wrong” (CraftAIWorld, 2025).
This concern isn’t hypothetical. Over the past century, more than a hundred attempts to decipher the Indus script have been published—linking it to everything from the Rongorongo script of Easter Island to Sumerian cuneiform, with one particularly creative theory offered by a German tantric guru who claimed to have achieved his solution through meditation (Rest of World, 2023). The difference is that an AI-generated decipherment might arrive wrapped in the authoritative cloak of computational certainty, making it harder to question even when it’s fundamentally wrong.
Chapter Six: The Collaboration Imperative
So where does this leave us? Are archaeologists and linguists destined to be replaced by algorithms, relegated to the role of data janitors feeding information into silicon overlords?
Not quite. The most promising vision for the future isn’t AI versus humans—it’s AI and humans working in sophisticated collaboration. As Barzilay notes about future directions, “For instance, we may identify all the references to people or locations in the document which can then be further investigated in light of the known historical evidence. These methods of ‘entity recognition’ are commonly used in various text processing applications today and are highly accurate, but the key research question is whether the task is feasible without any training data in the ancient language” (MIT CSAIL, n.d.).
This represents a fundamental shift in approach. Instead of trying to get AI to completely replace human decipherment, researchers are focusing on how machine learning can augment human capabilities. The algorithm might identify patterns too subtle or numerous for human eyes to catch, suggest possible phonetic relationships, or rapidly test thousands of hypotheses that would take human researchers years to explore manually.
But the final interpretive work—understanding context, making creative leaps, recognizing cultural significance—remains firmly in human hands. As observed in recent scholarship, “AI is the most significant new tool in decipherment since the discovery of the Rosetta Stone. It is not a replacement for the brilliant intuition of a Ventris but a force multiplier for that intuition” (CraftAIWorld, 2025).
The reality is that successful decipherment has always required what Andrew Robinson called “a synthesis of logic and intuition” (MIT News, 2010)—qualities that current AI systems don’t possess. Yes, machines can process data at inhuman speeds and identify statistical patterns we’d miss. But they can’t experience the “eureka” moment that Ventris had when he realized that certain frequently appearing words on Knossos tablets might be Cretan place names—a creative leap that unlocked everything else.
Consider the future of Linear A research. Computational linguist teams are developing Python programs that can execute exhaustive cryptanalytic brute force attacks on Linear A signs—work that would take human scholars lifetimes to complete manually. But as one researcher working on the project admits, “Technology cannot replace human ingenuity and cannot lead automatically to the decipherment of an undeciphered script. But it can save a lot of work and pain to scholars” (Mind Matters, 2025).
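What a brute-force attack on sign values looks like can be sketched in a few lines. The example below tries every one-to-one assignment of candidate syllables to signs and ranks each assignment by how many resulting readings appear in a reference lexicon. The signs, syllables, words, and lexicon are all invented, and the function name is hypothetical; this is not the program the Linear A researchers are building, only an illustration of the general technique.

```python
from itertools import permutations

def brute_force_sign_values(words, signs, syllables, lexicon):
    """Try every one-to-one syllable assignment and rank by lexicon hits."""
    results = []
    for values in permutations(syllables, len(signs)):
        mapping = dict(zip(signs, values))
        readings = ["".join(mapping[s] for s in word) for word in words]
        hits = sum(reading in lexicon for reading in readings)
        results.append((hits, mapping))
    # Highest-scoring assignments first.
    results.sort(key=lambda item: item[0], reverse=True)
    return results

# Invented toy data: three unknown signs, four candidate syllables.
signs = ["A", "B", "C"]
syllables = ["ka", "ro", "mi", "tu"]
words = [("A", "B"), ("B", "C"), ("C", "A", "B")]
lexicon = {"karo", "romi", "mikaro"}

ranked = brute_force_sign_values(words, signs, syllables, lexicon)
best_hits, best_mapping = ranked[0]
print(len(ranked), "assignments tried")
print(best_hits, best_mapping)
```

Even this toy case tries 24 assignments for three signs, and the count explodes as the sign inventory grows. A high score also proves nothing on its own: with short words and a small lexicon, chance matches are easy to find, which is why the scholar's judgment in the quote above still matters.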
Chapter Seven: The Promise and the Peril
The integration of AI into linguistic archaeology represents something more than just faster translations or clever algorithms. It’s changing the fundamental questions we can ask about ancient civilizations.
With AI assistance, researchers can now analyze patterns across thousands of documents simultaneously, identifying cultural exchanges, social hierarchies, and shifts in religious beliefs that would have been obscured by the limitations of traditional methodologies. Machine learning models can contextualize individual symbols within the broader tapestry of ancient societies, drawing connections that were previously invisible.
UNESCO Director-General Audrey Azoulay emphasizes that “every language is a window into a culture’s soul. By preserving and understanding these languages, we preserve humanity’s legacy” (Nspirement, 2024). AI’s pattern recognition capabilities extend beyond linguistics into archaeology, satellite imagery analysis, and the reconstruction of ancient human migration patterns—offering a holistic approach to understanding lost civilizations.
But—and this is a crucial but—this technological revolution comes with responsibilities. We must ensure that AI-driven translations respect the cultural significance of ancient languages and involve collaboration with descendant communities. We need interdisciplinary teams combining computer scientists, archaeologists, linguists, ethicists, and cultural historians to ensure that AI deployment respects the complexities of cultural heritage.
The establishment of specialized ethical frameworks for AI in cultural heritage conservation isn’t optional—it’s essential. As one recent analysis emphasizes, “It will be the synergy between computational prowess and human wisdom that will ensure our cultural legacies are preserved for future generations to explore, understand, and appreciate” (International Journal of Emerging and Disruptive Innovation in Education, n.d.).
Chapter Eight: The Road Ahead
So what does the future hold? Will AI crack Linear A? Will it finally decode the Indus Valley script and reveal the secrets of one of humanity’s earliest urban civilizations? Will Rongorongo surrender its mysteries to machine learning?
The honest answer is: maybe. The technology is advancing rapidly. Algorithms are becoming more sophisticated. Datasets are growing. The Vesuvius Challenge has shown that machine learning can extract Greek text from carbonized scrolls that have been unreadable for nearly 2,000 years—a feat that seemed impossible just a few years ago.
But the deeper truth is that some mysteries might resist even our best computational tools. The way forward depends not just on better algorithms, but on finding more inscriptions through traditional “dirt” archaeology. The best-case scenario would be discovering a bilingual text—a modern Rosetta Stone that provides the parallel data AI algorithms need to function effectively.
As researchers at the Long Now Foundation note, “The code of Cypro-Minoan, or Linear A, or the quipu of the Andes, won’t be cracked by a computer scientist alone. It’s going to take a collaboration with epigraphers working with all the available evidence, some of which is still buried at archaeological sites” (Long Now, 2025).
Perhaps the most important insight is this: AI doesn’t make human expertise obsolete—it makes it more valuable. The algorithms need to be designed by people who understand historical linguistics. The results need to be interpreted by scholars who grasp cultural context. The ethical frameworks need to be established by communities who understand what’s at stake.
The story of AI and ancient language decipherment isn’t really about machines replacing humans. It’s about humans building tools that extend our capabilities, allowing us to ask bigger questions and explore deeper mysteries than ever before. Alice Kober, sitting at her dining table with her hand-cut index cards, would probably appreciate that. She understood that sometimes you need to build the right tool before you can solve the puzzle.
The voices of lost civilizations are still whispering to us from clay tablets, stone monuments, and ancient manuscripts. Now we have new tools to help us listen. Whether we hear them correctly—whether we preserve their cultural integrity while unveiling their secrets—will depend not on our algorithms, but on our wisdom in using them.
The age of unsolved mysteries isn’t over. If anything, it’s just getting started.
References
- ACM. (2018). People of ACM – Regina Barzilay. Association for Computing Machinery. https://www.acm.org/articles/people-of-acm/2018/regina-barzilay
- Cambridge Faculty of Classics. (n.d.). The decipherment of Linear B. University of Cambridge. https://www.classics.cam.ac.uk/system/files/documents/process.pdf
- CraftAIWorld. (2025, September 6). AI decoding ancient languages: Cracking lost scripts in 2025. CraftAIWorld Blog. https://craftaiworld.com/blog/ai-decode-ancient-languages
- International Journal of Emerging and Disruptive Innovation in Education. (n.d.). AI integration in cultural heritage conservation – Ethical considerations and the human element. https://digitalcommons.lindenwood.edu/cgi/viewcontent.cgi?article=1022&context=ijedie
- Long Now Foundation. (2025, November 5). The codes AI can’t crack. Long Now Ideas. https://longnow.org/ideas/the-codes-ai-cant-crack/
- Luo, J., Cao, Y., & Barzilay, R. (2019). Neural decipherment via minimum-cost flow: From Ugaritic to Linear B. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 3146-3155. https://doi.org/10.18653/v1/P19-1303
- Mind Matters. (2025, April 9). Can AI really decode dead languages? Or is it too late? https://mindmatters.ai/brief/can-ai-really-decode-dead-languages-or-is-it-too-late/
- MIT CSAIL. (n.d.). Translating lost languages using machine learning. Massachusetts Institute of Technology. https://www.csail.mit.edu/news/translating-lost-languages-using-machine-learning
- MIT News. (2010, June 30). Computer automatically deciphers ancient language. Massachusetts Institute of Technology. https://news.mit.edu/2010/ugaritic-barzilay-0630
- Nexus Newsfeed. (n.d.). New AI algorithm is cracking undeciphered languages. https://nexusnewsfeed.com/article/ancient-origins/new-ai-algorithm-is-cracking-undeciphered-languages/
- Nspirement. (2024, October 26). How AI is translating ancient languages. https://nspirement.com/2024/10/26/ai-is-translating-ancient-languages.html
- Rest of World. (2023, April 4). An ancient language has defied decryption for 100 years. Can AI crack the code? https://restofworld.org/2022/indus-translation-ai-code-script/
- Viterbi Conversations in Ethics. (2025, March 3). Preserving the past: AI in Indigenous language preservation. University of Southern California. https://vce.usc.edu/weekly-news-profile/preserving-the-past-ai-in-indigenous-language-preservation/
- Wikipedia. (2025, September 9). Alice Kober. https://en.wikipedia.org/wiki/Alice_Kober
Additional Reading
- Fox, M. (2013). The riddle of the labyrinth: The quest to crack an ancient code. Ecco Press.
- Robinson, A. (2002). The man who deciphered Linear B: The story of Michael Ventris. Thames & Hudson.
- Chadwick, J. (1992). The decipherment of Linear B (2nd ed.). Cambridge University Press.
- Assael, Y., Sommerschield, T., & Prag, J. (2022). Restoring and attributing ancient texts using deep neural networks. Nature, 603, 280-283.
- UNESCO. (2023). Inteligencia artificial centrada en los pueblos indígenas: Perspectivas desde América Latina y el Caribe. UNESCO Publishing.
Additional Resources
- MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). Leading research on computational linguistics and ancient language decipherment. https://www.csail.mit.edu
- Program in Aegean Scripts and Prehistory, University of Texas at Austin. Houses Alice Kober’s archives and conducts ongoing research on Bronze Age scripts. https://sites.utexas.edu/scripts/
- Long Now Foundation – Rosetta Project. Global collaboration focused on language preservation and documentation. https://rosettaproject.org
- DeepMind Ithaca Project. AI system for restoring, dating, and attributing ancient Greek inscriptions. https://deepmind.google/discover/blog/predicting-the-past-with-ithaca/
- UNESCO Intangible Cultural Heritage. Resources on ethical frameworks for AI in cultural heritage preservation. https://ich.unesco.org

