
Stevens Prof Uses Big Data To Transcribe Ancient History


Stevens Institute of Technology recently revealed details of research by computer science professor and machine learning expert Fernando Perez-Cruz that aims to transcribe and digitize 88 million pages of centuries-old handwritten documents. The archive might provide insights into questions about “European history, the Conquistadors, New World contact and colonialism, law, politics, social mores, economics and ancestry.”

Perez-Cruz asks, “What if a machine, a software, could transcribe them all? And what if we could teach another machine to group those 88 million pages of searchable text into several hundred topics? Then we begin to understand themes in the material.”

Perez-Cruz hopes to build an “increasingly accurate recognition engine over time” by “teaching software to recognize both the shapes of characters and frequently-correlated relationships of letters and words to each other.” He believes this approach, dubbed ‘interpretable machine learning,’ could be applied to “numerous other next-generation data analysis questions such as autonomous transport and medical data analysis.”
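The idea of using “frequently-correlated relationships of letters” can be illustrated with a small sketch. The article does not describe Perez-Cruz's actual method, so what follows is only an assumed stand-in: a letter-pair (bigram) model that picks the more plausible of two candidate readings of an ambiguous handwritten word. The toy word list, the function names, and the smoothing constant are all hypothetical.

```python
from collections import Counter
import math

def train_bigrams(corpus):
    """Count adjacent letter pairs, and how often each letter starts a pair."""
    pairs, singles = Counter(), Counter()
    for word in corpus:
        padded = "^" + word + "$"  # mark word start and end
        for a, b in zip(padded, padded[1:]):
            pairs[(a, b)] += 1
            singles[a] += 1
    return pairs, singles

def log_score(word, pairs, singles, alphabet_size=28):
    """Add-one smoothed log-probability of a word under the bigram counts.

    alphabet_size is an assumed constant used only for smoothing.
    """
    padded = "^" + word + "$"
    total = 0.0
    for a, b in zip(padded, padded[1:]):
        total += math.log((pairs[(a, b)] + 1) / (singles[a] + alphabet_size))
    return total

def pick_reading(candidates, pairs, singles):
    """Return the candidate transcription with the highest score."""
    return max(candidates, key=lambda w: log_score(w, pairs, singles))

# Invented toy vocabulary; a real system would learn from transcribed pages.
corpus = ["carta", "carga", "casa", "cara", "tierra", "plata"]
pairs, singles = train_bigrams(corpus)

# Two hypothetical readings of the same handwritten word.
best = pick_reading(["carta", "cnrta"], pairs, singles)
print(best)
```

Here the reading "carta" wins because the pairs "ca" and "ar" are common in the toy vocabulary while "cn" and "nr" never occur. A real recognition engine would combine a score like this with a model of character shapes.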

The next step in the process is even more interesting to Perez-Cruz: “the organization of massive quantities of known transcribed data into usable topics at a glance.” He elaborates on how his analysis techniques might interpret three-and-a-half centuries of unstructured data:

“In the end, we might find that there are for example a few hundred subjects or narratives that run through this entire archive, and then I suddenly have an 88-million-document problem that I have summarized in 200 or 300 ideas.”
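As a rough illustration of grouping transcribed pages into topics, the sketch below clusters a few invented page snippets by shared vocabulary. The article does not say which technique Perez-Cruz uses; this greedy word-overlap grouping is a deliberately simplified stand-in for a real topic model, and the sample pages, function names, and threshold are assumptions made for the example.

```python
def tokens(text):
    """Lowercase a page and keep the set of words longer than three letters."""
    return {w for w in text.lower().split() if len(w) > 3}

def jaccard(a, b):
    """Share of words two sets have in common (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def group_pages(pages, threshold=0.2):
    """Greedy single pass: each page joins the first group whose vocabulary
    overlaps it by at least `threshold`, otherwise it starts a new group."""
    groups = []  # each group: {"vocab": set of words, "members": page indices}
    for i, page in enumerate(pages):
        words = tokens(page)
        for group in groups:
            if jaccard(words, group["vocab"]) >= threshold:
                group["members"].append(i)
                group["vocab"] |= words
                break
        else:
            groups.append({"vocab": set(words), "members": [i]})
    return [group["members"] for group in groups]

# Invented snippets standing in for transcribed pages.
pages = [
    "ship cargo silver arrived port",
    "land grant dispute court ruling",
    "silver cargo ship lost storm",
    "court ruling land boundary",
]
groups = group_pages(pages)
print(groups)  # [[0, 2], [1, 3]]
```

The four pages collapse into two groups, one about shipping and one about land disputes. That is the same kind of reduction, on a tiny scale, as summarizing 88 million pages in a few hundred topics.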

He concludes, “Once you understand the data, you can begin to read it in a specific manner, understand more clearly what questions to ask of it, and make better decisions.”


About the Author


Jonathan Pfeffer

Jonathan Pfeffer joined the Clear Admit and MetroMBA teams in 2015 after spending several years as an arts/culture writer, editor, and radio producer. In addition to his role as contributing writer at MetroMBA and contributing editor at Clear Admit, he is co-founder and lead producer of the Clear Admit MBA Admissions Podcast. He holds a BA in Film/Video, Ethnomusicology, and Media Studies from Oberlin College.

