Matthew Jockers on Topic Modeling

This week, I attended Dr. Matthew Jockers’ talk “Correlating Theme, Geography, and Sentiment in the 19th Century Literary Imagination,” which was yet another great Catapult Center event at IU.

Dr. Jockers, a leader in text analysis from the University of Nebraska, discussed his latest research, which uses topic modeling to study the geometrization of narrative, that is, the literary function of “place,” through a macroanalysis of over 3,500 British and Irish novels. Paired with named entity recognition (NER), topic modeling helps resolve some of the ambiguities that make places hard to identify in texts, such as shared place names (Georgia is both a country and a US state) or places used as concepts. Additionally, by using LDA (latent Dirichlet allocation) to develop word clusters and differentiate the contexts in which words appear, Dr. Jockers was able to get a larger sense of place, not in terms of coordinates but in terms of “placeness.” Primarily interested in representations of place, he found several commonly addressed themes in his data set, including “peasant dwellings” and “war victories,” each discussed in different words depending on whether the author or audience was Irish or English. Generally, Irish authors spoke more positively of home, while British authors depicted themselves with a sense of superiority and, conversely, the Irish with wretchedness. As Dr. Jockers points out, these are macro-level, general tendencies, so his findings should not be taken to speak for every individual’s perspective.
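For readers curious how LDA arrives at word clusters, here is a minimal sketch of a collapsed Gibbs sampler in plain Python. It is only an illustration of the general technique, not the tooling or settings used in the talk. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the four tiny "documents" are all invented for this example and have nothing to do with Dr. Jockers’ corpus.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA over tokenized documents."""
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]  # times word w sits in topic k
    topic_total = [0] * num_topics                       # all tokens in topic k

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        doc_assign = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assign.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word] += 1
            topic_total[k] += 1
        assignments.append(doc_assign)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Take the token's current assignment out of the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][word] -= 1
                topic_total[k] -= 1

                # Resample: (how much the doc uses topic t) * (how much topic t uses the word).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][word] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][word] += 1
                topic_total[k] += 1

    return topic_word, doc_topic


# Invented toy documents (hypothetical, for illustration only).
docs = [
    "cottage peasant hearth village cottage peasant".split(),
    "regiment victory battle general victory battle".split(),
    "village hearth cottage peasant village".split(),
    "battle general regiment victory regiment".split(),
]

topic_word, doc_topic = lda_gibbs(docs)
for k, counts in enumerate(topic_word):
    print(k, [word for word, count in counts.most_common(4) if count > 0])
```

Each pass removes one word from the counts and reassigns it to a topic based on which topics its document already favors and which topics already favor that word. Words that keep appearing together drift into the same cluster, which is the intuition behind reading a topic as a theme. On a real corpus one would reach for an established library instead of a hand-rolled sampler.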

The greatest lessons I took from Dr. Jockers’ talk, aside from the fact that he enjoys food-related analogies and metaphors when it comes to text analysis (or anything, for that matter), are 1) that text analysis through topic modeling can further confirm what one already knows and 2) that the process facilitates the discovery of new categories of analysis and research questions. For instance, Dr. Jockers found that Irish-Americans were less sympathetic toward free Blacks in the nineteenth century, largely because of competition for employment, while Irish people in Ireland related to them, probably because of a shared understanding of oppression!