 

The Benefits of the Topic Modeling Tool: An Analysis of the Motives for Immigration

Page history last edited by Evelyn Ramirez-Mancilla 9 years, 4 months ago

Evelyn Ramirez-Mancilla

Professor Alan Liu

English 149

December 15, 2014

 


 

     Digital humanities is a field of research that brings technology to the scholarly work done in humanities departments. In the book Digital_Humanities, authors Anne Burdick, Johanna Drucker, Peter Lunenfeld, Todd Presner, and Jeffrey Schnapp write that “[Digital Humanities] asks what it means to be a human being in the networked information age and to participate in fluid communities of practice, asking and answering research questions that cannot be reduced to a single genre, medium, discipline, or institution. [...] It is a global, trans-historical, and trans media approach to knowledge and meaning-making” (vii). In an attempt to improve knowledge of digital humanities tools, undergraduates Evelyn Ramirez-Mancilla and Julissa Villatoro have designed the latest digital humanities research project at the University of California, Santa Barbara. The project, Topic Modeling Tool Analysis (TMTA): Motives for Immigration, explores the benefits of using the Topic Modeling Tool in research.

     A major portion of the project is understanding the Topic Modeling Tool. The tool is a system that uses the Java program MALLET to gather clusters of words in a collection of text. The clusters represent the topics of the text. The tool displays these topics according to the number of topics the user enters. If the user enters ten, the tool returns a model with ten rows. Each row “can be understood as a collection of words that have different probabilities of appearance in passages discussing the topic” (Underwood, 1). While becoming familiar with the tool, the project group realized that each member read the topics it produced differently. Since interpretation played such a large role in understanding the data from the tool, the team set out to turn this discovery into a benefit for research projects like this one. As an English and Communications major, Ramirez-Mancilla is constantly analyzing text and was aware that biases can become barriers to extracting information. With this in mind, the team members wanted to choose a research topic that would generate opinions and prejudices immediately. The project group decided to focus on immigration because it has been widely discussed in the media and in print over the last few years. The project group assumed that the undergraduates they would be presenting their information to would all have preconceived opinions on immigration. As a Chicana/o Studies major, Villatoro believed that the words “Motives for Immigration” would draw out stronger opinions based on these prejudices.
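The idea of choosing a number of topics and getting back one row of weighted words per topic can be sketched in a few lines of Python. This is not MALLET and not the Topic Modeling Tool itself. It is a minimal toy version of the same family of technique (latent Dirichlet allocation fit by collapsed Gibbs sampling), written only to show the shape of the output. The function name toy_topic_model, its parameters, and the sample passages are all assumptions made up for illustration.

```python
import random
from collections import Counter


def toy_topic_model(docs, num_topics, iterations=200, top_n=5,
                    alpha=0.1, beta=0.01, seed=0):
    """Return one row per topic: a list of (word, probability) pairs.

    docs is a list of tokenized passages (lists of words). All names and
    default values here are hypothetical, chosen for this sketch only.
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]       # topic counts per passage
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                     # total words per topic
    assignments = []                                   # topic of every token

    # Start by giving every word occurrence a random topic.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            topic = rng.randrange(num_topics)
            doc_assignments.append(topic)
            doc_topic[d][topic] += 1
            topic_word[topic][word] += 1
            topic_total[topic] += 1
        assignments.append(doc_assignments)

    # Repeatedly re-draw each token's topic given all the other tokens.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                old = assignments[d][i]
                doc_topic[d][old] -= 1
                topic_word[old][word] -= 1
                topic_total[old] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][word] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                new = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = new
                doc_topic[d][new] += 1
                topic_word[new][word] += 1
                topic_total[new] += 1

    # Build the rows: the top_n most probable words for each topic.
    rows = []
    for k in range(num_topics):
        denominator = topic_total[k] + beta * vocab_size
        ranked = sorted(vocab, key=lambda w: -topic_word[k][w])
        rows.append([(w, (topic_word[k][w] + beta) / denominator)
                     for w in ranked[:top_n]])
    return rows


# Hypothetical toy passages, not taken from the three books.
SAMPLE_PASSAGES = [
    "government soldiers orders village government soldiers".split(),
    "work poor wages farm work poor".split(),
    "soldiers village fear night fear soldiers".split(),
    "wages farm family work family poor".split(),
]

for number, row in enumerate(toy_topic_model(SAMPLE_PASSAGES, num_topics=3), 1):
    print(number, [word for word, _ in row])
```

Asking for three topics returns three rows, just as asking the tool for ten returns ten. Which words land in which row depends on the random starting point, which is one reason the rows still have to be interpreted by a reader.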

     The data entered into the Topic Modeling Tool came from three books written by Hispanic authors. The nonfiction texts Amigas: Letters of Friendship and Exile by Marjorie Agosin and Emma Sepulveda, December Sky: Beyond my Undocumented Life by Evelyn Cortes, and I, Rigoberta Menchú: An Indian Woman in Guatemala by Rigoberta Menchú all provide historical background on the narrators’ countries of origin and also describe their lives after they immigrated to the United States. From each book, the project group extracted the chapter that described the moment when the narrator decided to immigrate. The team chose these parts of the narratives because they felt these chapters expressed the motives to immigrate best. As a result, for each data set created from these chapters, the team members/interpreters had to answer the question: what were the motives that caused each narrator to immigrate?

     The group decided to choose three motives that the words produced by the Topic Modeling Tool could be associated with. The categories were Violence, Political/Military, and Economic. Each interpreter had to circle in orange any word in the Topic Modeling Tool data they felt was associated with Violence. If the word was associated with a Political/Military motive it was circled in red, and if it was Economic it was circled in green. In addition to these categories, the group added a category for Semantics in yellow. The purpose of the Semantics category was to identify the descriptive words in the Topic Modeling Tool data. Once these words were identified, the team members had to associate them with the three categories by adding a line in the color of the motive the interpreter felt the word most clearly expressed. This allowed each team member to create a visual out of the Topic Modeling Tool data to support their answer to the research question.
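The color scheme above can also be written down as a small data structure, which makes it easy to compare two interpreters' visuals side by side. The following is only a sketch under stated assumptions: the group worked by circling words by hand, and counting circled words to name a dominant motive is a simplification introduced here, not the group's stated method. The function dominant_motive and the variable names are hypothetical. The circled words are the ones quoted in the essay's discussion of I, Rigoberta Menchú, and the word "afraid" is an invented example of a Semantics word.

```python
from collections import Counter

# Colors for the three motives, as described in the essay.
MOTIVE_COLORS = {
    "Violence": "orange",
    "Political/Military": "red",
    "Economic": "green",
}
SEMANTICS_COLOR = "yellow"  # descriptive words, later linked to a motive


def dominant_motive(circled, semantic_links=None):
    """Tally an interpreter's annotations (hypothetical helper).

    circled maps a word to the motive whose color it was circled in.
    semantic_links maps a yellow (Semantics) word to the motive its
    connecting line was drawn in. Returns (most frequent motive, tally).
    """
    motives = list(circled.values()) + list((semantic_links or {}).values())
    tally = Counter()
    for motive in motives:
        if motive not in MOTIVE_COLORS:
            raise ValueError(f"unknown motive: {motive}")
        tally[motive] += 1
    top = tally.most_common(1)[0][0] if tally else None
    return top, tally


# Words the essay reports for each interpreter of the Menchú data.
NO_PRIOR_KNOWLEDGE = {"poor": "Economic", "orphan": "Economic", "work": "Economic"}
PRIOR_KNOWLEDGE = {
    "leftist": "Political/Military",
    "government": "Political/Military",
    "orders": "Political/Military",
}

for label, circled in [("no prior knowledge", NO_PRIOR_KNOWLEDGE),
                       ("prior knowledge", PRIOR_KNOWLEDGE)]:
    motive, _ = dominant_motive(circled)
    print(f"{label}: {motive} ({MOTIVE_COLORS[motive]})")

# An invented Semantics word linked to a motive by a colored line.
print(dominant_motive({"poor": "Economic"},
                      {"afraid": "Violence", "dark": "Violence"})[0])
```

A tally like this only summarizes a visual. It does not settle which reading is right, which is why the group pairs the visuals with discussion between the two interpreters.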

     In addition, since the Topic Modeling Tool develops data based on patterns, the project group decided to analyze these patterns through two lenses in order to challenge the prejudices prompted by the question. First, a person who had read the text and knew its historical background interpreted the data. Then someone who knew nothing more than the title interpreted the data. Through these two interpretations the project group was able to see how prejudices became prevalent in the explanations given by the interpreter with no prior knowledge of the text. The project group used the two interpretations to challenge the validity of each. The no-knowledge interpretation made the interpreter who knew the text question whether they truly understood the motives for the narrator’s immigration or were just fitting the narrative to a historical timeline. The interpretation informed by the text challenged the other interpretation by questioning whether the data was just being fitted to a strong opinion. Through this discussion the project group is attempting to produce an accurate analysis of text and data by testing the interpretations against the unbiased patterns developed by the Topic Modeling Tool.

     In the discussion of the Topic Modeling Tool data for I, Rigoberta Menchú: An Indian Woman in Guatemala, the two interpretations offered completely different rationales for Rigoberta’s decision to leave Guatemala. The visual created by the team member with no knowledge of the text argued that Rigoberta’s dominant motive for immigration was Economic. In the data, this team member circled words like “poor”, “orphan”, and “work”. They made the argument that because Rigoberta was part of a lower social class (as an orphan) she lived in extreme poverty and immigrated in search of a better life. Meanwhile, the team member with prior knowledge argued that Rigoberta left for Political/Military reasons. This person identified words like “leftist”, “government”, and “orders”. This person was aware that Guatemala was in a civil war at the time and that Rigoberta was caught in the middle of it. Ideally, the next phase of this project would be for the members who have not read the text to look for scholarly sources that historically support the motives they feel caused these women to immigrate. The other team member would reread the text with the other person’s proposed motive in mind. Afterwards, the group would see whether the new information gathered truly supports either motive. The project group hypothesizes that a person with no knowledge of the text will rarely have the more accurate interpretation of the text or research question. In this case, the person who believed Rigoberta left for economic reasons was unaware of the context around the word “orphan”. Through the second discussion they became aware that Rigoberta became an orphan because the political system was not content with her involvement and killed her family. Having each person reevaluate their answers through this process is beneficial to any research team because it allows people to truly see why their answer is not as plausible. The research group feels that this will decrease the number of problems in large groups because it eliminates the belief that “my answer is wrong just because she does not like it or does not care to look at the text from this perspective”.

     Overall, the methodology of this project is meant to help research groups and scholars in general develop strong arguments for their analysis of a text. The group chose the motives for immigration as a topic of interest that it could expand on given more time. This topic was picked because it is controversial and its controversy can be reflected in the discussion of a text that deals with it. The Topic Modeling Tool was a medium that promoted positive discussion on this topic because it allowed each member to challenge his or her own response instead of relying on the person with the most expertise. This welcomes the varying levels of knowledge in any group. It keeps the group productive and eliminates feelings of inferiority because each member is responsible for justifying and reevaluating his or her answers. This is often difficult to do when the focus is only on a close reading of the text. The Topic Modeling Tool is also beneficial because it strengthens the argument for any research answer by allowing the research team to identify early on the ways their answer may be challenged. In the context of the research topic, the group concluded that discovering the true reasons behind each narrator’s decision to immigrate is extremely difficult, but the way this project group used the Topic Modeling Tool allowed them to come to a conclusion they feel is strong. Ultimately, each narrator left her country of origin for a combination of all the motives, specifically the fear of a violent death and poverty inflicted by a political entity.

 

 

 
