Artificial Intelligence (AI) has transformed academic research, especially in the past few years since user-facing AI became publicly available. Today, numerous AI technologies support scholars across all fields and at every stage of the research process. Researchers worldwide, with varying levels of access to these tools, use AI to assist with a range of academic tasks, from literature review to data analysis. Thanks to such automated methods, demanding activities that once required long hours in the library can now be completed in a few clicks.

User-friendly large language models (LLMs) with chat interfaces, such as ChatGPT, have taken academia by storm. LLM-powered chatbots have made complex generative AI accessible to many people, revolutionizing how scholars conduct research. These systems are used as research assistants, performing complex tasks like academic writing, brainstorming ideas, providing feedback, or suggesting methodologies. The use of LLM chatbots in academia has reached the point where ChatGPT has been credited as a co-author on research papers, sparking controversy and leading some academic publishers to establish guidelines for the ethical use of AI in scientific publications, while others have restricted its use altogether.

There is no doubt about AI’s potential to transform the way we conduct research. Precisely because of this, it is essential to ask how, and to what extent, we should use AI for academic work. These tools can improve researchers’ efficiency, but they also carry risks, particularly the danger of overestimating the veracity and impartiality of AI-generated information. Not all AI-produced outputs are reliable: these systems can sometimes “hallucinate”, producing false or misleading information. Without proper human oversight, over-reliance on AI outputs can lead to inaccurate or fraudulent results, ultimately undermining trust in scientific research and eroding researchers’ critical thinking skills.

While generative AI’s use in scientific research continues to be debated, I want to emphasize the potential that LLMs hold for the social sciences, particularly for text analysis, drawing on my own research experience.

AI for Social Research: Incorporating LLMs for Text Analysis

LLMs are advanced AI systems designed to recognize and generate human language. These models are trained on vast amounts of text to learn patterns and relationships in natural language. As a result, LLMs can perform language processing tasks such as summarizing texts, answering questions, and holding conversations.

Given their sophisticated ability to comprehend language, LLMs are an advantageous tool for text analysis, a research method that studies written or spoken language to uncover underlying meanings and interpret content. Traditionally, text analysis has relied heavily on manual work, with researchers reading through texts to label and organize information by hand before drawing inferences. Existing specialized software can support parts of this process through semi-automated tools designed to manage large volumes of text, count words, detect common themes, or create visual summaries. However, these tools remain limited in grasping the deeper meaning of language, leaving tasks that require complex semantic understanding still largely dependent on human interpretation.

This is where LLMs’ capacity for reliable semantic analysis stands out. Their ability to quickly process large amounts of text makes them a promising and cost-effective tool for text analysis. LLMs excel at tasks like detecting emotions, identifying opinions, recognizing hate speech, and detecting topics, among others. They can perform these tasks accurately “from scratch”, in what is called a “zero-shot setting”: they need no pre-labeled examples or training data to understand and classify text. Without LLMs, these analyses would mostly be carried out manually by human annotators labeling text into specific categories.
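To make the idea concrete, here is a minimal sketch of what zero-shot emotion classification can look like in code. It assumes the openai Python client and an API key; the model name, candidate labels, and example text are placeholders for illustration, not a recommended setup.

```python
# Minimal zero-shot classification sketch.
# Assumptions: the `openai` Python client is installed, an API key is set in the
# environment, and "gpt-4o" is used purely as an illustrative model name.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

text = "The new policy left many families without access to childcare."
labels = ["anger", "sadness", "fear", "joy", "neutral"]

prompt = (
    "Classify the emotion expressed in the following text. "
    f"Answer with exactly one of: {', '.join(labels)}.\n\n"
    f"Text: {text}"
)

response = client.chat.completions.create(
    model="gpt-4o",   # illustrative model name
    temperature=0,    # reduce randomness for more consistent labels
    messages=[{"role": "user", "content": prompt}],
)

print(response.choices[0].message.content)  # e.g. "sadness"
```

No labeled training data is involved: the category definitions live entirely in the prompt, which is what the “zero-shot” setting refers to.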

Thanks to their refined semantic understanding, LLMs can follow specific short instructions, or “prompts”, to answer targeted questions about text content, making them well suited to analytical tasks like text classification and categorization. For example, for my doctoral research, I used an LLM to perform simple yes/no coding of newspaper articles reporting cases of gender-based violence, asking whether each article explicitly mentioned the perpetrator. While crafting clear and effective prompts requires a degree of expertise (known as “prompt engineering”), in my experience designing prompts for data labeling is still more efficient than manually annotating texts. LLMs have shown high accuracy and reliability, often matching or even outperforming human annotators across a variety of tasks. They can also work in multiple high-resource languages, increasing their usefulness for broader research.
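The sketch below shows what this kind of yes/no coding can look like when scripted. It is not my exact research pipeline: the article texts are placeholders, the prompt wording is simplified, and the model name is only an example.

```python
# Sketch of prompt-based yes/no coding of news articles.
# Assumptions: the `openai` Python client, an API key in the environment,
# placeholder article texts, and a simplified version of the coding prompt.
from openai import OpenAI

client = OpenAI()

articles = [
    "Placeholder text of article 1 ...",
    "Placeholder text of article 2 ...",
]

instruction = (
    "You are coding newspaper articles about gender-based violence. "
    "Does the article explicitly mention the perpetrator? "
    "Answer with a single word: yes or no."
)

codes = []
for article in articles:
    response = client.chat.completions.create(
        model="gpt-4o",   # illustrative model name
        temperature=0,
        messages=[
            {"role": "system", "content": instruction},
            {"role": "user", "content": article},
        ],
    )
    answer = response.choices[0].message.content.strip().lower()
    codes.append("yes" if answer.startswith("yes") else "no")

print(codes)  # e.g. ['yes', 'no']
```

Keeping the model’s answers alongside article identifiers also makes it straightforward to compare them with human coding later on (see the validation point below).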

While LLMs’ performance will vary with the type of task and the quality of the prompt, overall they have the potential to make text analysis more efficient by reliably examining larger quantities of text in less time. To illustrate with a personal example: for my research, I manually classified 25 newspaper articles in about three hours, while GPT-4o completed the same task for 1,054 articles in just six hours, a throughput more than 20 times faster than mine.

Although LLMs offer powerful and versatile tools for text analysis and classification, there are key considerations to keep in mind when incorporating them into social research. These include, but are not limited to:

  • Human supervision and data validation: LLM-generated outputs should always be reviewed and validated by humans. For data validation, LLM results must be compared with human judgments to assess their level of agreement using reliability measures such as Krippendorff’s Alpha or Cohen’s Kappa (a validation sketch follows this list).
  • Limited applicability for low-resource languages: LLMs are usually trained on high-resource languages, for which digital data is abundant. Their use and performance in low-resource languages, where dataset availability is restricted, can therefore be weaker and more limited.
  • Prompt quality: The way instructions (prompts) are written greatly affects LLMs’ performance. Prompts should include clear and precise instructions and, ideally, illustrative examples (what is technically known as “few-shot learning”) to reduce ambiguity and bias in the outputs (a few-shot prompt sketch follows this list).
  • Privacy and data protection: When using LLMs, data often passes through servers in various locations worldwide, raising questions of data privacy and compliance with data protection laws. When analyzing sensitive texts, it is therefore essential to anonymize the data before submitting it.
  • Challenges with reproducibility: Because of constant model updates and their stochastic nature, LLMs do not always generate the same answers, even for identical input, which limits the ability to replicate and reproduce results. Researchers should therefore always report the LLM version used and the date of analysis.
  • Technical skills and costs: Using LLMs programmatically for text analysis requires programming knowledge and technical expertise. There are notable no-code and low-code text analysis tools with user-friendly interfaces, such as MonkeyLearn, which offers fully automated sentiment analysis, and Voyant Tools, which supports basic, exploratory content analysis. However, more complex functions like stance detection typically require training custom models with labeled data, and unlike LLMs, these tools do not support direct, interactive, open-ended classification; they rely on predefined categories and supervised learning. Costs vary with the type of model (open-source or paid services) and the amount of text analyzed.
  • Environmental impact: LLMs have a significant ecological footprint that should be taken into account, one that often sits uneasily with the ethos of social science research conducted in the interest of society at large.
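On the first point, here is a minimal sketch of how agreement between human and LLM labels can be quantified. It assumes scikit-learn and the krippendorff Python package are installed; the label lists are invented for illustration.

```python
# Sketch of validating LLM labels against human coding with reliability measures.
# Assumptions: scikit-learn and the `krippendorff` package are installed;
# the label lists below are invented for illustration.
from sklearn.metrics import cohen_kappa_score
import krippendorff

human_labels = ["yes", "no", "no", "yes", "yes", "no", "yes", "no"]
llm_labels   = ["yes", "no", "yes", "yes", "yes", "no", "yes", "no"]

# Cohen's Kappa: chance-corrected agreement between two coders.
kappa = cohen_kappa_score(human_labels, llm_labels)

# Krippendorff's Alpha expects a coders-by-items matrix of numeric codes.
reliability_data = [
    [1 if label == "yes" else 0 for label in human_labels],
    [1 if label == "yes" else 0 for label in llm_labels],
]
alpha = krippendorff.alpha(reliability_data=reliability_data,
                           level_of_measurement="nominal")

print(f"Cohen's Kappa: {kappa:.2f}, Krippendorff's Alpha: {alpha:.2f}")
```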
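And on the prompt-quality point, this is roughly how the earlier yes/no instruction could be turned into a few-shot prompt by adding short worked examples; the examples here are invented for illustration.

```python
# Sketch of a few-shot version of the yes/no coding prompt: the instruction is
# followed by invented worked examples before the article to be coded.
few_shot_prompt = """You are coding newspaper articles about gender-based violence.
Does the article explicitly mention the perpetrator? Answer: yes or no.

Example 1:
Article: "Police arrested the victim's former partner at the scene."
Answer: yes

Example 2:
Article: "A woman was found injured in her home; the case is under investigation."
Answer: no

Now code the following article:
Article: "{article_text}"
Answer:"""

print(few_shot_prompt.format(article_text="Placeholder article text ..."))
```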



AI in Social Sciences: A Balanced Approach to Responsible Use and Preserving Human Agency


AI is transforming how academic research is done, whether we like it or not. The use of AI in the social sciences must balance its impressive capabilities against honest academic practice, meaningful research, and situated human judgment. Researchers should have the freedom to decide if and how to use AI in their work, as long as they respect the principles of academic integrity and follow ethical guidelines for scientific research.

LLMs’ efficiency for text analysis can help researchers explore more questions and investigate topics in greater depth. However, AI-generated output alone is not the end goal for social scientists; it is a starting point for explaining broader social phenomena. AI’s lack of moral agency limits its ability to connect thoroughly and meaningfully with social realities. Consequently, while LLMs can be helpful supporting tools, they should never replace the researcher’s embodied and contextual perspective in critical interpretation or careful inference. Ultimately, if AI is to be used, it should complement, not replace, the rich, context-aware, and positional knowledge that is unique to human researchers.

When using AI, scholars must ensure its responsible application under strict human supervision, openly disclose and explain its use, and carefully review and validate its results. Academic institutions and publishers must establish clear AI guidelines. Scientific organizations should promote open dialogue among researchers to discuss AI’s impact on academia and to ensure that these technologies respond to the interests of science and not only to those of tech corporations.

Author: Fátima Ávila Acosta



Articles in the “Ideas from the Palaver Tree” collection were co-edited by Selamawit Engida Abdella and Dr. Fola Adeleke





