Artificial Intelligence in Qualitative Research: Opportunities, Limitations, and Ethical Issues
Introduction
In recent years, artificial intelligence has developed rapidly, and its integration into academic research has expanded alongside it. AI tools offer researchers new opportunities for data collection, organization, and interpretation. The proliferation of widely accessible platforms such as OpenAI's ChatGPT, Google's Gemini, and Anthropic's Claude, alongside research-specific software like ATLAS.ti or NVivo, has significantly lowered the technical barriers to AI adoption within research communities. According to a 2023 Nature survey, approximately 80% of researchers used AI-based tools during at least one stage of their research process. A 2025 analysis showed that the use of Large Language Models (LLMs) in survey-based academic research (across a sample of 189 studies) rose from 1.6% in 2022 to 59% in 2024. At the same time, the pace of AI advancement has accelerated, making artificial intelligence analytically potent and economically accessible on a scale that researchers could not have anticipated a few years ago. Such swift evolution demands real-time engagement from the academic community and adequate solutions to emerging challenges. This is difficult, however, because as AI progresses, current academic assessments of its impact on various fields may become outdated within a short period. Nevertheless, this document attempts to synthesize the most current knowledge to identify the primary effects that artificial intelligence has had on qualitative research to date.
Defining the essence of qualitative research and the debates on AI integration
Qualitative research encompasses a broad family of methodological traditions, including grounded theory, phenomenology, ethnography, discourse analysis, narrative research, and case studies, all united by a commitment to deeply understanding meanings, experiences, and social phenomena from the participants’ perspectives. The epistemological foundations of qualitative research prioritize interpretive depth, contextual sensitivity, researcher reflexivity, and the co-construction of knowledge between researchers and participants. In this tradition, validity is understood not as statistical reliability but in terms of depth of understanding, credibility, transferability, and confirmability. It is in this context that AI integration becomes a matter of debate. Proponents argue that AI tools can handle the procedural demands of qualitative research - transcribing interviews, organizing large textual datasets, performing preliminary coding, and identifying potential themes - thereby freeing researchers from routine yet painstaking work and allowing them to focus on deeper, interpretive tasks. Critics counter that the defining characteristics of quality in qualitative research - nuanced sensitivity to language, positionality, implicit meaning, and iterative, reflexive engagement with data - are precisely the features that most resist computation and therefore require human intellectual effort.
The most widely accepted applications of AI in qualitative research today
Automated transcription
Transcribing audio and video recordings into text is one of the most labor-intensive tasks in qualitative research. AI-based Automatic Speech Recognition (ASR) systems, including Otter.ai, Whisper (OpenAI), Sonix, and Microsoft Azure Speech Services, have significantly reduced the time and cost required to transcribe in-depth interviews or focus groups, leaving the researcher more time for analytical activities. Modern ASR systems built on deep neural networks demonstrate high levels of accuracy on high-quality recordings of standard dialects. The efficiency gain is easy to illustrate: manual transcription of a 60-minute interview used to take approximately 6 to 8 hours, whereas a tool such as ChatGPT reportedly completes the process in seconds. ChatGPT is also effective for cleaning raw text derived from audio recordings and eliminating grammatical or stylistic errors. However, challenges persist. ASR performance decreases with low-quality audio, multiple speakers talking simultaneously, pronounced accents, non-standard dialects, and specialized terminology. Furthermore, the accuracy of automated transcripts depends heavily on the language; according to various studies, English-language interviews show the best results. Automated transcription can therefore be regarded as still maturing, and researchers must exercise caution because models are prone to “ennobling” (polishing) the text, which alters the original form of the data. Moreover, transcribing interviews in qualitative research is not merely a mechanical process: close attention must be paid to pauses, tone of speech, non-verbal sounds, and other elements that are themselves objects of analytical observation and require human engagement.
Thus, when discussing automated transcription, academic literature emphasizes that AI-generated transcripts or quotes require constant oversight and verification by the researcher.
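One way to support the verification described above is to compare an automated transcript against a human-checked reference for a sample of the recording. The sketch below computes word error rate (WER), a commonly used ASR measure; it is an illustration rather than a method prescribed by the reviewed literature. The function name and the two sample sentences are hypothetical, and the text is split on whitespace only, so punctuation would need to be stripped beforehand.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] is the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: the automated output drops a filler word
# and "polishes" two colloquial forms.
human_checked = "um I was gonna say it felt kinda unfair"
automated = "I was going to say it felt kind of unfair"
print(f"WER: {word_error_rate(human_checked, automated):.2f}")
```

A high WER on a spot-checked sample signals that the full transcript needs closer manual review. A low WER does not show that pauses, tone, or non-verbal sounds were preserved, since those are not captured by word-level comparison.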
Thematic analysis
Thematic analysis is one of the most time-consuming methods in qualitative research. According to various studies, it is not only time-demanding but also carries risks of cognitive fatigue and subjectivity for the researcher. Studies report a radical difference in efficiency: while manual processing of six interviews takes approximately 38 hours, the same work is completed in 2.5 hours using AI-based platforms (e.g., Avidnote), roughly a fifteenfold reduction. This allows the researcher to dedicate time to more in-depth interpretation of the data. Large Language Model-Assisted Thematic Analysis (LATA) involves AI engagement at all stages of the process. The first stage is data familiarization and coding: ChatGPT and similar models effectively clean transcripts and generate initial codes. Notably, artificial intelligence shows better results with a deductive approach, in which a theoretical framework is provided to it in advance. AI also assists the researcher in finding connections between codes and building hierarchical structures. Models such as GPT-4 are capable of identifying high-quality, meaningful, and contextually appropriate themes that often align with the insights of experienced researchers.
Despite this efficiency, the literature highlights several critical challenges. One is validity: only a small portion of AI-generated codes (approximately 27%) matches manual human analysis, suggesting that AI cannot yet grasp complex metaphors and cultural nuances. Another is data distortion: there is a risk of so-called “hallucinations”, where the model produces convincing but non-existent quotes. Additionally, a so-called “politeness effect” is observed, where the AI “cleans up” the respondent’s language during transcription by removing jargon and emotional context, thereby erasing the authentic emotion of the original.
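A figure such as the share of AI-generated codes that match human coding can be approximated with a simple overlap calculation. The sketch below is illustrative only: the function names and sample codes are hypothetical, and it treats two codes as matching only when their normalized labels are identical. That is a crude stand-in for the human judgment of semantic equivalence that such comparisons normally require.

```python
def normalize(code: str) -> str:
    """Lowercase a code label and collapse repeated whitespace."""
    return " ".join(code.lower().split())


def code_agreement(ai_codes, human_codes):
    """Compare two code lists by exact match on normalized labels."""
    ai = {normalize(c) for c in ai_codes}
    human = {normalize(c) for c in human_codes}
    shared = ai & human
    return {
        "shared": sorted(shared),
        "ai_only": sorted(ai - human),
        "human_only": sorted(human - ai),
        # Fraction of distinct AI codes that also appear in the human set.
        "share_of_ai_codes_matched": len(shared) / len(ai) if ai else 0.0,
    }


# Hypothetical code labels for illustration.
ai_codes = ["Work stress", "Time pressure", "Family support", "Career goals"]
human_codes = ["work stress", "Guilt about absence", "Silence as resistance"]

result = code_agreement(ai_codes, human_codes)
print(result["share_of_ai_codes_matched"])
print(result["human_only"])
```

The `human_only` list is often the more informative output, because it shows which interpretive codes the model did not produce at all.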
The use of artificial intelligence in qualitative research requires a new competency from the researcher, specifically “AI-reflexivity”. This involves a critical understanding of how the algorithm influenced the formation of themes. From an ethical standpoint, the primary task is data de-identification to ensure that sensitive information does not enter the databases of artificial intelligence platforms.
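The de-identification step mentioned above can be sketched as a pre-processing pass that runs locally before any text is sent to an external platform. The example below is a minimal illustration under stated assumptions: the regular expressions are simplified, the placeholder format is invented for this sketch, and the list of participant names is assumed to be supplied by the researcher. Pattern-based redaction misses indirect identifiers (places, job titles, rare events), so it does not replace manual review.

```python
import re

# Simplified patterns, assumed for illustration; real data needs broader rules.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def deidentify(text: str, known_names: list[str]) -> str:
    """Replace e-mail addresses, phone-like numbers, and listed names."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    for index, name in enumerate(known_names, start=1):
        pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        # Each listed name gets a stable placeholder, so references to the
        # same person stay linked within the transcript.
        text = pattern.sub(f"[PERSON_{index}]", text)
    return text


# Hypothetical transcript fragment.
raw = "Call Maria Lopez at +1 (555) 010-2030 or maria.lopez@example.org."
print(deidentify(raw, ["Maria Lopez"]))
```

Keeping the mapping from placeholders to real names in a separate, access-controlled file lets the researcher re-link quotes later without exposing identities to the AI platform.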
Based on the reviewed literature, a consensus emerges that artificial intelligence should not be used as an independent analyst. The best results are achieved through a hybrid approach, in which it performs the “technical” task of data organization and initial coding, while the main role remains with the human researcher, who ultimately interprets the data and gives substantive depth to the research.
Systematic literature review and evidence synthesis
The growth in the volume of academic literature has become a serious barrier for contemporary researchers, as manual screening of thousands of articles demands immense time and resources. In response to this challenge, artificial intelligence is increasingly used in systematic and large-scale literature reviews. Tools such as Rayyan, Covidence, Elicit, and Research Rabbit utilize machine learning algorithms to screen titles and abstracts, as well as to extract core data from various articles. Unlike traditional searches based solely on keywords, AI tools employ contextual and semantic analysis, which sharply increases the accuracy of discovering relevant material. Artificial intelligence-based algorithms can rapidly scan the abstracts of thousands of articles and remove duplicates, automatically performing the initial screening based on inclusion-exclusion criteria established by the researcher.
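The initial screening pass described above (duplicate removal followed by inclusion and exclusion criteria) can be sketched in a few lines. This example deliberately uses plain keyword matching, not the semantic or machine-learning methods that tools such as Rayyan or Elicit employ. The record fields, function names, and criteria are hypothetical and serve only to show the shape of the workflow.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase a title and reduce punctuation to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def screen(records, include_terms, exclude_terms):
    """Drop duplicate titles, then apply exclusion and inclusion terms."""
    seen = set()
    kept = []
    for record in records:
        key = normalize_title(record["title"])
        if key in seen:
            continue  # duplicate of an earlier record
        seen.add(key)
        text = f"{record['title']} {record['abstract']}".lower()
        if any(term in text for term in exclude_terms):
            continue  # an exclusion criterion matched
        if any(term in text for term in include_terms):
            kept.append(record)
    return kept


# Hypothetical records for illustration.
records = [
    {"title": "AI-Assisted Coding of Interview Data",
     "abstract": "A qualitative study of thematic analysis."},
    {"title": "AI assisted coding of interview data.",
     "abstract": "Duplicate entry from another database."},
    {"title": "Deep Learning for Protein Folding",
     "abstract": "A quantitative benchmark."},
    {"title": "Thematic Analysis in Animal Studies",
     "abstract": "A qualitative study using an animal model."},
]

for record in screen(records, ["qualitative", "thematic analysis"], ["animal model"]):
    print(record["title"])
```

Because every decision here follows an explicit rule, the result is fully reproducible. That transparency is what is lost when the keyword rules are replaced by an opaque model.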
A 2025 study in “JMIR Formative Research” examined how artificial intelligence models, including ChatGPT-3.5, GPT-4, Claude 3.5, Claude 3 Opus, and Sonar Huge, apply structured frameworks for assessing the quality of qualitative health research. The study showed that human-AI collaborative workflows, which leverage the efficiency of AI while preserving human expertise and interpretation, represent an effective and responsible way to use AI in research.
Despite its high efficiency, the literature critically evaluates the dangers of relying entirely on artificial intelligence. The primary challenge remains the opacity of algorithms, known in scientific discourse as the “black-box” effect; it is often unknown to the researcher what logic the system used to prioritize or exclude certain evidence, which contradicts the principles of reproducibility and transparency in systematic reviews.
Furthermore, there is a risk of algorithmic bias, where AI might miss nuanced qualitative context or prioritize only dominant scientific viewpoints. Because of this, the development of a new type of competency (AI literacy) is on the agenda of academic circles today so that researchers can critically evaluate the validity of these tools.
Artificial Intelligence as a human-assisting rather than human-replacing tool: a summary assessment
Across all the stages of qualitative research discussed above, the academic literature consistently identifies artificial intelligence as best suited to an assisting role. This consensus is well reflected in the distinction between “cognitive augmentation” (the enhancement of human capabilities by AI) and “cognitive substitution” (the replacement of human judgment by AI). According to Cook et al. (2025), AI tools showed significant efficiency in summarizing texts and counting keywords, but had not yet achieved satisfactory results in thematic analysis, theme discovery, or cross-thematic analysis. In an analysis of 21 studies, De Souza Santos et al. (2025) reached a similar conclusion: artificial intelligence is promising for supporting qualitative analysis, but human expertise remains decisive for data interpretation.
Perspectives of leading universities and research institutes, publisher policies
Since 2023, leading research universities have shifted from reactive to proactive stances regarding the use of artificial intelligence in research. The dominant institutional position supports the use of AI, provided there are clear requirements for human oversight, transparency, and critical evaluation. In an analysis published in New Directions for Adult and Continuing Education, Azevedo et al. found that AI policies in higher education are grounded in the principles of academic integrity, transparency, and equity. These policies demand that AI tools do not conflict with pedagogical or research objectives and obligate researchers to openly disclose their use of AI.
Harvard University
The Faculty of Arts and Sciences and the Harvard Office of Research have published guidelines on the use of artificial intelligence in research. Harvard recognizes the transformative potential of AI tools for research productivity but requires researchers to take responsibility for disclosing all results obtained with the assistance of AI. Harvard's guidelines specifically address qualitative research contexts, mandating that AI-generated codes, themes, or any other outputs be treated as preliminary results that require human interpretive engagement. Furthermore, this process must be explicitly described in the research methodology sections. Harvard University's Institutional Review Board (IRB) has also updated its protocols to account for the use of AI in analyzing data related to research participants/respondents. This requires an assessment of whether AI tools create privacy risks or analytical biases that could harm the interests of the participants.
University of Oxford
In November 2024, the University of Oxford published a new ethical framework in Nature Machine Intelligence, outlining three essential criteria for the responsible use of LLMs in academic research: human verification to guarantee accuracy and academic integrity; ensuring a meaningful human contribution; and the proper disclosure of LLM usage. Oxford’s research ethics guidelines require the explicit disclosure of AI tools used at any stage of the research process. Concurrently, the university has assembled interdisciplinary working groups to develop discipline-specific guidelines for qualitative research in the social sciences.
Massachusetts Institute of Technology (MIT)
MIT's approach reflects its quantitative, computationally driven orientation in research, yet it also maintains a firm stance on the application of AI in qualitative research. The 2023 report “Responsible AI for Science” by MIT’s Schwarzman College of Computing emphasizes the importance of preserving human interpretive authority in humanities and social science research over AI-generated outputs. MIT’s IRB guidelines require detailed documentation of AI tool usage in all federally funded research involving human subjects data.
University of Melbourne and the Australian Research Context
The University of Melbourne’s 2024 guide, “Artificial Intelligence in Research”, explicitly states that AI tools should not be used to replace a researcher’s judgment in interpretive analyses and that any use of AI in qualitative data analysis must be transparently reflected and methodologically justified. The Australian Research Council (ARC, 2024) has updated its grant application requirements to include mandatory disclosure of AI usage within research methodologies.
Publisher Policies
An analysis conducted in 2025 of the AI policies of the top 10 academic journal publishers revealed that all publishers agree that AI tools cannot be credited as authors. Moreover, the core themes across all publisher guidelines center on author accountability, human oversight, transparency in AI disclosure, and concerns regarding bias, quality, and the protection of intellectual property rights. The Chicago Manual of Style (18th edition, 2024) and the Modern Language Association have updated their guidelines to include requirements for citing and critically evaluating AI tools. This reflects a growing recognition that the use of AI in research demands standardized reporting practices, analogous to those regulating other methodological tools.
Ethical considerations
Transparency and accountability
The issue of transparency in AI-assisted qualitative research is both procedural and epistemological. Following a review of 130 articles (64 of which were published in 2023–2024), Cook et al. (2025) discovered that reporting standards for AI usage in qualitative data analysis across published studies are inconsistent and insufficient. In a paper published in the journal “AI and Ethics”, Resnik and Hosseini (2024/2025) propose nine recommendations for the responsible use of artificial intelligence in scientific research. The first two relate directly to qualitative practices: researchers are responsible for identifying, describing, mitigating, and controlling AI-related biases and accidental errors; and researchers must disclose, describe, and explain their method of using AI in language accessible to non-experts. These recommendations are increasingly being adopted by academic journals.
At the epistemological level, transparency is complicated by the opacity of modern LLMs. Unlike rule-based Computer-Assisted Qualitative Data Analysis Software (CAQDAS) tools, whose underlying logic can, in principle, be fully audited, LLMs operate through billions of parameters whose individual contributions cannot be explained. Yet the principle of explainability is central to the EU Artificial Intelligence Act and is particularly important in the context of qualitative research.
Bias and reliability
The literature concerning bias in artificial intelligence systems is extensive and increasingly relevant. LLMs are trained on massive, predominantly English-language, Western-centric corpora that reflect the demographics, cultural assumptions, and historical power dynamics of their primary data sources. Chatzichristos (2025) argues that structural inequalities in AI access also risk reinforcing existing academic disparities, as well-resourced institutions are better positioned to implement and critically evaluate AI tools.
Reliability poses a further challenge for LLMs. The fundamental non-reproducibility of AI outputs remains an unresolved core ethical issue, creating accountability gaps that are particularly problematic in qualitative research, where analytical processes serve as the primary object of methodological scrutiny.
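The non-reproducibility described above can be made visible by running the same coding prompt several times and measuring how much the resulting code sets overlap. The sketch below uses mean pairwise Jaccard similarity as one possible stability measure. This choice, the function names, and the sample runs are assumptions made for illustration, not a procedure taken from the reviewed literature.

```python
from itertools import combinations


def jaccard(a, b) -> float:
    """Size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(runs) -> float:
    """Average Jaccard similarity over every pair of runs."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("at least two runs are required")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical code sets returned by three runs of the same prompt.
runs = [
    {"work stress", "time pressure", "family support"},
    {"work stress", "time pressure"},
    {"work stress", "time pressure", "family support"},
]
print(f"Stability: {mean_pairwise_jaccard(runs):.2f}")
```

A value of 1.0 would mean every run produced the same codes. Reporting a figure of this kind alongside the prompt and model version gives readers a concrete basis for judging how far the AI-assisted step can be repeated.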
Authorship and intellectual contribution
Major academic publishers, including Elsevier, Springer Nature, Taylor & Francis, and Sage, updated their policies between 2023 and 2025 to state that AI systems cannot be listed as authors. This is driven by the logic that authorship inherently entails accountability, which artificial intelligence lacks. A 2025 policy analysis showed that author accountability, human oversight, and transparency are the three most consistently emphasized themes in publishers’ AI policies. All publishers agree that AI cannot be considered an author and that ultimate responsibility for the accuracy and originality of research rests with human authors. However, these policies fail to resolve the question of intellectual contribution when AI systems generate a significant portion of the analytical content. When an AI system produces the codes and themes that form the core findings of a study, and the human researcher’s contribution is largely limited to oversight and editing, it remains debatable whether the epistemological foundation of qualitative inquiry is truly preserved.
Data privacy and consent
The use of commercial AI platforms to process qualitative research data raises acute privacy concerns. Qualitative data, particularly interview transcripts, frequently contain sensitive, identifiable information about participants. When researchers upload such data to cloud-based AI platforms, they may violate the terms of participants’ informed consent, and the data may be utilized for training language models. The EU Artificial Intelligence Act establishes risk-based classifications for AI systems, which directly impact research applications involving personal data.
Azevedo et al. (2025) note that AI policies in higher education are under growing pressure to address privacy risks, including data scraping and surveillance, which many current frameworks overlook by focusing primarily on instructional use. This gap between privacy principles and practice is especially critical in qualitative research, where data sensitivity is high and the vulnerability of participants is often pronounced.
AI in research: research transformation or augmentation?
A vital question in contemporary academia is whether artificial intelligence is transforming qualitative research (by altering its fundamental epistemological commitments) or simply augmenting it by enhancing procedural efficiency while leaving core interpretive practices intact. The evidence reviewed in this document supports a qualified augmentation thesis. At the procedural level, AI is clearly altering qualitative research practices: AI-driven interview transcription, literature screening, and AI-enhanced CAQDAS features are widespread and growing in capability. At the epistemological level, however, claims of transformation remain unsupported. According to the most comprehensive recent empirical assessment, by Cook et al. (2025), AI consistently struggled with tasks that form the epistemological core of qualitative research: thematic analysis, generating cross-thematic insights, and contextual interpretation. De Souza Santos et al. (2025), after synthesizing 21 studies, concluded that human expertise remains decisive for data interpretation. A researchers’ workshop at ISERN 2025 independently reached the same conclusion (Couto Teixeira et al., 2025). Chatzichristos (2025) adds a provocative dimension to this debate: the question is not only what AI can or cannot do, but how its adoption affects research cultures. If the interaction between AI and qualitative research is driven by structural forces - such as resource allocation, academic productivity pressures, and institutional incentives - then the augmentation/transformation distinction may itself be insufficient. AI may neither change nor merely supplement qualitative methodology; rather, it may gradually transform the institutional conditions under which qualitative research is conducted.
Tensions between efficiency and epistemological depth
In the context of qualitative research, a perpetual tension exists between efficiency and epistemological depth. While efficiency is clearly enhanced by AI tools, this shift can negatively impact the depth of the research. The rapid pace of AI development makes it technically feasible to conduct large-scale qualitative analyses within compressed timeframes. However, it simultaneously makes it harder to maintain the slow, iterative, and reflexive engagement with data that ultimately yields true interpretive insight. Duke (2025) warns that normalizing AI-driven analytical tools at a discipline-wide level carries the risk of degrading the execution quality of qualitative research.
Conclusion
Based on the publications reviewed above, three main conclusions can be drawn regarding the integration of artificial intelligence into qualitative research. First, AI tools clearly enhance the procedural efficiency of qualitative research - particularly in terms of transcription, data management, structured coding, and literature screening. These benefits are substantial, especially considering that the costs of accessing AI have dropped sharply since 2022. Second, the epistemological core of qualitative research - reflexive, theory-informed interpretation that focuses on context, meaning, and researcher positionality - remains a distinctly human capability.
Third, and as a primary conclusion, institutional and ethical frameworks regarding AI usage are evolving rapidly. These principles largely align around transparency, the honest disclosure of AI use, and human accountability. Nonetheless, a significant gap remains between normative frameworks and published research (actual practice), which demands appropriate attention from journal editorial boards.
For the complete document, including relevant sources, links, and explanations, please see the attached file.