
ANALYTICAL CODING USING NVIVO: Qualitative Data Coding by a Team of Interdisciplinary Researchers

New Partnerships for Sustainability (NEPSUS) is a Tanzanian-Danish research project that involves fifteen researchers with different disciplinary backgrounds and expertise. This is a great strength but, at the same time, a challenge. It was therefore not without some concern that I took on the role of organizing a workshop aimed at coding qualitative data with NVivo (specifically NVivo 12). NVivo is software that supports the organization, coding and analysis of qualitative data. Analyzing qualitative data can be challenging for researchers who usually work with quantitative data.

Furthermore, NVivo has several features that can lead to a rather quantitative analysis. This is problematic if the qualitative data has not been collected in a statistically representative manner. A central challenge in designing the workshop was therefore to organize an NVivo coding process that ensured an inductive and qualitative approach to the analysis. Having now worked through the process, I would like to share my experiences, as they may be useful to other large, interdisciplinary research teams.

NEPSUS does not have the option of using NVivo for Teams because of unreliable internet access and cost. Where it is practical, however, NVivo for Teams allows participants to work simultaneously on a single NVivo project file on one server. In NEPSUS, we instead worked asynchronously on individual NVivo project files. We used analytical codes (explained later), not purely thematic codes. Furthermore, we only coded the sentences that we considered relevant to the analysis. This means that although every sentence is of course read as part of the coding process, not every sentence is necessarily coded. In my experience, this saves time, because coding thematically and coding every sentence can easily turn the analytical process into a mechanical one.

Why Coding?

The project includes various sources of quantitative and qualitative data, and it can be difficult for all members of the team to understand and benefit from all the data. Even though it is not an easy task to coordinate a coding process that involves more than ten researchers, it proved very useful. After coding, it is much easier for all members of a research group to find and use relevant qualitative data. Another advantage we identified was that preparing for coding led to important conversations among the team members about synergies, analysis, definitions and the interpretation of data.

Step 1: Organize and Clean the Qualitative Data

Before coding, all the qualitative data, including key informant interviews, focus group discussions, participant observations and secondary documents, need to be in their final form and well organized. This means:

  • All the qualitative data is organized in folders in a logical way that is clear to everyone who will be coding.

  • Interviews are transcribed as correctly as possible.

  • All files are free of grammatical errors and spelling mistakes.

  • All background information is included in the relevant file (such as name of interviewees, date recorded, etc.).

NEPSUS has collected data pertaining to three different natural resource sectors: wildlife, coastal resources and forestry. We prepared Excel sheets for each sector; see the image below for an example of the information recorded for each instance of participant observation in the forestry sector.

Figure 1: Information recorded for each instance of participant observation (PO) in the forestry sector
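The completeness check described in Step 1 can also be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (file_name, sector, interviewee, date_recorded) and the example rows are hypothetical placeholders, not the actual columns of the NEPSUS Excel sheets.

```python
from dataclasses import dataclass, fields
from typing import Optional

# The three natural resource sectors named in the text.
SECTORS = {"wildlife", "coastal resources", "forestry"}


@dataclass
class SourceRecord:
    """One row of a background-information sheet (field names are hypothetical)."""
    file_name: str
    sector: str
    interviewee: Optional[str] = None
    date_recorded: Optional[str] = None


def missing_background(record: SourceRecord) -> list[str]:
    """Return the names of fields that are empty or hold an unknown sector."""
    problems = [f.name for f in fields(record) if not getattr(record, f.name)]
    if record.sector and record.sector not in SECTORS:
        problems.append("sector")
    return problems


# Placeholder rows, invented purely for illustration.
records = [
    SourceRecord("example_po_1", "forestry", "interviewee A", "2000-01-01"),
    SourceRecord("example_po_2", "forestry", None, "2000-01-02"),
]

for record in records:
    print(record.file_name, missing_background(record))
```

Running a check like this before the workshop would flag files whose background information is incomplete, so that they can be fixed before anyone starts coding.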

Step 2: Organize a Coding Workshop/Retreat

Coding is not just a process of organizing data; it is also an analytical process. This means that the organization of data will depend on the analytical approach that is adopted. In order for the researchers involved in coding to establish and work with a common analytical framework, joint preparation is crucial. We organized a one-week workshop in Bagamoyo, Tanzania, located a few hours’ drive from Dar es Salaam, where most of the researchers live. Since the researchers have many other work obligations, it was important to gather them in a location far enough from their workplaces that they would not be disturbed.

Figure 2: Joint preparation throughout the workshop

Step 3: Create a Codebook

The first part of the workshop was concerned with developing a codebook. A codebook is an overview of all the relevant codes and their descriptions. In NVivo, it is possible to work with first-level codes, often referred to as parent nodes. These parent nodes can then contain subthemes that are called child nodes. A parent node includes the information from all of its child nodes as well as any information that has been coded to the parent node but not to a child node. We chose to use Excel for our codebook (see image below for an excerpt from the codebook). It is also possible to create a codebook directly in NVivo. In this case, you first create all the parent nodes and child nodes in NVivo and then export the structure as a codebook. However, when discussing and editing the codebook as a group, I found the Excel format easier to use.

Figure 3: Excerpt from codebook
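For readers who think more easily in terms of data structures, the relationship between parent and child nodes described above can be sketched as a small tree. This is a conceptual illustration only: it does not use NVivo or any NVivo file format, and the class name, node names and passages are invented placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A code in the codebook; names and fields here are illustrative."""
    name: str
    description: str
    # Passages coded directly to this node.
    references: list[str] = field(default_factory=list)
    # Child nodes (subthemes) nested under this node.
    children: list["Node"] = field(default_factory=list)

    def all_references(self) -> list[str]:
        """Passages coded to this node plus everything coded to its child nodes."""
        collected = list(self.references)
        for child in self.children:
            collected.extend(child.all_references())
        return collected


# Placeholder codebook entry with two child nodes.
parent = Node(
    name="Parent code",
    description="First-level code",
    references=["passage coded to the parent only"],
    children=[
        Node("Child A", "Subtheme A", ["passage A"]),
        Node("Child B", "Subtheme B", ["passage B1", "passage B2"]),
    ],
)

for passage in parent.all_references():
    print(passage)
```

The point of the sketch is that looking up the parent returns four passages, while each child returns only its own, which is why coding to the most specific applicable node loses nothing at the parent level.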

We decided to work only with codes that were central to our analytical foci and agreed on avoiding the creation of purely descriptive codes. The aim was to end up with approximately 50 parent nodes. Even when there is a codebook, it is difficult for the coder to remember that the codes exist if there are too many. In order to select the appropriate codes, we adhered to the following procedures:

Do not code unnecessarily

We did not include codes for information that we did not need to look for in the qualitative data. For example:

  • If the information is common knowledge or already known to the researchers (e.g. the year a partnership was established), we did not include it. However, if the objective of the code is to demonstrate confusion about dates among respondents, it could be coded.

  • If the information is already captured in the quantitative survey, we did not include it. For example, if the survey asks whether villagers have experienced an increase in the number of elephants, there is no reason to code for this information in the qualitative data. If, however, the interviewees elaborate on the consequences of this, or add other kinds of information that go beyond the survey, it could be coded.
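The two rules above share the same logic, which can be written out as a small decision function. This is only one way of summarizing the rules, and the function and parameter names are hypothetical.

```python
def worth_coding(known_to_team: bool, in_survey: bool, goes_beyond: bool) -> bool:
    """Decide whether a piece of information deserves a code.

    known_to_team: common knowledge or already known to the researchers.
    in_survey:     already captured in the quantitative survey.
    goes_beyond:   the passage adds something more, e.g. it shows confusion
                   among respondents or elaborates on consequences.
    """
    if known_to_team or in_survey:
        # Already available elsewhere: code it only if it adds something new.
        return goes_beyond
    return True


# The year a partnership was established, stated without further comment.
print(worth_coding(known_to_team=True, in_survey=False, goes_beyond=False))
# An interviewee elaborating on the consequences of more elephants.
print(worth_coding(known_to_team=False, in_survey=True, goes_beyond=True))
```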

Analytical codes based on the quantitative survey

  • We went through the preliminary findings of the quantitative survey to determine interesting trends, paradoxes and gaps that the qualitative data could help to answer. We then created codes which could help us to find these answers.

Analytical codes based on cross-cutting issues and emerging hypotheses

  • Having conducted fieldwork multiple times, the research team members had already identified some findings that challenged current understandings in the literature. They had also developed ideas about interesting trends and paradoxes, as well as hypotheses that might explain them. For each natural resource sector, we selected key codes based on these hypotheses. The underlying idea was that the codes could help find the data needed to support, nuance or challenge the hypotheses. In addition, I prepared codes based on cross-cutting issues identified by the team during a debriefing meeting held after the fieldwork. This approach enabled us to work closely with the data, which is a key factor in successful inductive research.

Analytical codes that arise while coding

  • Once we had begun the coding process, coders suggested nodes that they found were missing in the codebook (discussed later).

The process of selecting approximately 50 codes to create the first version of the codebook took one day. The codes are key to the analysis. Identifying the most appropriate and interesting codes is therefore central to good analytical work and should not be rushed.