Statistical Analysis
  • Descriptive analysis
  • Inferential analysis
  • Predictive analysis
  • Causal analysis
  • Prescriptive analysis
  • Exploratory analysis
Thematic Analysis
  • Familiarisation with the data
  • Generating initial codes
  • Searching for themes
  • Reviewing themes
  • Defining and naming themes
  • Compiling the findings

Research analysis is most insightful when the statistics are given context and meaning by qualitative insights. We possess a strong mix of quantitative and qualitative skills and will conduct all the relevant statistical and thematic analyses.

Khumbula Consulting

Statistical Analysis

The purpose of statistical analysis is description and inference. We run a range of descriptive statistics to present the basic profiles and patterns in research samples. We also perform advanced statistical tests and modelling to find associations, make predictions and draw inferences.

Thematic Analysis

Qualitative data are analysed using content analysis and thematic approaches, in which recurring themes in the data are identified, categorised, and explained. Themes are the patterns in the data that describe or are associated with particular phenomena. It is these themes that become the categories for analysis and reporting.


Dissertation Data Analysis

We assist individuals pursuing academic studies with data analysis for their dissertations. We perform both quantitative (statistical) and qualitative (thematic) analysis, compile the results, and provide interpretative notes.

  • Assisted at Bachelors Level: 22%
  • Assisted at Masters Level: 67%
  • Assisted at PhD Level: 11%

Commonly Used Statistical Tests 

  • The Pearson Chi-Square Test of Independence (or Chi-Square Test of Association) determines whether there is an association between categorical variables. In other words, it examines whether the variables are independent or related.
  • The Pearson product-moment correlation examines whether there is a linear relationship between two continuous variables and, if so, determines the direction (i.e. negative or positive) and strength of the linear association. The non-parametric equivalent of the Pearson correlation is the Spearman correlation.
  • The One-Sample T-Test determines whether a sample mean is statistically different from a known or hypothesised population mean. The variable used in this test is known as the test variable and is compared against a test or hypothesised value of the population mean. The one-sample Wilcoxon signed-rank test is the equivalent for non-normally distributed data.
  • The Independent Samples T-Test compares the means of two independent groups to determine whether the respective population means are significantly different from each other. This test can only compare the means of two groups; where there are more than two groups to compare, one-way analysis of variance (ANOVA) must be used instead. The non-parametric option for the Independent Samples T-Test is the Mann-Whitney U test.
  • One-way analysis of variance (ANOVA) compares the means of two or more independent groups to determine whether there is a statistically significant difference. The ANOVA can still compare means for two groups, in which case it will produce the same results as the Independent Samples T-Test. The Kruskal-Wallis test is used in place of ANOVA for non-normally distributed data.
  • The Paired Samples T-Test compares two means that are from the same individual, object, or related units. The two means can represent a measurement taken at two different time points, such as a pre-test and post-test in an intervention. The means can also be from measurements taken under two different conditions, such as in a design with an experimental versus control condition. The non-parametric equivalent is the Wilcoxon signed-rank test.
  • Linear regression is used to model the relationship between two variables by fitting a linear equation to the data. One of the variables is treated as the explanatory (independent) variable and the other, the outcome (dependent) variable. For example, a linear regression model can be used to relate the gestational age (in weeks) of newly born infants to their birth weight (in grams).
  • Logistic regression is the regression analysis of choice when the dependent variable is dichotomous (binary). It is used to describe data and to explain the relationship between one binary dependent variable and one or more nominal, ordinal, interval, or ratio independent variables. An example would be predicting the odds of developing lung cancer from smoking status.
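As an illustrative sketch only, several of the tests above can be run in Python with SciPy. The data below are simulated and every variable name is an assumption for demonstration, not part of any client deliverable:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Chi-square test of independence on a 2x2 contingency table
table = np.array([[30, 10], [20, 40]])
chi2, p, dof, expected = stats.chi2_contingency(table)

# Pearson correlation between two continuous variables
x = rng.normal(size=100)
y = 2 * x + rng.normal(size=100)  # y is linearly related to x plus noise
r, p_corr = stats.pearsonr(x, y)

# Independent samples t-test comparing two group means
group_a = rng.normal(5.0, 1.0, 50)
group_b = rng.normal(5.5, 1.0, 50)
t, p_t = stats.ttest_ind(group_a, group_b)

# Simple linear regression: birth weight (g) on gestational age (weeks), simulated
gest_age = rng.uniform(30, 42, 80)
birth_weight = 100 * gest_age + rng.normal(0, 200, 80)
slope, intercept, r_val, p_reg, stderr = stats.linregress(gest_age, birth_weight)
```

For non-normally distributed data, the non-parametric alternatives mentioned above are available as `stats.mannwhitneyu`, `stats.wilcoxon`, `stats.kruskal`, and `stats.spearmanr`.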


Grounded Theory Approach to Data Analysis

1) Open coding involves reading through the data (usually textual transcripts) to become familiar with them. One then starts creating tentative labels or codes for the data in a way that summarises what is being observed. Codes are meaningful expressions that capture the concepts in single words or short phrases. The codes are created based on the meaning emerging from the data, as opposed to what is known from existing theory.

2) Axial coding consists of identifying relationships among the codes developed during open coding. The purpose is to relate the data so as to reveal categories and subcategories and thus construct linkages within the data. Axial coding takes an inductive approach, developing theory based on observations.

3) Selective coding is the final stage of the analysis, after the core concepts emerging from the coded categories and subcategories have been identified and named through the open and axial coding steps. Open and axial coding then cease, and subsequent coding focuses only on the core variable that the analyst has discovered in the earlier processes, limited to what is relevant to the emergent conceptual framework.



Social Media Analysis


Attituder™ is the name of our social media analysis package, which helps brands track the large volumes of data shared by customers across various online platforms. We offer two main types of social media analysis.

1) Sentiment Analysis: Sentiment refers to the way a social media audience thinks and feels about a brand, product, or service, and is deduced from posts and comments on social media platforms. Social media users tend to react to brand posts in a manner that goes beyond the subject of the post itself. This often reveals underlying attitudes towards the brand, and exploring the nature and extent of these feelings helps rectify perceived brand weaknesses while leveraging opportunities and strengths. Sentiment can be positive, neutral, or negative. We compute the percentage of the audience falling into each of these categories by examining social media remarks about the brand. The net sentiment score is calculated by subtracting the negative sentiment (%) from the positive sentiment (%); ideally, this net score should not be negative.
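The net sentiment calculation is straightforward arithmetic. As a minimal sketch with hypothetical comment counts (the labels and figures are invented for illustration):

```python
# Hypothetical labelled sentiment counts from social media comments
counts = {"positive": 110, "neutral": 60, "negative": 30}
total = sum(counts.values())

positive_pct = 100 * counts["positive"] / total  # 55.0
negative_pct = 100 * counts["negative"] / total  # 15.0

# Net sentiment score = positive % minus negative %
net_sentiment = positive_pct - negative_pct  # 40.0, i.e. comfortably above zero
```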

2) Deviation Analysis: To deviate is to behave differently from the usual or anticipated way. Social media followers often digress from the specific subject or theme of a post and, in the process, redirect the subsequent thread away from the main communication. For example, a manufacturer's post about a new product can trigger reactions about previous bad experiences with that manufacturer, detracting focus from the new offering. Measuring deviation therefore provides a measure of post impact. Given that social media platforms are marketing platforms, if a post about a new product receives minimal remarks about the offering itself, one can conclude that the post and platform in question are not delivering sufficiently on the marketing objectives. Ideally, all comments made in reaction to a post should relate to the subject of the post; looked at this way, any deviation is undesirable. We therefore report the overall proportion of the audience making comments that are unrelated to specific posts. However, not all deviation is deprecatory; some can indeed be complimentary. We therefore further segment the total deviation into commendation (%) versus deprecation (%).
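The deviation breakdown described above can be sketched as a simple proportion calculation. The comment labels below are hypothetical, chosen only to illustrate the split of total deviation into commendation and deprecation:

```python
# Hypothetical per-comment labels for a single post
comments = [
    "on_topic",
    "off_topic_praise",      # deviates, but complimentary
    "off_topic_complaint",   # deviates and deprecatory
    "on_topic",
    "off_topic_complaint",
]

n = len(comments)
deviation_pct = 100 * sum(c.startswith("off_topic") for c in comments) / n  # 60.0
commendation_pct = 100 * comments.count("off_topic_praise") / n             # 20.0
deprecation_pct = 100 * comments.count("off_topic_complaint") / n           # 40.0
```

Note that commendation and deprecation always sum back to the total deviation.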


Workshop Discourse Analysis


Business or organisational workshops are a valuable platform for gathering and sharing information. Whether a workshop audience comprises internal stakeholders (e.g. staff) or external stakeholders (e.g. customers), the value of such sessions is enhanced when the discourse is properly captured, analysed, and packaged. Well-packaged workshop reports stand a better chance of being read, assimilated, and implemented by key role players and decision makers.

Our Workshoper™ package is a discourse processing service wherein we record, transcribe, and conduct thematic analysis of speeches, comments, and other forms of discourse from business workshops, conferences, and other such gatherings. We thus use the term “workshop” broadly to describe different types of organised meetings or gatherings.

This service offering entails a four-phase process: 1) data collection, 2) data processing, 3) data analysis, and 4) report writing. To reduce costs and/or to fit in with the client's organisational protocol, the data collection phase can be carried out internally and independently by the workshop organisers and facilitators, with the data sent to us afterwards for processing, analysis, and reporting. The information can be collected in various ways, depending on the nature of the workshop and client preferences. The common data formats we work with are audio, video, transcripts, notes, papers, and questionnaires.

The final output is a set of informative and analytical reports that go beyond raw, unstructured verbatim transcripts of the workshop proceedings and discourse.