Disclaimer
The Stages of Sustainability content analysis tool assesses documents to determine their alignment with the Stages of Sustainability model. Content analysis provides word counts. This tool automates the use of R, an open-source statistical analysis program. Conducting the word count in R allows the code to clean the data and count the words in the same step, which provides a more accurate word count. Our tool makes the R code publicly available for others to use, particularly those who do not know R. As a researcher, you must still understand when deductive content analysis is an appropriate methodology and what its strengths and limitations are, identify the proper documents for analysis, and interpret the output from the R code. Furthermore, our tool does not provide a full statistical analysis; it is the responsibility of the researcher to determine the appropriate statistical tests to answer the research question. This tool does not use AI, and no data is ever saved to our servers.
Basic Instructions
The output provided will show the extent to which the document aligns with weak or strong sustainability. This analysis is possible and reliable because communications reflect worldviews, mentalities, and actions. The document(s) uploaded to the tool must be converted to .txt format and encoded in UTF-8. Text editing software (e.g. Microsoft Notepad, Microsoft Word, Apple Pages, or OpenOffice Writer) can save documents in this format. If the document is in .pdf or any other format, there are tools to convert the document to a .txt file (https://stirlingpdf.io/?lang=en_US or https://tools.pdf24.org/en/ may be useful).
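As a minimal sketch of the encoding requirement above, the following Python snippet checks whether a .txt file is valid UTF-8 and, if not, re-saves it as UTF-8. It is an illustration only and is not part of the tool. The function names and the default source encoding (cp1252, common for files saved on Windows) are assumptions; confirm the actual encoding of your file before converting.

```python
from pathlib import Path


def is_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def convert_to_utf8(src, dst, source_encoding="cp1252"):
    """Re-save a text file as UTF-8.

    source_encoding is an assumption: it must match how the file
    was originally saved, or characters will be misread.
    """
    text = Path(src).read_text(encoding=source_encoding)
    Path(dst).write_text(text, encoding="utf-8")
```

For example, `is_utf8("report.txt")` could be run before uploading, and `convert_to_utf8("report.txt", "report_utf8.txt")` used only when the check fails (both file names are hypothetical).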
A single file or multiple files may be uploaded at once. To upload multiple files, either drag and drop them onto the upload box or browse to the files and hold Ctrl on the keyboard while selecting them. Uploaded files are not saved on our servers and are deleted immediately after the analysis is completed.
Output from the Assessment
The output generated from the keyword content analysis is delivered in a .csv file. The data includes:
The name of the document(s) analyzed
The count of the individual keywords related to the Stages of Sustainability Model
A summation of the keyword counts by stage
A calculation of the keyword percentages by stage, based on the total word count of the document
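The output fields listed above can be illustrated with a short Python sketch. It is not the tool's R code: the stage names, keyword lists, tokenization rule, and output field names are hypothetical and chosen only to show how keyword counts, stage sums, and stage percentages relate to the total word count.

```python
import re
from collections import Counter

# Hypothetical stages and keywords, for illustration only.
# The tool's actual keyword lists are not reproduced here.
STAGE_KEYWORDS = {
    "stage_a": ["compliance", "regulation"],
    "stage_b": ["regenerative", "ecosystem"],
}


def analyze(text, stage_keywords=STAGE_KEYWORDS):
    """Count keywords, sum them by stage, and express each sum
    as a percentage of the document's total word count."""
    # Simple cleaning step (an assumption): lowercase, keep letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    counts = Counter(words)

    result = {"total_words": total}
    for stage, keywords in stage_keywords.items():
        stage_sum = 0
        for keyword in keywords:
            result[keyword] = counts[keyword]
            stage_sum += counts[keyword]
        result[f"{stage}_sum"] = stage_sum
        result[f"{stage}_pct"] = 100 * stage_sum / total if total else 0.0
    return result
```

Because the percentage divides by the total word count, documents of different lengths can be compared on the same scale, which is why the percentages rather than the raw counts are used in later analysis.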
If four or fewer files are uploaded, a graph of the results will be provided. If more than four files are uploaded, no graph is provided because it would be too crowded.
The .csv file is compatible with modern spreadsheet software (e.g. Excel, OpenOffice, Google Sheets) and statistical software (e.g. R, SAS, SPSS). The data can be used to create charts and graphs, to calculate descriptive statistics such as means, standard deviations, or standard errors, and to perform various statistical analyses. The keyword percentages are the most important values, as they represent the document’s alignment with each of the five stages of sustainability. For example, the keyword percentages can be regressed against other data to determine relationships between strong sustainability and another concept. The keyword percentages can also be assessed by ANOVA to determine trends within the data.
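As a rough sketch of the descriptive statistics mentioned above, the following Python function reads one numeric column from a .csv file and returns its mean, sample standard deviation, and standard error. The function name and the column name passed to it are hypothetical; the actual column headers in the tool's output may differ.

```python
import csv
import math
import statistics


def describe_column(csv_path, column):
    """Return n, mean, sample SD, and standard error for one numeric column.

    `column` must match a header in the .csv file; the header names
    used by the tool are not assumed here.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        values = [float(row[column]) for row in csv.DictReader(f)]

    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD; requires at least two rows
    se = sd / math.sqrt(len(values))
    return {"n": len(values), "mean": mean, "sd": sd, "se": se}
```

A call such as `describe_column("output.csv", "stage5_pct")` (both names hypothetical) would summarize one stage's keyword percentage across all analyzed documents.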
Past Research
Strong Sustainability Trends:
An assessment of the sustainability strength of business education texts (Landrum & Ohsowski, 2017)
An assessment of the sustainability strength of GRI and non-GRI sustainability reports (Landrum & Ohsowski, 2018)
An assessment of the sustainability strength of major sustainability principles, frameworks, guidelines and standards (Demastus & Landrum, 2024)
Relationships Between Non-Sustainability Constructs and Sustainability Stage:
A study of organizational culture type and relation to sustainability strength (Demastus et al., 2025)
Aligning digitalisation and sustainable development? Evidence from the analysis of worldviews in sustainability reports (Niehoff, 2022)
The data will most likely fail statistical assumption tests if the documents differ in type, authorship, writing style, formatting, structure, or time frame, or differ greatly in total word count. This is a limitation of the methodology. If several documents from different sources are analyzed, a data transformation (e.g. log10, square root, Box-Cox, reciprocal, arcsine, normalization) will likely be necessary to pass statistical assumption tests (normality, homoscedasticity, independence, multicollinearity). If assistance is needed with this, please contact us; we will be happy to help with research related to the Stages of Sustainability tool.
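Three of the transformations named above can be sketched in a few lines of Python. This is a minimal illustration, not a recommendation of which transformation to use. The function names are hypothetical, and the offset added before the log10 transform is an assumption made so that zero values remain defined.

```python
import math


def log10_transform(values, offset=1.0):
    """log10(x + offset); the offset (an assumption) keeps zeros defined."""
    return [math.log10(v + offset) for v in values]


def sqrt_transform(values):
    """Square-root transform for non-negative values."""
    return [math.sqrt(v) for v in values]


def arcsine_sqrt_transform(percentages):
    """Arcsine square-root transform.

    Expects percentages in the range 0 to 100 and converts them
    to proportions before transforming; results are in radians.
    """
    return [math.asin(math.sqrt(p / 100.0)) for p in percentages]
```

After transforming the keyword percentages, the assumption tests should be rerun on the transformed values to check whether the transformation actually helped.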
Methodological Limitations
The method is sensitive to the data provided, so data transformation is likely to be needed.
The keywords used in the analysis are biased towards environmental sustainability.
Keyword content analysis does not consider the context of the document. Nuance, emotion, language style, and emphasis are not considered in this analysis.
Keyword bias is present due to the authors’ subjective interpretation of strong sustainability.
Notes About the Computational Structure of the Tool
The keyword content analysis follows the guidelines of Structural Topic Modelling (STM) (Roberts et al., 2014) by applying an unsupervised approach, meaning computer software counts the keywords in the uploaded files. The keyword content analysis applies Fully Automated Clustering (FAC), meaning no algorithms are used to infer anything from the data beyond word counts (Grimmer & Stewart, 2013; Lucas et al., 2015).
References
Demastus, J. & Landrum, N. (2024). Organizational sustainability schemes align with weak sustainability. Business Strategy and the Environment, 33(2): 707-725. https://doi.org/10.1002/bse.3511
Demastus, J., Ohsowski, B., & Landrum, N. (2025). Exploring the nexus of organizational culture and sustainability for green innovation. Industry and Innovation, 32(1): 108-138. https://doi.org/10.1080/13662716.2024.2390991
Grimmer, J. & Stewart, B. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis, 21(3): 267-297. https://doi.org/10.1093/pan/mps028
Landrum, N. (2018). Stages of corporate sustainability: Integrating the strong sustainability worldview. Organization & Environment, 31(4): 287-313. https://doi.org/10.1177/1086026617717456
Landrum, N. & Ohsowski, B. (2017). Content trends in sustainable business education: An analysis of introductory courses in the U.S. International Journal of Sustainability in Higher Education, 18(3): 385-414. https://doi.org/10.1108/ijshe-07-2016-0135
Landrum, N. & Ohsowski, B. (2018). Identifying worldviews on corporate sustainability: A content analysis of corporate sustainability reports. Business Strategy and the Environment, 27(1): 128-151. https://doi.org/10.1002/bse.1989
Lucas, C., Nielsen, R., Roberts, M., Stewart, B., Storer, A., & Tingley, D. (2015). Computer-assisted text analysis for comparative politics. Political Analysis, 23(2): 254-277. https://doi.org/10.1093/pan/mpu019
Niehoff, S. (2022). Aligning digitalisation and sustainable development? Evidence from the analysis of worldviews in sustainability reports. Business Strategy and the Environment, bse.3043. https://doi.org/10.1002/bse.3043
Roberts, M., Stewart, B., Tingley, D., Lucas, C., Leder-Luis, J., Gadarian, S., Albertson, B., & Rand, D. (2014). Structural topic models for open-ended survey responses. American Journal of Political Science, 58(4): 1064-1082. https://doi.org/10.1111/ajps.12103