Skip to content

Open Access Toolkit for Automated Content Analysis

License

Notifications You must be signed in to change notification settings

CeMSc/CtrlFindr

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 

Repository files navigation

CtrlFindr

CtrlFindr is a natural language processing and content analysis toolkit designed to help you analyze and extract insights from text documents. The toolkit is at the moment implemented in a Jupyter Notebook, providing you with an interactive environment to work with the code and visualize the results. The current version works with English text, but it can easily be adapted for other languages.

In the coming months, I plan to provide a version of the code with a User Interface (UI) and an executable file for easier use.

Features

  • Text preprocessing (lowercase conversion, sentence splitting),
  • Keyword analysis,
  • Co-occurrence analysis,
  • Sentiment analysis (using VADER SentimentIntensityAnalyzer from nltk),
  • Customizable search strings for automating content analysis of natural language (to be filled in the provided Assessment_framework.ots template)
  • Export results to TSV files (number of positive findings per document, positive findings per document as booleans, positive findings as percentage of sentences within documents, and sentiment analysis of positive findings)

Dependencies

Python 3.6 or higher
Pandas
Numpy
NLTK

Installation

Clone the CtrlFindr repository to your local machine:
git clone https://github.com/username/CtrlFindr.git

Navigate to the CtrlFindr directory:

cd CtrlFindr

Fill the search strings and taxonomy in the Assessment_framework.ots

Open the CtrlFindr.ipynb file in Jupyter Notebook to start analyzing your text files.

Usage

  • Place the text files you want to analyze in the TXT folder or adjust the file paths in the txt_to_dataframe() function.

  • Prepare the Assessment_framework.ots file containing the variables, search strings, co-occurrences, document conditionals, and taxonomy as specified in the create_dataframes() function.

  • Customize the code to meet your specific requirements (e.g., utilize optional functions).

  • Execute the cells in the Jupyter Notebook in the order they appear.

  • The results will be saved as TSV files .

  • Check the Jupiter Notebook Example folder for a sample analysis of few text files.

License

This project is licensed under the GNU General Public License v3.0. Please read the LICENSE file for more information.

How to cite

Scartozzi, Cesare M. (2023.) CtrlFindr (Version no.). Available from https://github.com/CeMSc/CtrlFindr.

About

Open Access Toolkit for Automated Content Analysis

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published