RepuSense

A reputation analysis system that processes social media data to provide insights about companies.

Features

Data Collection: Automated collection of social media posts via the Reddit API
NLP Processing: Natural language processing for sentiment, topic, and keyword analysis
Visualization: Interactive visualizations of analysis results
Storage: Results stored locally in the data directory

Requirements

Python 3.8+
Required Python packages (see requirements.txt)
Reddit API credentials (for data collection)

Installation

Clone the repository:

git clone https://github.com/mouadenna/RepuSense.git
cd RepuSense

Install dependencies:

pip install -r requirements.txt

Configure the application:
- Copy config.example.json to config.json
- Update the configuration with your Reddit API credentials
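
The copy step can also be scripted. The following is a minimal sketch, not part of the project: the helper name `ensure_config` is hypothetical, and it assumes only that both files are plain JSON. It copies the example file into place when `config.json` is missing and returns the parsed contents, so you can confirm the file loads before adding your credentials.

```python
import json
import shutil
from pathlib import Path


def ensure_config(example="config.example.json", target="config.json"):
    """Copy the example config into place if needed and return its parsed contents.

    Hypothetical helper: the file names come from the README, the function does not.
    """
    target_path = Path(target)
    if not target_path.exists():
        # Never overwrite an existing config.json that may already hold credentials.
        shutil.copyfile(example, target_path)
    with target_path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

Calling `ensure_config()` from the repository root leaves an existing `config.json` untouched; you still need to edit the file by hand to add the Reddit API credentials.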

Usage

Running the Pipeline

Run the NLP pipeline for a company:

python run_pipeline.py --company "CompanyName" --start-date "2024-01-01" --end-date "2024-12-31"
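
To run the pipeline for several companies in a row, the command above can be wrapped in a short script. This is a sketch under assumptions: `build_command` and `run_for_companies` are hypothetical helpers, the dates are assumed to be in `YYYY-MM-DD` form as in the example, and the script is assumed to be started from the repository root so that `run_pipeline.py` resolves.

```python
import subprocess
import sys
from datetime import date


def build_command(company, start_date, end_date):
    """Return the argument list for one pipeline run, validating the date range first."""
    start = date.fromisoformat(start_date)  # raises ValueError on malformed dates
    end = date.fromisoformat(end_date)
    if start > end:
        raise ValueError(f"start date {start_date} is after end date {end_date}")
    return [
        sys.executable, "run_pipeline.py",
        "--company", company,
        "--start-date", start_date,
        "--end-date", end_date,
    ]


def run_for_companies(companies, start_date, end_date):
    """Run the pipeline once per company, stopping at the first failing run."""
    for company in companies:
        subprocess.run(build_command(company, start_date, end_date), check=True)
```

For example, `run_for_companies(["CompanyA", "CompanyB"], "2024-01-01", "2024-12-31")` runs the pipeline twice with the same date range. Passing the arguments as a list avoids shell quoting problems with company names that contain spaces.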

Using the API

Start the API server:

python -m nlp_pipeline.api.main

The API will be available at http://localhost:8000
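
Once the server is running, you can query it from Python with the standard library alone. The sketch below assumes the endpoints return JSON, which the README does not state explicitly; `api_url` and `fetch_json` are hypothetical helper names.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # address given in the README


def api_url(path, base_url=BASE_URL):
    """Join the base URL and an endpoint path without doubling or dropping slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def fetch_json(path, base_url=BASE_URL, timeout=10):
    """GET an endpoint and decode the body as JSON (assumes a JSON response)."""
    with urlopen(api_url(path, base_url), timeout=timeout) as response:
        return json.load(response)
```

With the server up, `fetch_json("/api/companies")` should return whatever the company listing endpoint responds with; the shape of that payload is not documented here.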

Project Structure

repusense/
├── data/
│   ├── nlp_results/     # Analysis results
│   ├── processed_data/  # Processed data files
│   └── raw_data/        # Raw collected data
├── nlp_pipeline/
│   ├── api/            # API implementation
│   ├── data_processing/ # Data processing modules
│   └── spark_nlp/      # NLP analysis modules
├── config.json         # Configuration file
├── requirements.txt    # Python dependencies
└── run_pipeline.py     # Pipeline execution script

Data Storage

Results are stored in the local filesystem with the following structure:

data/nlp_results/
  company_name/
    topics/
      topic_distribution.json
      topic_visualization.html
    sentiment/
      sentiment_results.json
    keywords/
      keyword_results.json
      word_cloud_data.json
      wordcloud.png
    engagement/
      engagement_results.json
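
Because the results are plain files, they can be read directly without going through the API. The following sketch maps each analysis type to its JSON file from the layout above. It assumes the directory under `data/nlp_results/` is named exactly after the company passed in, and it makes no assumption about the schema inside each file; `load_result` is a hypothetical helper.

```python
import json
from pathlib import Path

RESULTS_ROOT = Path("data/nlp_results")

# Relative paths taken from the directory layout in the README.
RESULT_FILES = {
    "topics": "topics/topic_distribution.json",
    "sentiment": "sentiment/sentiment_results.json",
    "keywords": "keywords/keyword_results.json",
    "engagement": "engagement/engagement_results.json",
}


def load_result(company, analysis, root=RESULTS_ROOT):
    """Load one analysis result for a company as parsed JSON.

    Raises KeyError for an unknown analysis type and FileNotFoundError
    if the pipeline has not produced that file yet.
    """
    path = Path(root) / company / RESULT_FILES[analysis]
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `load_result("company_name", "sentiment")` reads `data/nlp_results/company_name/sentiment/sentiment_results.json`.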

API Endpoints

GET /api/companies - List available companies
GET /api/company/{company_name} - Get company information
GET /api/company/{company_name}/topics - Get topic analysis
GET /api/company/{company_name}/sentiment - Get sentiment analysis
GET /api/company/{company_name}/keywords - Get keyword analysis
GET /api/company/{company_name}/engagement - Get engagement analysis
GET /api/company/{company_name}/wordcloud - Get word cloud data
GET /api/company/{company_name}/wordcloud-image - Get word cloud image
GET /api/company/{company_name}/topics/visualization-html - Get topic visualization
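
Company names may contain spaces or other characters that are not valid in a URL path, so it is safer to percent-encode them when building request paths. This sketch builds the per-company paths listed above; the short resource keys such as `"info"` and the helper `company_endpoint` are hypothetical labels, and whether the server expects percent-encoded names is an assumption.

```python
from urllib.parse import quote

# Path suffixes copied from the endpoint list; the dictionary keys are illustrative.
ENDPOINT_SUFFIXES = {
    "info": "",
    "topics": "/topics",
    "sentiment": "/sentiment",
    "keywords": "/keywords",
    "engagement": "/engagement",
    "wordcloud": "/wordcloud",
    "wordcloud-image": "/wordcloud-image",
    "topics-html": "/topics/visualization-html",
}


def company_endpoint(company_name, resource="info"):
    """Return the request path for one company resource, percent-encoding the name."""
    encoded = quote(company_name, safe="")  # also encodes "/" inside a name
    return f"/api/company/{encoded}{ENDPOINT_SUFFIXES[resource]}"
```

For example, `company_endpoint("Acme Corp", "sentiment")` produces `/api/company/Acme%20Corp/sentiment`, which can be appended to `http://localhost:8000`.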

Contributing

Fork the repository
Create a feature branch
Commit your changes
Push to the branch
Create a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.
