A reputation analysis system that processes social media data to provide insights about companies.
- Data Collection: Automated collection of social media posts
- NLP Processing: Advanced natural language processing for sentiment and topic analysis
- Visualization: Interactive visualizations of analysis results
- Storage: Results stored locally in the data directory
- Python 3.8+
- Required Python packages (see requirements.txt)
- Reddit API credentials (for data collection)
- Clone the repository:
git clone https://github.com/yourusername/repusense.git
cd repusense
- Install dependencies:
pip install -r requirements.txt
- Configure the application:
  - Copy config.example.json to config.json
  - Update the configuration with your Reddit API credentials
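Before running anything, it can help to confirm that config.json actually contains the Reddit credentials. The following is a minimal sketch, not part of the project: the section name `reddit` and the key names `client_id`, `client_secret`, and `user_agent` are assumptions chosen for illustration, so check config.example.json for the real field names.

```python
import json
from pathlib import Path

# Hypothetical key names; the real ones are defined in config.example.json.
REQUIRED_KEYS = ("client_id", "client_secret", "user_agent")


def load_config(path="config.json"):
    """Read the JSON configuration file from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def missing_reddit_keys(config, section="reddit"):
    """Return the required credential keys that are absent or empty.

    `section` is an assumed name for the block holding the credentials.
    """
    creds = config.get(section, {})
    return [key for key in REQUIRED_KEYS if not creds.get(key)]
```

Calling `missing_reddit_keys(load_config())` from the repository root returns an empty list when every assumed key is filled in, and the names of the missing keys otherwise.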
Run the NLP pipeline for a company:
python run_pipeline.py --company "CompanyName" --start-date "2024-01-01" --end-date "2024-12-31"
Start the API server:
python -m nlp_pipeline.api.main
The API will be available at http://localhost:8000
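To run the pipeline command above for several companies in turn, a small wrapper can build the same argument list and call it once per company. This is a sketch under assumptions: the helper names `pipeline_command` and `run_for_companies` are hypothetical, and it only reuses the `--company`, `--start-date`, and `--end-date` flags shown above.

```python
import subprocess
import sys


def pipeline_command(company, start_date, end_date):
    """Build the argument list for one run_pipeline.py invocation."""
    return [
        sys.executable, "run_pipeline.py",
        "--company", company,
        "--start-date", start_date,
        "--end-date", end_date,
    ]


def run_for_companies(companies, start_date, end_date):
    """Run the pipeline once per company, stopping at the first failure."""
    for company in companies:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(pipeline_command(company, start_date, end_date), check=True)
```

For example, `run_for_companies(["CompanyA", "CompanyB"], "2024-01-01", "2024-12-31")` would be called from the repository root, where run_pipeline.py lives.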
repusense/
├── data/
│ ├── nlp_results/ # Analysis results
│ ├── processed_data/ # Processed data files
│ └── raw_data/ # Raw collected data
├── nlp_pipeline/
│ ├── api/ # API implementation
│ ├── data_processing/ # Data processing modules
│ └── spark_nlp/ # NLP analysis modules
├── config.json # Configuration file
├── requirements.txt # Python dependencies
└── run_pipeline.py # Pipeline execution script
Results are stored in the local filesystem with the following structure:
data/nlp_results/
  company_name/
    topics/
      topic_distribution.json
      topic_visualization.html
    sentiment/
      sentiment_results.json
    keywords/
      keyword_results.json
      word_cloud_data.json
      wordcloud.png
    engagement/
      engagement_results.json
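The layout above maps directly onto file paths, so the JSON results can be read without going through the API. The sketch below assumes the directory under data/nlp_results/ is named exactly as the company was passed to the pipeline (any normalization the pipeline applies is not covered here); `result_path` and `load_result` are hypothetical helper names.

```python
import json
from pathlib import Path

RESULTS_ROOT = Path("data/nlp_results")

# Relative paths of the JSON result files, taken from the layout above.
RESULT_FILES = {
    "topics": "topics/topic_distribution.json",
    "sentiment": "sentiment/sentiment_results.json",
    "keywords": "keywords/keyword_results.json",
    "word_cloud": "keywords/word_cloud_data.json",
    "engagement": "engagement/engagement_results.json",
}


def result_path(company, kind, root=RESULTS_ROOT):
    """Return the path of one result file for a company directory."""
    return Path(root) / company / RESULT_FILES[kind]


def load_result(company, kind, root=RESULTS_ROOT):
    """Load one JSON result, or return None if it has not been generated."""
    path = result_path(company, kind, root)
    if not path.exists():
        return None
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

Returning None for a missing file lets a caller distinguish "pipeline not run yet" from an empty result without catching exceptions.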
- GET /api/companies - List available companies
- GET /api/company/{company_name} - Get company information
- GET /api/company/{company_name}/topics - Get topic analysis
- GET /api/company/{company_name}/sentiment - Get sentiment analysis
- GET /api/company/{company_name}/keywords - Get keyword analysis
- GET /api/company/{company_name}/engagement - Get engagement analysis
- GET /api/company/{company_name}/wordcloud - Get word cloud data
- GET /api/company/{company_name}/wordcloud-image - Get word cloud image
- GET /api/company/{company_name}/topics/visualization-html - Get topic visualization
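The endpoints above can be called from Python with only the standard library. This is a minimal sketch that assumes the server is running at http://localhost:8000 and that the analysis endpoints return JSON; the response format is not documented here, and the wordcloud-image and visualization-html endpoints presumably return an image and HTML rather than JSON. `endpoint_url` and `fetch_json` are hypothetical helper names.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"


def endpoint_url(company=None, resource=None, base_url=BASE_URL):
    """Build a URL for the company list, one company, or one of its resources."""
    if company is None:
        return f"{base_url}/api/companies"
    # Percent-encode the name so spaces and slashes stay inside one path segment.
    url = f"{base_url}/api/company/{quote(company, safe='')}"
    return f"{url}/{resource}" if resource else url


def fetch_json(url, timeout=10):
    """GET a URL and decode the body as JSON (assumed response format)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With the server started, `fetch_json(endpoint_url("CompanyName", "sentiment"))` would request the sentiment analysis for that company.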
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.