feat: Integrate Google Safe Browsing API #21
This commit integrates the Google Safe Browsing API to enhance the scam detection capabilities of the social media analyzer. The key changes include:

- A new function `check_google_safe_browsing` is added to `scam_detector.py` to check URLs against the Google Safe Browsing API.
- The `is_url_suspicious` function is updated to use the new Google Safe Browsing check.
- The main application now retrieves the `GOOGLE_API_KEY` from environment variables and passes it to the analysis functions.
- A new heuristic weight `GOOGLE_SAFE_BROWSING_HIT` is added to give a high score to URLs flagged by the API.
- A `requirements.txt` file is added for the `social_media_analyzer` project with the `requests` dependency.
- Unit tests with mocking are added to `test_runner.py` to verify the integration.
Reviewer's Guide

This PR integrates the Google Safe Browsing API into the scam detection flow by adding a dedicated check function, extending URL analysis logic to leverage real-time threat data, propagating the API key through the main application, updating heuristics for flagged URLs, adding the required HTTP library, and covering the new logic with unit tests.
Hey there - I've reviewed your changes - here's some feedback:
- Consider adding caching or batching for Safe Browsing API results to prevent performance bottlenecks and API rate limit exhaustion when checking multiple URLs (see the batching sketch after this list).
- The test_runner module mixes manual example runs with the unit test suite; consider splitting demonstration code into a separate script to keep the unit tests clean and focused.
- Pin the requests dependency to a specific version or version range in requirements.txt to avoid unexpected breaking changes in future releases.
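For the first point, a minimal sketch of the batching idea. The payload mirrors the PR's `check_google_safe_browsing`; the helper name and dict-based return are hypothetical, but the v4 `threatMatches:find` request does accept a list of `threatEntries`, so several URLs can be checked in one round trip:

```python
import requests

SAFE_BROWSING_URL = "https://safebrowsing.googleapis.com/v4/threatMatches:find"

def check_urls_google_safe_browsing(urls, api_key):
    """Check several URLs in a single threatMatches:find request.

    Returns {flagged_url: threat_type}; URLs missing from the dict were clean.
    Callers should handle requests.RequestException and ValueError.
    """
    payload = {
        "client": {"clientId": "social-media-analyzer", "clientVersion": "1.0.0"},
        "threatInfo": {
            "threatTypes": ["MALWARE", "SOCIAL_ENGINEERING",
                            "UNWANTED_SOFTWARE", "POTENTIALLY_HARMFUL_APPLICATION"],
            "platformTypes": ["ANY_PLATFORM"],
            "threatEntryTypes": ["URL"],
            # One entry per URL: a single request covers the whole batch.
            "threatEntries": [{"url": u} for u in urls],
        },
    }
    response = requests.post(f"{SAFE_BROWSING_URL}?key={api_key}",
                             json=payload, timeout=10)
    response.raise_for_status()
    # Each match echoes the offending URL under match["threat"]["url"].
    return {m["threat"]["url"]: m["threatType"]
            for m in response.json().get("matches", [])}
```

For caching, wrapping the existing single-URL check in `functools.lru_cache` would be a low-effort start, since both arguments are hashable strings.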
## Individual Comments
### Comment 1
<location> `social_media_analyzer/scam_detector.py:43-44` </location>
<code_context>
try:
    choice = int(input("Enter your choice (1-4): "))
    if choice == 1:
</code_context>
<issue_to_address>
**suggestion:** Consider handling non-JSON responses from Google Safe Browsing API.
Catching ValueError alongside RequestException will ensure the code handles unexpected response formats without crashing.
</issue_to_address>
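A sketch of the suggested fix against the `check_google_safe_browsing` body quoted in Comment 7 below; only the `try` block is shown, and only the `except` clause is widened, since `response.json()` raises `ValueError` when the body is not valid JSON:

```python
    try:
        response = requests.post(api_url, json=payload, timeout=10)
        if response.status_code != 200:
            return False, f"Google Safe Browsing API error: {response.status_code}"
        data = response.json()  # raises ValueError on a non-JSON body
        if "matches" in data:
            return True, f"Flagged by Google Safe Browsing as {data['matches'][0]['threatType']}."
        return False, "Clean according to Google Safe Browsing."
    except (requests.RequestException, ValueError) as e:
        return False, f"Could not check Google Safe Browsing: {e}"
```

In recent versions of requests, `response.json()` raises `requests.exceptions.JSONDecodeError`, which subclasses `ValueError`, so this catch covers it.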
### Comment 2
<location> `social_media_analyzer/scam_detector.py:155-156` </location>
<code_context>
         if is_susp:
-            score += HEURISTIC_WEIGHTS.get("SUSPICIOUS_URL_PATTERN", 3.0)
+            # Increase score significantly if flagged by Google
+            if "Google Safe Browsing" in reason:
+                score += HEURISTIC_WEIGHTS.get("GOOGLE_SAFE_BROWSING_HIT", 10.0)
+            else:
+                score += HEURISTIC_WEIGHTS.get("SUSPICIOUS_URL_PATTERN", 3.0)
</code_context>
<issue_to_address>
**suggestion (bug_risk):** String matching for Google Safe Browsing reason may be brittle.
Using substring matching for 'Google Safe Browsing' may lead to errors if the message format changes. It's better to have is_url_suspicious return a dedicated flag for Safe Browsing hits.
Suggested implementation:
```python
    for url_str in found_urls:
        is_susp, reason, is_google_safe_browsing = is_url_suspicious(url_str, platform, api_key)
        url_analysis = {"url": url_str, "is_suspicious": is_susp, "reason": reason}
        if is_susp:
            # Increase score significantly if flagged by Google Safe Browsing
            if is_google_safe_browsing:
                score += HEURISTIC_WEIGHTS.get("GOOGLE_SAFE_BROWSING_HIT", 10.0)
            else:
                score += HEURISTIC_WEIGHTS.get("SUSPICIOUS_URL_PATTERN", 3.0)
            indicators_found.append(f"Suspicious URL found: {url_str} (Reason: {reason})")
        urls_analyzed_details.append(url_analysis)
```
You must also update the `is_url_suspicious` function definition and all its call sites to return a third value: `is_google_safe_browsing` (a boolean). This flag should be set to True if the URL was flagged by Google Safe Browsing, and False otherwise.
</issue_to_address>
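Concretely, the changed signature might look like this: a sketch against the `is_url_suspicious` body quoted in Comment 8 below, showing only the Safe Browsing early return and the fall-through return; every other `return` would gain a `False` third element:

```python
def is_url_suspicious(url, platform=None, api_key=None):
    """Returns (is_suspicious, reason, flagged_by_safe_browsing)."""
    # 1. Google Safe Browsing Check
    if api_key:
        is_susp, reason = check_google_safe_browsing(url, api_key)
        if is_susp:
            return True, reason, True  # dedicated flag instead of string matching
    # 2. Local heuristics unchanged, each returning (flag, reason, False)
    ...
    return False, "URL does not match common suspicious patterns.", False
```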
### Comment 3
<location> `social_media_analyzer/test_runner.py:54-63` </location>
<code_context>
+class TestScamDetector(unittest.TestCase):
</code_context>
<issue_to_address>
**suggestion (testing):** Missing unit tests for error conditions and edge cases in Google Safe Browsing integration.
Please add tests for missing API key, non-200 status codes, request exceptions, and malformed responses to improve error handling coverage.
Suggested implementation:
```python
import unittest
from unittest.mock import Mock, patch

import requests

# Call the function under test (assume check_url_with_google_safe_browsing exists)
from social_media_analyzer.scam_detector import check_url_with_google_safe_browsing


class TestScamDetector(unittest.TestCase):
    @patch('social_media_analyzer.scam_detector.requests.post')
    def test_google_safe_browsing_malicious(self, mock_post):
        # Mock the API response for a malicious URL
        mock_response = Mock()
        mock_response.status_code = 200
        mock_response.json.return_value = {
            "matches": [
                {
                    "threatType": "MALWARE",
                    "platformType": "ANY_PLATFORM",
                }
            ]
        }
        mock_post.return_value = mock_response
        result = check_url_with_google_safe_browsing("http://malicious.com", api_key="fake-key")
        self.assertTrue(result['is_suspicious'])
        self.assertIn("MALWARE", result['reason'])

    def test_google_safe_browsing_missing_api_key(self):
        # Call with missing API key
        result = check_url_with_google_safe_browsing("http://example.com", api_key=None)
        self.assertFalse(result['is_suspicious'])
        self.assertIn("Missing Google Safe Browsing API key", result['reason'])

    @patch('social_media_analyzer.scam_detector.requests.post')
    def test_google_safe_browsing_non_200_status(self, mock_post):
        mock_response = Mock()
        mock_response.status_code = 500
        mock_response.json.return_value = {}
        mock_post.return_value = mock_response
        result = check_url_with_google_safe_browsing("http://example.com", api_key="fake-key")
        self.assertFalse(result['is_suspicious'])
        self.assertIn("Google Safe Browsing API error", result['reason'])

    @patch('social_media_analyzer.scam_detector.requests.post')
    def test_google_safe_browsing_request_exception(self, mock_post):
        # RequestException (not bare Exception) matches what the check function catches
        mock_post.side_effect = requests.exceptions.RequestException("Network error")
        result = check_url_with_google_safe_browsing("http://example.com", api_key="fake-key")
        self.assertFalse(result['is_suspicious'])
        self.assertIn("Exception during Google Safe Browsing check", result['reason'])

    @patch('social_media_analyzer.scam_detector.requests.post')
    def test_google_safe_browsing_malformed_response(self, mock_post):
        mock_response = Mock()
        mock_response.status_code = 200
        mock_response.json.side_effect = ValueError("Malformed JSON")
        mock_post.return_value = mock_response
        result = check_url_with_google_safe_browsing("http://example.com", api_key="fake-key")
        self.assertFalse(result['is_suspicious'])
        self.assertIn("Malformed response from Google Safe Browsing", result['reason'])
```
These tests assume that your `check_url_with_google_safe_browsing` function in `scam_detector`:
- Returns a dict with keys `is_suspicious` (bool) and `reason` (str)
- Handles missing API key, non-200 status, exceptions, and malformed JSON as described
If your function does not currently handle these cases, you will need to update its implementation to do so.
</issue_to_address>
### Comment 4
<location> `social_media_analyzer/test_runner.py:77-91` </location>
<code_context>
+        self.assertTrue(any("Google Safe Browsing" in reason for reason in result["indicators_found"]))
+        self.assertEqual(result['urls_analyzed'][0]['is_suspicious'], True)
+
+    @patch('social_media_analyzer.scam_detector.requests.post')
+    def test_google_safe_browsing_clean(self, mock_post):
+        # Mock the API response for a clean URL
+        mock_response = Mock()
+        mock_response.status_code = 200
+        mock_response.json.return_value = {}
+        mock_post.return_value = mock_response
+
+        message = "this is a clean site http://www.google.com"
+        result = analyze_text_for_scams(message, api_key="fake_key")
+
+        self.assertFalse(any("Google Safe Browsing" in reason for reason in result["indicators_found"]))
+        self.assertEqual(result['urls_analyzed'][0]['is_suspicious'], False)
+
+if __name__ == '__main__':
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding test coverage for multiple URLs in a single message.
Please add a test with a message containing multiple URLs, both malicious and clean, to ensure all are correctly analyzed and flagged.
```suggestion
    @patch('social_media_analyzer.scam_detector.requests.post')
    def test_google_safe_browsing_clean(self, mock_post):
        # Mock the API response for a clean URL
        mock_response = Mock()
        mock_response.status_code = 200
        mock_response.json.return_value = {}
        mock_post.return_value = mock_response

        message = "this is a clean site http://www.google.com"
        result = analyze_text_for_scams(message, api_key="fake_key")

        self.assertFalse(any("Google Safe Browsing" in reason for reason in result["indicators_found"]))
        self.assertEqual(result['urls_analyzed'][0]['is_suspicious'], False)

    @patch('social_media_analyzer.scam_detector.requests.post')
    def test_multiple_urls_mixed_suspicion(self, mock_post):
        # Prepare mock responses for two URLs: one malicious, one clean
        def side_effect(url, *args, **kwargs):
            mock_resp = Mock()
            mock_resp.status_code = 200
            if "malicious.com" in kwargs['json']['threatInfo']['threatEntries'][0]['url']:
                mock_resp.json.return_value = {
                    "matches": [
                        {
                            "threatType": "MALWARE",
                            "platformType": "ANY_PLATFORM",
                            "threat": {
                                "url": "http://malicious.com"
                            }
                        }
                    ]
                }
            else:
                mock_resp.json.return_value = {}
            return mock_resp

        mock_post.side_effect = side_effect

        message = "Check these: http://malicious.com and http://www.google.com"
        result = analyze_text_for_scams(message, api_key="fake_key")

        urls = {url_info['url']: url_info for url_info in result['urls_analyzed']}
        self.assertIn("http://malicious.com", urls)
        self.assertIn("http://www.google.com", urls)
        self.assertTrue(urls["http://malicious.com"]['is_suspicious'])
        self.assertFalse(urls["http://www.google.com"]['is_suspicious'])
        # 'reason' is a single string, so check it directly rather than iterating it
        self.assertIn("Google Safe Browsing", urls["http://malicious.com"]['reason'])
        self.assertNotIn("Google Safe Browsing", urls["http://www.google.com"]['reason'])


if __name__ == '__main__':
```
</issue_to_address>
### Comment 5
<location> `social_media_analyzer/main.py:66` </location>
<code_context>
def analyze_social_media(api_key):
    """Handles the analysis of social media platforms."""
    platforms = sorted([
        "facebook", "instagram", "whatsapp", "tiktok", "tinder", "snapchat",
        "wechat", "telegram", "twitter", "pinterest", "linkedin", "line",
        "discord", "teams", "zoom", "amazon", "alibaba", "youtube", "skype",
        "vk", "reddit", "email", "viber", "signal", "badoo", "binance",
        "sharechat", "messenger", "qzone", "qq", "vimeo", "musical.ly", "kuaishou", "douyin"
    ])
    while True:
        print("\nSelect the social media platform you want to analyze:")
        for i, p in enumerate(platforms, 1):
            print(f"{i}. {p.capitalize()}")
        try:
            choice = int(input(f"Enter your choice (1-{len(platforms)}): "))
            if 1 <= choice <= len(platforms):
                platform = platforms[choice - 1]
                break
            else:
                print("Invalid choice. Please try again.")
        except ValueError:
            print("Invalid input. Please enter a number.")
    while True:
        print(f"\nWhat do you want to do for {platform.capitalize()}?")
        print("1. Analyze a profile for signs of being fake.")
        print("2. Analyze a profile for identity usurpation.")
        print("3. Analyze a message for phishing or scam attempts.")
        try:
            analysis_choice = int(input("Enter your choice (1-3): "))
            if analysis_choice == 1:
                profile_url = input(f"Enter the {platform.capitalize()} profile URL to analyze: ").strip()
                if profile_url:
                    fake_profile_detector.analyze_profile_based_on_user_input(profile_url, platform)
                else:
                    print("No profile URL entered.")
                break
            elif analysis_choice == 2:
                profile_url = input(f"Enter the {platform.capitalize()} profile URL to analyze for impersonation: ").strip()
                if profile_url:
                    fake_profile_detector.analyze_identity_usurpation(profile_url, platform)
                else:
                    print("No profile URL entered.")
                break
            elif analysis_choice == 3:
                message = input("Paste the message you want to analyze: ").strip()
                if message:
                    result = scam_detector.analyze_text_for_scams(message, platform, api_key=api_key)
                    print("\n--- Scam Analysis Results ---")
                    print(f"Score: {result['score']} (Higher is more suspicious)")
                    print("Indicators Found:")
                    if result['indicators_found']:
                        for indicator in result['indicators_found']:
                            print(f"- {indicator}")
                    else:
                        print("No specific scam indicators were found.")
                else:
                    print("No message entered.")
                break
            else:
                print("Invalid choice. Please try again.")
        except ValueError:
            print("Invalid input. Please enter a number.")
</code_context>
<issue_to_address>
**issue (code-quality):** We've found these issues:
- Use named expression to simplify assignment and conditional [×3] ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
- Low code quality found in analyze_social_media - 21% ([`low-code-quality`](https://docs.sourcery.ai/Reference/Default-Rules/comments/low-code-quality/))

Explanation: The quality score for this function is below the quality threshold of 25%. This score is a combination of the method length, cognitive complexity and working memory.
How can you solve this? It might be worth refactoring this function to make it shorter and more readable:
- Reduce the function length by extracting pieces of functionality out into their own functions. This is the most important thing you can do - ideally a function should be less than 10 lines.
- Reduce nesting, perhaps by introducing guard clauses to return early.
- Ensure that variables are tightly scoped, so that code using related concepts sits together within the function rather than being scattered.
</issue_to_address>
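One way to act on both findings, sketched with a hypothetical `prompt_choice` helper: the repeated retry loops collapse into one function, and the walrus operator merges each prompt-and-check pair:

```python
def prompt_choice(prompt, low, high):
    """Loop until the user enters an integer in [low, high]."""
    while True:
        try:
            choice = int(input(prompt))
        except ValueError:
            print("Invalid input. Please enter a number.")
            continue
        if low <= choice <= high:
            return choice
        print("Invalid choice. Please try again.")

# Named expression collapses the prompt-then-check pairs, e.g. for choice 1:
if profile_url := input(f"Enter the {platform.capitalize()} profile URL to analyze: ").strip():
    fake_profile_detector.analyze_profile_based_on_user_input(profile_url, platform)
else:
    print("No profile URL entered.")
```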
### Comment 6
<location> `social_media_analyzer/main.py:139` </location>
<code_context>
def main():
    """Main function to run the security analyzer."""
    api_key = get_api_key()
    print("--- Universal Security Analyzer ---")
    print("This tool helps you analyze social media, messages, and websites for potential scams and fake news.")
    if not api_key:
        print("\n[!] Google Safe Browsing API key not found.")
        print("    To enable real-time URL checking against Google's threat database,")
        print("    please set the GOOGLE_API_KEY environment variable.")
    while True:
        print("\n--- Main Menu ---")
        print("1. Analyze a Social Media Platform")
        print("2. Analyze a Website URL for Scams")
        print("3. Analyze a News URL for Fake News")
        print("4. Exit")
        try:
            choice = int(input("Enter your choice (1-4): "))
            if choice == 1:
                analyze_social_media(api_key)
            elif choice == 2:
                analyze_website_url(api_key)
            elif choice == 3:
                analyze_news_url()
            elif choice == 4:
                print("Exiting. Stay safe!")
                break
            else:
                print("Invalid choice. Please try again.")
        except ValueError:
            print("Invalid input. Please enter a number.")
</code_context>
<issue_to_address>
**issue (code-quality):** We've found these issues:
- Extract duplicate code into function ([`extract-duplicate-method`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/extract-duplicate-method/))
- Simplify conditional into switch-like form ([`switch`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/switch/))
</issue_to_address>
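A sketch of the switch-like form using a dispatch dict, reusing the hypothetical `prompt_choice` helper sketched under Comment 5 (the API-key warning banner is omitted for brevity):

```python
def main():
    """Main function to run the security analyzer."""
    api_key = get_api_key()
    print("--- Universal Security Analyzer ---")
    actions = {
        1: lambda: analyze_social_media(api_key),
        2: lambda: analyze_website_url(api_key),
        3: analyze_news_url,
    }
    while True:
        print("\n--- Main Menu ---")
        print("1. Analyze a Social Media Platform")
        print("2. Analyze a Website URL for Scams")
        print("3. Analyze a News URL for Fake News")
        print("4. Exit")
        choice = prompt_choice("Enter your choice (1-4): ", 1, 4)
        if choice == 4:
            print("Exiting. Stay safe!")
            break
        actions[choice]()
```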
### Comment 7
<location> `social_media_analyzer/scam_detector.py:45-53` </location>
<code_context>
def check_google_safe_browsing(url, api_key):
    """
    Checks a URL against the Google Safe Browsing API.
    Returns a tuple: (is_suspicious, reason)
    """
    if not api_key:
        return False, "Google Safe Browsing API key not configured."

    api_url = f"https://safebrowsing.googleapis.com/v4/threatMatches:find?key={api_key}"
    payload = {
        "client": {
            "clientId": "social-media-analyzer",
            "clientVersion": "1.0.0"
        },
        "threatInfo": {
            "threatTypes": ["MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE", "POTENTIALLY_HARMFUL_APPLICATION"],
            "platformTypes": ["ANY_PLATFORM"],
            "threatEntryTypes": ["URL"],
            "threatEntries": [{"url": url}]
        }
    }
    try:
        response = requests.post(api_url, json=payload, timeout=10)
        if response.status_code == 200:
            data = response.json()
            if "matches" in data:
                threat_type = data["matches"][0]["threatType"]
                return True, f"Flagged by Google Safe Browsing as {threat_type}."
            else:
                return False, "Clean according to Google Safe Browsing."
        else:
            return False, f"Google Safe Browsing API error: {response.status_code}"
    except requests.RequestException as e:
        return False, f"Could not connect to Google Safe Browsing: {e}"
</code_context>
<issue_to_address>
**issue (code-quality):** We've found these issues:
- Swap if/else branches ([`swap-if-else-branches`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/swap-if-else-branches/))
- Remove unnecessary else after guard condition ([`remove-unnecessary-else`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-unnecessary-else/))
</issue_to_address>
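Applying both refactorings to the `try` block gives a flat, guard-clause version with the same behaviour:

```python
    try:
        response = requests.post(api_url, json=payload, timeout=10)
        if response.status_code != 200:
            return False, f"Google Safe Browsing API error: {response.status_code}"
        data = response.json()
        if "matches" not in data:
            return False, "Clean according to Google Safe Browsing."
        threat_type = data["matches"][0]["threatType"]
        return True, f"Flagged by Google Safe Browsing as {threat_type}."
    except requests.RequestException as e:
        return False, f"Could not connect to Google Safe Browsing: {e}"
```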
### Comment 8
<location> `social_media_analyzer/scam_detector.py:77` </location>
<code_context>
def is_url_suspicious(url, platform=None, api_key=None):
    """
    Checks if a URL is suspicious based on various patterns and lists,
    including Google Safe Browsing.
    Returns a tuple: (bool_is_suspicious, reason_string)
    """
    # 1. Google Safe Browsing Check
    if api_key:
        is_susp, reason = check_google_safe_browsing(url, api_key)
        if is_susp:
            return True, reason

    # 2. Local Heuristics
    normalized_url = url.lower()
    domain = get_domain_from_url(url)
    legitimate_domains = get_legitimate_domains(platform)

    # Check if the domain is in the legitimate list for the platform
    if domain in legitimate_domains:
        # Still check for impersonation patterns that might include the legit domain
        for pattern in SUSPICIOUS_URL_PATTERNS:
            if re.search(pattern, normalized_url, re.IGNORECASE):
                if not domain.endswith(tuple(legitimate_domains)):
                    return True, f"URL impersonates a legitimate domain: {pattern}"
        return False, "URL domain is on the legitimate list."

    # Check against known suspicious patterns
    for pattern in SUSPICIOUS_URL_PATTERNS:
        if re.search(pattern, normalized_url, re.IGNORECASE):
            return True, f"URL matches suspicious pattern: {pattern}"

    # Check for suspicious TLDs
    suspicious_tld_regex = re.compile(r"\.(" + "|".join(tld.lstrip('.') for tld in SUSPICIOUS_TLDS) + r")$", re.IGNORECASE)
    if suspicious_tld_regex.search(domain):
        return True, f"URL uses a potentially suspicious TLD."

    # Check if a known legitimate service name is part of the domain, but it's not official
    for service in LEGITIMATE_DOMAINS.keys():
        if service != "general" and service in domain:
            return True, f"URL contains the name of a legitimate service ('{service}') but is not an official domain."

    return False, "URL does not match common suspicious patterns."
</code_context>
<issue_to_address>
**issue (code-quality):** We've found these issues:
- Use the built-in function `next` instead of a for-loop ([`use-next`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-next/))
- Replace f-string with no interpolated values with string ([`remove-redundant-fstring`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-redundant-fstring/))
</issue_to_address>
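Both fixes applied to the tail of the function: the f-string with no placeholders becomes a plain string, and the final loop becomes a `next()` over a generator:

```python
    if suspicious_tld_regex.search(domain):
        return True, "URL uses a potentially suspicious TLD."

    service = next(
        (s for s in LEGITIMATE_DOMAINS if s != "general" and s in domain),
        None,
    )
    if service:
        return True, f"URL contains the name of a legitimate service ('{service}') but is not an official domain."
    return False, "URL does not match common suspicious patterns."
```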
### Comment 9
<location> `social_media_analyzer/scam_detector.py:118` </location>
<code_context>
def analyze_text_for_scams(text_content, platform=None, api_key=None):
    """
    Analyzes a block of text content for various scam indicators.
    """
    if not text_content:
        return {"score": 0.0, "indicators_found": [], "urls_analyzed": []}

    text_lower = text_content.lower()
    score = 0.0
    indicators_found = []
    urls_analyzed_details = []

    # 1. Keyword-based checks
    keyword_checks = {
        "URGENCY": URGENCY_KEYWORDS,
        "SENSITIVE_INFO": SENSITIVE_INFO_KEYWORDS,
        "TOO_GOOD_TO_BE_TRUE": TOO_GOOD_TO_BE_TRUE_KEYWORDS,
        "GENERIC_GREETING": GENERIC_GREETINGS,
        "TECH_SUPPORT": TECH_SUPPORT_SCAM_KEYWORDS,
        "PAYMENT_REQUEST": PAYMENT_KEYWORDS,
    }
    for category, keywords in keyword_checks.items():
        for keyword in keywords:
            if keyword in text_lower:
                message = f"Presence of '{category.replace('_', ' ').title()}' keyword: '{keyword}'"
                if message not in indicators_found:
                    indicators_found.append(message)
                score += HEURISTIC_WEIGHTS.get(category, 1.0)

    # 2. Regex-based checks
    found_urls = URL_PATTERN.findall(text_content)
    for url_str in found_urls:
        is_susp, reason = is_url_suspicious(url_str, platform, api_key)
        url_analysis = {"url": url_str, "is_suspicious": is_susp, "reason": reason}
        if is_susp:
            # Increase score significantly if flagged by Google
            if "Google Safe Browsing" in reason:
                score += HEURISTIC_WEIGHTS.get("GOOGLE_SAFE_BROWSING_HIT", 10.0)
            else:
                score += HEURISTIC_WEIGHTS.get("SUSPICIOUS_URL_PATTERN", 3.0)
            indicators_found.append(f"Suspicious URL found: {url_str} (Reason: {reason})")
        urls_analyzed_details.append(url_analysis)

    # 3. Financial Identifiers
    for id_name, pattern in FINANCIAL_ADDRESS_PATTERNS.items():
        if pattern.search(text_content):
            message = f"Potential {id_name} identifier found."
            if message not in indicators_found:
                indicators_found.append(message)
            score += HEURISTIC_WEIGHTS.get(f"{id_name}_ADDRESS", 2.5)

    # 4. Phone Numbers
    if PHONE_NUMBER_PATTERN.search(text_content):
        message = "Phone number detected in text."
        if message not in indicators_found:
            indicators_found.append(message)
        score += HEURISTIC_WEIGHTS.get("PHONE_NUMBER_UNSOLICITED", 1.0)

    return {
        "score": round(score, 2),
        "indicators_found": indicators_found,
        "urls_analyzed": urls_analyzed_details
    }
</code_context>
<issue_to_address>
**issue (code-quality):** Low code quality found in analyze_text_for_scams - 25% ([`low-code-quality`](https://docs.sourcery.ai/Reference/Default-Rules/comments/low-code-quality/))

Explanation: The quality score for this function is below the quality threshold of 25%. This score is a combination of the method length, cognitive complexity and working memory.
How can you solve this? It might be worth refactoring this function to make it shorter and more readable:
- Reduce the function length by extracting pieces of functionality out into their own functions. This is the most important thing you can do - ideally a function should be less than 10 lines.
- Reduce nesting, perhaps by introducing guard clauses to return early.
- Ensure that variables are tightly scoped, so that code using related concepts sits together within the function rather than being scattered.
</issue_to_address>
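As one example of the suggested extraction, the keyword scan could move into a helper (hypothetical name; the constants are the module-level keyword lists already used above):

```python
def _score_keywords(text_lower, indicators_found):
    """Keyword scan split out of analyze_text_for_scams.

    Mirrors the original behaviour: the score grows on every keyword hit,
    while each indicator message is recorded only once.
    """
    keyword_checks = {
        "URGENCY": URGENCY_KEYWORDS,
        "SENSITIVE_INFO": SENSITIVE_INFO_KEYWORDS,
        "TOO_GOOD_TO_BE_TRUE": TOO_GOOD_TO_BE_TRUE_KEYWORDS,
        "GENERIC_GREETING": GENERIC_GREETINGS,
        "TECH_SUPPORT": TECH_SUPPORT_SCAM_KEYWORDS,
        "PAYMENT_REQUEST": PAYMENT_KEYWORDS,
    }
    score = 0.0
    for category, keywords in keyword_checks.items():
        for keyword in keywords:
            if keyword in text_lower:
                message = f"Presence of '{category.replace('_', ' ').title()}' keyword: '{keyword}'"
                if message not in indicators_found:
                    indicators_found.append(message)
                score += HEURISTIC_WEIGHTS.get(category, 1.0)
    return score
```

`analyze_text_for_scams` would then shrink to `score += _score_keywords(text_lower, indicators_found)`, and the URL, financial-identifier, and phone-number sections are similar candidates for extraction.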
| print(f"- {indicator}") | ||
|
|
||
| def analyze_social_media(): | ||
| def analyze_social_media(api_key): |
Inline comment on `scam_detector.py` (same issue as Comment 9 above), anchored on this hunk:

```diff
     return False, "URL does not match common suspicious patterns."

-def analyze_text_for_scams(text_content, platform=None):
+def analyze_text_for_scams(text_content, platform=None, api_key=None):
```
Summary by Sourcery
Integrate Google Safe Browsing API for real-time URL threat detection in the scam detector, update CLI to handle API key, adjust threat scoring, and add verification tests.
New Features:
- A `check_google_safe_browsing` function that checks URLs against the Google Safe Browsing API, wired into `is_url_suspicious` for real-time threat detection.

Enhancements:
- The main application reads `GOOGLE_API_KEY` from the environment and passes it through to the analysis functions; a new `GOOGLE_SAFE_BROWSING_HIT` heuristic weight scores flagged URLs highly.

Build:
- A `requirements.txt` for `social_media_analyzer` declaring the `requests` dependency.

Tests:
- Mocked unit tests in `test_runner.py` verifying the Safe Browsing integration.