Integrate teenager protection tools into social media analyzer. #23
Conversation
This commit introduces a new set of tools focused on protecting teenagers from online threats. The new features include detection for:

- Cyberbullying
- Inappropriate content
- Privacy risks (oversharing)

A new module, `teen_protection.py`, has been added to house the analysis logic. The main application has been updated to include a new menu for these tools. Heuristics have been expanded with relevant keywords and weights. Unit tests for the new functionality have been added and integrated into the existing test suite, and all tests are passing.
Reviewer's Guide

This PR integrates a new teenager protection toolkit into the social media analyzer by introducing a dedicated teen_protection module that applies heuristic keyword detection for cyberbullying, inappropriate content, and privacy risks, updating the CLI to expose these analyses, extending heuristics with new keyword lists and weights, and adding comprehensive unit tests to the existing test suite.

Entity relationship diagram for new heuristic keyword lists and weights

```mermaid
erDiagram
    CYBERBULLYING_KEYWORDS {
        string keyword
    }
    INAPPROPRIATE_CONTENT_KEYWORDS {
        string keyword
    }
    PRIVACY_RISK_KEYWORDS {
        string keyword
    }
    HEURISTIC_WEIGHTS {
        string category
        float weight
    }
    CYBERBULLYING_KEYWORDS ||--o| HEURISTIC_WEIGHTS : "uses category CYBERBULLYING"
    INAPPROPRIATE_CONTENT_KEYWORDS ||--o| HEURISTIC_WEIGHTS : "uses category INAPPROPRIATE_CONTENT"
    PRIVACY_RISK_KEYWORDS ||--o| HEURISTIC_WEIGHTS : "uses category PRIVACY_RISK"
```
Class diagram for the new teen_protection module

```mermaid
classDiagram
    class TeenProtection {
        +analyze_text_for_teen_risks(text, analysis_type)
        +analyze_for_cyberbullying(text)
        +analyze_for_inappropriate_content(text)
        +analyze_for_privacy_risks(text)
    }
    class Heuristics {
        +CYBERBULLYING_KEYWORDS
        +INAPPROPRIATE_CONTENT_KEYWORDS
        +PRIVACY_RISK_KEYWORDS
        +HEURISTIC_WEIGHTS
    }
    TeenProtection ..> Heuristics : uses
```
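For orientation, here is a minimal sketch of how the dispatcher described by this diagram could be wired up. It reuses only the names visible in the diagram and in the review snippets below (`analyze_text_for_teen_risks`, the keyword lists, `HEURISTIC_WEIGHTS`, the `score`/`indicators_found` result keys); the dispatch keys, import path, and error shape are assumptions, not the PR's actual code.

```python
# Illustrative sketch only -- the real teen_protection.py in this PR may differ.
# The import path and dispatch keys below are assumptions.
from heuristics import (
    CYBERBULLYING_KEYWORDS,
    INAPPROPRIATE_CONTENT_KEYWORDS,
    PRIVACY_RISK_KEYWORDS,
    HEURISTIC_WEIGHTS,
)

def analyze_text_for_teen_risks(text, analysis_type):
    """Run one keyword heuristic over the text and return a score plus indicators."""
    keyword_map = {
        "cyberbullying": ("CYBERBULLYING", CYBERBULLYING_KEYWORDS),
        "inappropriate_content": ("INAPPROPRIATE_CONTENT", INAPPROPRIATE_CONTENT_KEYWORDS),
        "privacy_risk": ("PRIVACY_RISK", PRIVACY_RISK_KEYWORDS),
    }
    if analysis_type not in keyword_map:
        return {"error": f"Invalid analysis type: {analysis_type}"}

    text_lower = text.lower()
    score = 0.0
    indicators_found = []

    category, keywords = keyword_map[analysis_type]
    weight = HEURISTIC_WEIGHTS.get(category.upper(), 1.0)

    for keyword in keywords:
        if keyword in text_lower:  # plain substring check; see Comment 2 for a word-boundary variant
            message = f"Detected potential {category.replace('_', ' ').lower()} keyword: '{keyword}'"
            if message not in indicators_found:
                indicators_found.append(message)
                score += weight

    return {"score": score, "indicators_found": indicators_found}

def analyze_for_cyberbullying(text):
    return analyze_text_for_teen_risks(text, "cyberbullying")

def analyze_for_inappropriate_content(text):
    return analyze_text_for_teen_risks(text, "inappropriate_content")

def analyze_for_privacy_risks(text):
    return analyze_text_for_teen_risks(text, "privacy_risk")
```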
Hey there - I've reviewed your changes - here's some feedback:
- Consider refactoring analyze_for_teen_risks to separate user I/O from the core analysis logic so it can be reused and more easily tested without interactive input calls.
- Instead of manually constructing the TestSuite in test_runner, leverage unittest discovery to automatically pick up all test modules and avoid forgetting new tests (a discovery sketch follows this list).
- Simple substring matching for heuristics may lead to false positives/negatives—consider normalizing the text and using word-boundary or regex-based checks to improve accuracy.
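For the test-runner point above, a minimal sketch of a discovery-based runner (the `social_media_analyzer` start directory and the default `test*.py` pattern are assumptions about the project layout):

```python
# test_runner.py -- illustrative sketch; adjust start_dir to the actual layout.
import sys
import unittest

if __name__ == "__main__":
    # Discover every module matching test*.py instead of listing tests by hand,
    # so new files such as test_teen_protection.py are picked up automatically.
    suite = unittest.defaultTestLoader.discover(
        start_dir="social_media_analyzer", pattern="test*.py"
    )
    result = unittest.TextTestRunner(verbosity=2).run(suite)
    sys.exit(0 if result.wasSuccessful() else 1)
```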
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider refactoring analyze_for_teen_risks to separate user I/O from the core analysis logic so it can be reused and more easily tested without interactive input calls.
- Instead of manually constructing the TestSuite in test_runner, leverage unittest discovery to automatically pick up all test modules and avoid forgetting new tests.
- Simple substring matching for heuristics may lead to false positives/negatives—consider normalizing the text and using word-boundary or regex-based checks to improve accuracy.
## Individual Comments
### Comment 1
<location> `social_media_analyzer/main.py:167` </location>
<code_context>
+ print("\n--- Analyzing for Privacy Risks ---")
+ result = teen_protection.analyze_for_privacy_risks(text_to_analyze)
+
+ print(f"Score: {result['score']} (Higher is more suspicious)")
+ if result['indicators_found']:
+ print("Indicators Found:")
</code_context>
<issue_to_address>
**issue (bug_risk):** Accessing 'score' and 'indicators_found' without error handling may cause issues if the result contains an error.
Check for an 'error' key in the result before accessing 'score' or 'indicators_found' to avoid KeyError exceptions.
</issue_to_address>
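A possible shape for that guard, sketched against the snippet above (the printed messages and the loop over indicators are illustrative, not the file's actual code):

```python
result = teen_protection.analyze_for_privacy_risks(text_to_analyze)

# Handle an error result before reading 'score' or 'indicators_found'
if "error" in result:
    print(f"Analysis failed: {result['error']}")
else:
    print(f"Score: {result['score']} (Higher is more suspicious)")
    if result["indicators_found"]:
        print("Indicators Found:")
        for indicator in result["indicators_found"]:
            print(f"  - {indicator}")
```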
### Comment 2
<location> `social_media_analyzer/teen_protection.py:33-41` </location>
<code_context>
+ category, keywords = keyword_map[analysis_type]
+ weight = HEURISTIC_WEIGHTS.get(category.upper(), 1.0)
+
+ for keyword in keywords:
+ if keyword in text_lower:
+ message = f"Detected potential {category.replace('_', ' ').lower()} keyword: '{keyword}'"
</code_context>
<issue_to_address>
**suggestion:** Simple substring matching may lead to false positives.
Consider using regular expressions with word boundaries to avoid matching keywords within other words and reduce false positives.
```suggestion
import re

category, keywords = keyword_map[analysis_type]
weight = HEURISTIC_WEIGHTS.get(category.upper(), 1.0)

for keyword in keywords:
    # Use regex with word boundaries to avoid matching substrings within other words
    pattern = r'\b' + re.escape(keyword) + r'\b'
    if re.search(pattern, text_lower):
        message = f"Detected potential {category.replace('_', ' ').lower()} keyword: '{keyword}'"
        if message not in indicators_found:
            indicators_found.append(message)
            score += weight
```
</issue_to_address>
### Comment 3
<location> `social_media_analyzer/heuristics.py:273-278` </location>
<code_context>
"SUSPICIOUS_URL_PATTERN": 3.0, # High weight for matching a suspicious URL pattern
"GOOGLE_SAFE_BROWSING_HIT": 10.0, # Very high weight for a positive Google Safe Browsing match
+ # Teenager Protection Weights
+ "CYBERBULLYING": 2.5,
+ "INAPPROPRIATE_CONTENT": 3.0,
+ "PRIVACY_RISK": 3.5,
}
</code_context>
<issue_to_address>
**suggestion:** Heuristic weights for teen protection categories may need calibration.
These weights appear arbitrary; please validate them against expert input or data to ensure appropriate risk prioritization.
```suggestion
"GOOGLE_SAFE_BROWSING_HIT": 10.0, # Very high weight for a positive Google Safe Browsing match
# Teenager Protection Weights
# NOTE: The following weights for teen protection categories are provisional.
# TODO: Validate these weights against expert input or empirical data to ensure appropriate risk prioritization.
"CYBERBULLYING": 2.5,
"INAPPROPRIATE_CONTENT": 3.0,
"PRIVACY_RISK": 3.5,
}
```
</issue_to_address>
### Comment 4
<location> `social_media_analyzer/test_teen_protection.py:8` </location>
<code_context>
+ analyze_for_privacy_risks
+)
+
+class TestTeenProtection(unittest.TestCase):
+
+ def test_cyberbullying(self):
</code_context>
<issue_to_address>
**suggestion (testing):** Missing tests for edge cases: partial keyword matches and case sensitivity.
Add tests for partial keyword matches and case sensitivity to ensure accurate and robust detection.
</issue_to_address>
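A hedged sketch of what those tests could look like (the keywords 'idiot' and 'loser' come from the suggested test below; the partial-match expectation assumes the word-boundary fix from Comment 2 is adopted, and the import path may differ):

```python
import unittest
from teen_protection import analyze_for_cyberbullying  # adjust import to the project layout

class TestTeenProtectionEdgeCases(unittest.TestCase):

    def test_partial_keyword_not_matched(self):
        """A keyword embedded in a longer word should not be flagged
        (assumes word-boundary matching is in place)."""
        result = analyze_for_cyberbullying("That plot twist was idiotic but fun.")
        self.assertEqual(result['score'], 0)
        self.assertEqual(len(result['indicators_found']), 0)

    def test_case_insensitive_match(self):
        """Keyword detection should ignore casing."""
        result = analyze_for_cyberbullying("You are such a LOSER.")
        self.assertGreater(result['score'], 0)
        self.assertIn("Detected potential cyberbullying keyword: 'loser'",
                      result['indicators_found'])

if __name__ == "__main__":
    unittest.main()
```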
### Comment 5
<location> `social_media_analyzer/test_teen_protection.py:10` </location>
<code_context>
+
+class TestTeenProtection(unittest.TestCase):
+
+ def test_cyberbullying(self):
+ """Test the cyberbullying detection."""
+ # Test case with bullying keywords
</code_context>
<issue_to_address>
**suggestion (testing):** No tests for invalid analysis type or error handling.
Add a test to verify that an error dictionary is returned for invalid analysis types, ensuring error handling is covered.
Suggested implementation:
```python
class TestTeenProtection(unittest.TestCase):

    def test_cyberbullying(self):
        """Test the cyberbullying detection."""
        # Test case with bullying keywords
        text1 = "You are such a loser and an idiot."
        result1 = analyze_for_cyberbullying(text1)
        self.assertGreater(result1['score'], 0)
        self.assertIn("Detected potential cyberbullying keyword: 'loser'", result1['indicators_found'])
        self.assertIn("Detected potential cyberbullying keyword: 'idiot'", result1['indicators_found'])

        # Test case with no bullying keywords
        text2 = "Have a great day!"
        result2 = analyze_for_cyberbullying(text2)
        self.assertEqual(result2['score'], 0)

    def test_invalid_analysis_type(self):
        """Test error handling for invalid analysis type."""
        from .teen_protection import analyze_text
        text = "This is a test message."
        result = analyze_text(text, analysis_type="unknown_type")
        self.assertIsInstance(result, dict)
        self.assertIn("error", result)
        self.assertIn("Invalid analysis type", result["error"])
```
If the dispatcher function is not named `analyze_text`, or its signature differs, you should adjust the import and call accordingly. Also, ensure that the error dictionary returned by the function contains an "error" key with a message about the invalid type.
</issue_to_address>
### Comment 6
<location> `social_media_analyzer/test_teen_protection.py:25` </location>
<code_context>
+ self.assertEqual(result2['score'], 0)
+ self.assertEqual(len(result2['indicators_found']), 0)
+
+ def test_inappropriate_content(self):
+ """Test the inappropriate content detection."""
+ # Test case with inappropriate keywords
</code_context>
<issue_to_address>
**suggestion (testing):** Missing tests for multiple keyword occurrences in a single text.
Add a test where a keyword appears multiple times to ensure correct score accumulation and indicator handling.
</issue_to_address>
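One way this could be covered, assuming the deduplication behaviour implied by Comment 2's snippet (a repeated keyword yields a single indicator and is scored once; the import path is an assumption):

```python
import unittest
from teen_protection import analyze_for_cyberbullying  # adjust import to the project layout

class TestRepeatedKeywords(unittest.TestCase):

    def test_repeated_keyword_reported_once(self):
        """A keyword appearing twice should produce one indicator entry;
        the scoring expectation is an assumption based on the review snippets."""
        result = analyze_for_cyberbullying("You loser. Seriously, what a loser.")
        matches = [m for m in result['indicators_found'] if "'loser'" in m]
        self.assertEqual(len(matches), 1)
        self.assertGreater(result['score'], 0)

if __name__ == "__main__":
    unittest.main()
```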
| "GOOGLE_SAFE_BROWSING_HIT": 10.0, # Very high weight for a positive Google Safe Browsing match | ||
| # Teenager Protection Weights | ||
| "CYBERBULLYING": 2.5, | ||
| "INAPPROPRIATE_CONTENT": 3.0, | ||
| "PRIVACY_RISK": 3.5, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Heuristic weights for teen protection categories may need calibration.
These weights appear arbitrary; please validate them against expert input or data to ensure appropriate risk prioritization.
| "GOOGLE_SAFE_BROWSING_HIT": 10.0, # Very high weight for a positive Google Safe Browsing match | |
| # Teenager Protection Weights | |
| "CYBERBULLYING": 2.5, | |
| "INAPPROPRIATE_CONTENT": 3.0, | |
| "PRIVACY_RISK": 3.5, | |
| } | |
| "GOOGLE_SAFE_BROWSING_HIT": 10.0, # Very high weight for a positive Google Safe Browsing match | |
| # Teenager Protection Weights | |
| # NOTE: The following weights for teen protection categories are provisional. | |
| # TODO: Validate these weights against expert input or empirical data to ensure appropriate risk prioritization. | |
| "CYBERBULLYING": 2.5, | |
| "INAPPROPRIATE_CONTENT": 3.0, | |
| "PRIVACY_RISK": 3.5, | |
| } |
| self.assertEqual(result2['score'], 0) | ||
| self.assertEqual(len(result2['indicators_found']), 0) | ||
|
|
||
| def test_inappropriate_content(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Missing tests for multiple keyword occurrences in a single text.
Add a test where a keyword appears multiple times to ensure correct score accumulation and indicator handling.
Summary by Sourcery
Integrate new teenager protection tools into the social media analyzer by adding a dedicated module for risk detection, expanding heuristics, updating the CLI menu, and including comprehensive unit tests.