Research questions and feasibility #2

@saurabh-khanna

Description

1. How can shadowbanning be detected in social media platforms like Instagram?

  • Details: Develop algorithms to identify patterns in content visibility restrictions, focusing on anomalies in reach, engagement, or ranking compared to similar content or historical trends.
  • Feasibility:
    • Public APIs such as the Instagram Graph API expose only limited reach and engagement metrics, but these may still be useful.
    • Python libraries for anomaly detection (e.g., scikit-learn, statsmodels) can help identify unusual patterns.
    • Limitations: APIs often don’t reveal ranking data directly, so indirect methods (e.g., scraping, if allowed) might be needed.
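One way to operationalize the anomaly-detection idea, before reaching for scikit-learn or statsmodels, is a simple z-score over per-post reach relative to the account's baseline. A minimal sketch, in which the reach figures and the -2.0 threshold are illustrative assumptions:

```python
import numpy as np

def flag_visibility_anomalies(reach, z_threshold=-2.0):
    """Flag posts whose reach falls far below the account's baseline.

    reach: sequence of per-post reach counts (hypothetical data).
    Returns indices of posts whose z-score is below z_threshold.
    """
    reach = np.asarray(reach, dtype=float)
    z = (reach - reach.mean()) / reach.std()
    return [i for i, score in enumerate(z) if score < z_threshold]

# Hypothetical reach series: one post collapses relative to the baseline.
reach = [980, 1020, 1005, 990, 1010, 120, 995, 1000]
print(flag_visibility_anomalies(reach))  # [5]
```

A real pipeline would replace the global z-score with a rolling baseline or a model-based detector, since account reach drifts over time.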

2. What types of content are most affected by shadowbanning or visibility restrictions?

  • Details: Analyze whether certain topics, hashtags, or keywords are disproportionately restricted using text analysis and sentiment analysis.
  • Feasibility:
    • Collect posts and hashtags via Instagram’s API or third-party tools like Tweepy (for X) or PRAW (for Reddit).
    • Use Python NLP libraries like spaCy or NLTK for content analysis.
    • Challenges: Public APIs may not disclose visibility restriction statuses, requiring correlation with engagement trends.
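A rough starting point for this analysis, before bringing in spaCy or NLTK, is to compare mean engagement per hashtag across a corpus. A sketch with invented captions and engagement counts:

```python
from collections import defaultdict

def engagement_by_hashtag(posts):
    """Average engagement per hashtag across a corpus of posts.

    posts: list of (caption, engagement) pairs; hashtags are tokens
    starting with '#'. Returns {hashtag: mean engagement}.
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for caption, engagement in posts:
        for token in caption.split():
            if token.startswith("#"):
                totals[token] += engagement
                counts[token] += 1
    return {tag: totals[tag] / counts[tag] for tag in totals}

# Hypothetical posts: a systematically lower-reach hashtag stands out.
posts = [
    ("sunset walk #travel", 500),
    ("beach day #travel", 700),
    ("rally recap #protest", 40),
    ("march photos #protest", 60),
]
print(engagement_by_hashtag(posts))
# {'#travel': 600.0, '#protest': 50.0}
```

Large gaps between hashtags are only suggestive on their own; they would need to be controlled for follower count and posting time before being read as visibility restriction.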

3. Are visibility restrictions applied consistently across different user demographics?

  • Details: Investigate whether user demographics (e.g., account age, follower count, location) influence the likelihood of being shadowbanned.
  • Feasibility:
    • Scrape profile metadata and engagement data (subject to platform rules).
    • Use Python libraries like pandas and matplotlib for data analysis.
    • Challenges: Demographic data is not always available through APIs, potentially requiring indirect inference (e.g., location from bio or posting times).
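Once profile metadata is collected, the demographic breakdown can be sketched with pandas by bucketing accounts and comparing group means. All figures and bucket boundaries below are hypothetical:

```python
import pandas as pd

# Hypothetical per-account data: engagement rate by follower-count bucket.
df = pd.DataFrame({
    "followers": [150, 90_000, 300, 45_000, 500, 120_000],
    "engagement_rate": [0.08, 0.015, 0.07, 0.02, 0.09, 0.012],
})
df["bucket"] = pd.cut(df["followers"], bins=[0, 1_000, 50_000, 1_000_000],
                      labels=["small", "mid", "large"])
summary = df.groupby("bucket", observed=True)["engagement_rate"].mean()
print(summary)
```

The same `groupby` pattern extends to inferred location or account age; matplotlib can then plot `summary` directly.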

4. How do visibility restrictions evolve over time on Instagram?

  • Details: Track changes in content reach, engagement, or ranking policies to identify trends or patterns over time.
  • Feasibility:
    • Use archival tools like the Platform Governance Archive or the Wayback Machine.
    • Leverage Python web scraping libraries (BeautifulSoup, Selenium) for historical data collection.
    • Challenges: Historical engagement data is often sparse, so some time periods may be unrecoverable.
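Once archived pages are collected, the extraction step can be sketched with the standard library's `html.parser` (BeautifulSoup would work equally well). The `like-count` class name is a hypothetical stand-in for whatever markup the archived pages actually contain:

```python
from html.parser import HTMLParser

class MetricParser(HTMLParser):
    """Pull engagement counts out of archived post HTML.

    Assumes (hypothetically) that counts sit inside
    <span class="like-count">...</span> elements.
    """
    def __init__(self):
        super().__init__()
        self._in_count = False
        self.counts = []

    def handle_starttag(self, tag, attrs):
        if tag == "span" and ("class", "like-count") in attrs:
            self._in_count = True

    def handle_data(self, data):
        if self._in_count:
            self.counts.append(int(data.strip()))
            self._in_count = False

archived_html = ('<div><span class="like-count">412</span>'
                 '<span class="like-count">388</span></div>')
parser = MetricParser()
parser.feed(archived_html)
print(parser.counts)  # [412, 388]
```

Repeating this over Wayback Machine snapshots of the same post would yield a crude engagement time series.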

5. Are visibility restrictions correlated with specific platform policies (e.g., sensitive content control)?

  • Details: Assess the impact of policy changes on content visibility by comparing before-and-after metrics for affected content.
  • Feasibility:
    • Use APIs or data scraping to gather pre- and post-policy engagement data.
    • Statistical tests (e.g., t-tests, regression) in Python (scipy, statsmodels) can evaluate correlations.
    • Challenges: Policy dates and scope need to be precisely known.
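The before/after comparison maps directly onto a two-sample t-test in scipy. A sketch with invented per-post engagement counts around a hypothetical policy date:

```python
from scipy import stats

# Hypothetical per-post engagement before and after a policy change.
before = [320, 310, 295, 330, 305, 315, 300, 325]
after = [210, 205, 220, 195, 215, 200, 225, 190]

t_stat, p_value = stats.ttest_ind(before, after)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

With real data, a regression with controls (or an interrupted time-series design) would be more defensible than a bare t-test, since engagement trends over time even without policy changes.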

6. Can machine learning predict shadowbanning based on content and metadata?

  • Details: Build predictive models using features like hashtags, sentiment, metadata, and engagement metrics.
  • Feasibility:
    • Train models using Python libraries like scikit-learn or TensorFlow.
    • Data sources: APIs, scraped data, or labeled datasets (if created).
    • Challenges: Labeling shadowbanned content is non-trivial without platform-provided data.
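A minimal predictive-model sketch with scikit-learn, using invented feature vectors (flagged-hashtag count, sentiment score, engagement-drop ratio) in place of the real labeled dataset that would have to be built first:

```python
from sklearn.linear_model import LogisticRegression

# Hypothetical feature rows: [flagged-hashtag count, sentiment score,
# engagement-drop ratio]; label 1 = suspected shadowban.
X = [
    [3, -0.6, 0.80], [2, -0.4, 0.70], [4, -0.8, 0.90], [3, -0.5, 0.75],
    [0, 0.5, 0.10], [1, 0.3, 0.20], [0, 0.6, 0.05], [0, 0.4, 0.15],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

model = LogisticRegression().fit(X, y)
print(model.predict([[3, -0.7, 0.85], [0, 0.5, 0.1]]))  # [1 0]
```

The hard part, as noted above, is the labels: without platform-provided ground truth, `y` would have to come from heuristics or self-reports, and label noise will dominate model quality.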

7. Do shadowbanning practices disproportionately affect specific political or cultural content?

  • Details: Evaluate whether political or cultural content faces higher rates of visibility restrictions compared to other content.
  • Feasibility:
    • Use text classification (e.g., Transformers, Hugging Face) to categorize content.
    • Analyze engagement or reach discrepancies using APIs or scraping.
    • Challenges: Requires careful operationalization of “political” or “cultural” content.
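Before reaching for a Transformer model, the categorization step can be prototyped with a bag-of-words pipeline; this is a lightweight stand-in for a Hugging Face classifier, with hand-labeled captions invented for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical hand-labeled captions: 1 = political, 0 = other.
captions = [
    "vote in the election tomorrow", "new policy protest downtown",
    "rights march at city hall", "government reform rally tonight",
    "my favorite pasta recipe", "sunset at the beach today",
    "new sneaker drop this week", "puppy learned a trick",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# TF-IDF features + logistic regression: a simple, inspectable baseline.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(captions, labels)
print(clf.predict(["protest against the new policy", "beach picnic recipe"]))
```

The operationalization problem mentioned above lives entirely in how `labels` is constructed; the model only reproduces whatever definition of "political" the coders applied.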

8. How transparent are platforms like Instagram in notifying users about visibility restrictions?

  • Details: Analyze whether restricted accounts are actually notified, by checking for notification data or for engagement discrepancies that lack any accompanying explanation.
  • Feasibility:
    • Monitor account activity for any notifications using automated tools (e.g., Selenium).
    • User feedback surveys could complement computational analysis.
    • Challenges: Platform limitations on API access to such notifications.
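Once Selenium-collected notification data and engagement records are joined, the comparison logic itself is simple. A sketch flagging accounts whose reach collapsed without any notification (all field names and the 50% threshold are hypothetical):

```python
def unexplained_restrictions(accounts):
    """Accounts whose reach dropped sharply with no platform notification.

    accounts: list of dicts with hypothetical keys 'name',
    'reach_drop' (fraction of baseline lost), and 'notified' (bool).
    """
    return [a["name"] for a in accounts
            if a["reach_drop"] > 0.5 and not a["notified"]]

accounts = [
    {"name": "acct_a", "reach_drop": 0.7, "notified": False},
    {"name": "acct_b", "reach_drop": 0.6, "notified": True},
    {"name": "acct_c", "reach_drop": 0.1, "notified": False},
]
print(unexplained_restrictions(accounts))  # ['acct_a']
```

The rate of such unexplained drops, relative to all sharp drops, would be a concrete transparency metric.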

9. What role do algorithms (e.g., sentiment analysis or keyword matching) play in shadowbanning?

  • Details: Reverse-engineer platform algorithms using adversarial testing or controlled experiments (e.g., posting test content).
  • Feasibility:
    • Post test cases systematically and measure engagement.
    • Use machine learning explainability tools (e.g., SHAP, LIME) to infer algorithmic decisions.
    • Challenges: Ethical and legal considerations in adversarial testing.
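SHAP and LIME give per-prediction attributions, but the core idea, measuring how much each feature drives a model's decisions, can be sketched more cheaply with scikit-learn's permutation importance. The features and "restricted" label below are synthetic:

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Synthetic posts: feature 0 = flagged-keyword count (drives the label),
# feature 1 = posting hour (pure noise).
X = np.column_stack([rng.integers(0, 5, 200), rng.integers(0, 24, 200)])
y = (X[:, 0] >= 2).astype(int)  # hypothetical "restricted" label

model = LogisticRegression().fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
print(result.importances_mean)  # the keyword feature dominates
```

Applied to a model trained on real test-post outcomes, a dominant feature would be evidence (not proof) that the platform's ranking reacts to that signal.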

10. How does the sensitive content opt-in feature affect content discovery?

  • Details: Investigate engagement differences for content marked as "sensitive" based on user opt-in preferences.
  • Feasibility:
    • Scrape or analyze engagement trends for content flagged under sensitive categories.
    • Use Python libraries like numpy and seaborn for exploratory analysis.
    • Challenges: Access to sensitive-content-specific data may be restricted.
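A first-pass exploratory comparison needs little more than numpy; the per-post reach figures below are invented for illustration:

```python
import numpy as np

# Hypothetical per-post reach for flagged-sensitive vs unflagged content.
sensitive = np.array([120, 140, 110, 130, 125, 115])
unflagged = np.array([480, 510, 495, 505, 490, 500])

ratio = sensitive.mean() / unflagged.mean()
print(f"sensitive posts reach {ratio:.0%} of the unflagged baseline")
```

seaborn boxplots of the two distributions would make the gap, and its spread, visible at a glance.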
