Reddit AI Trend Reports is an open-source Python project that helps you automatically monitor and analyze how the Reddit community talks about artificial intelligence (AI). Point it at the subreddits you care about, and it will fetch relevant posts, summarize them with LLMs, run sentiment analysis, highlight popular keywords, and export clear visualizations. It’s useful for AI researchers, market analysts, and anyone who wants a fast read on what’s trending in AI conversations on Reddit.


Feature Highlights

  • Fetch Reddit posts: Pull the latest or top posts from chosen subreddits (e.g., r/MachineLearning, r/StableDiffusion, r/ChatGPT).
  • Keyword filtering: Keep only posts that match your user-defined keyword list.
  • Sentiment analysis: Score titles and bodies to gauge positive, negative, or neutral sentiment toward specific AI topics.
  • LLM summaries: Generate concise post summaries with Large Language Models.
    • Supports OpenAI API (e.g., GPT-3.5 models).
    • Supports Hugging Face models for open-source summarization.
  • Hot keyword detection: Surface trending terms and themes across collected posts.
  • Data visualization: Use the LIDA library to automatically create charts (bars, lines, scatter plots, word clouds, and more) for intuitive exploration.
  • Result export: Save raw data, summaries, and sentiment scores as CSV/JSON; charts are saved as image files.
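To make the "hot keyword detection" idea concrete, here is a minimal sketch of counting trending terms across post titles. This is an illustration only, not the project's actual implementation; the `titles` list, stop-word set, and minimum word length are assumptions for the example:

```python
from collections import Counter
import re

def hot_keywords(titles, top_n=5,
                 stopwords=frozenset({"the", "a", "an", "of", "for", "to", "in", "and", "is", "at", "why"})):
    """Count word frequencies across post titles and return the most common terms."""
    counts = Counter()
    for title in titles:
        # Lowercase, split on non-word characters, drop stop words and very short tokens
        words = re.findall(r"[a-z0-9']+", title.lower())
        counts.update(w for w in words if w not in stopwords and len(w) > 2)
    return counts.most_common(top_n)

titles = [
    "GPT-4 fine-tuning tips",
    "Diffusion models for video",
    "Why GPT-4 beats GPT-3.5 at reasoning",
]
print(hot_keywords(titles, top_n=3))  # 'gpt' leads with 3 mentions
```

A real pipeline would likely add stemming or n-gram handling, but the core is just frequency counting over the collected posts.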

Getting Started

Reddit AI Trend Reports runs as a Python command-line tool. You’ll need basic familiarity with Python and the terminal.

Step 1: Environment setup & installation

1. Clone the repository

git clone https://github.com/liyedanpdx/reddit-ai-trends.git
cd reddit-ai-trends

2. Create and activate a virtual environment

python -m venv venv

macOS/Linux:

source venv/bin/activate

Windows:

.\venv\Scripts\activate

You should see (venv) in your prompt after activation.

3. Install dependencies

pip install -r requirements.txt

4. Obtain API credentials

  • Reddit API (PRAW):
    1. Visit the Reddit developer page (https://www.reddit.com/prefs/apps/).
    2. Click “are you a developer? create an app…”.
    3. Choose app type script.
    4. Enter a name (e.g., RedditAITrendsTracker) and description.
    5. Set redirect uri to http://localhost:8080 (dummy is fine, but required).
    6. Click “create app”.
    7. Copy your client_id (under the app name) and client_secret (next to secret).
    8. Prepare a user_agent string (e.g., RedditAITrendsTracker by u/YourRedditUsername).
  • LLM API: The tool supports both OpenAI and Hugging Face.
    • OpenAI API Key for GPT-based summarization.
    • Hugging Face API Token for models hosted on Hugging Face.

5. Create a .env file in the project root and add your credentials (keep it private):

# Reddit API
REDDIT_CLIENT_ID='your Reddit Client ID'
REDDIT_CLIENT_SECRET='your Reddit Client Secret'
REDDIT_USER_AGENT='your Reddit User Agent'
REDDIT_USERNAME='your Reddit username'
REDDIT_PASSWORD='your Reddit password'  # Optional if you only need read access

# LLM API (use one or both)
OPENAI_API_KEY='your OpenAI API Key'
HUGGINGFACE_API_TOKEN='your Hugging Face API Token'
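Projects like this typically load the `.env` file with a library such as python-dotenv. Purely to illustrate what that loading amounts to, here is a minimal stdlib-only parser (a hypothetical helper, not the tool's code; it does not handle values that themselves contain `#`):

```python
import os

def load_env(path=".env"):
    """Parse simple KEY='value' lines into os.environ, skipping comments and blanks."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.split("#", 1)[0].strip()  # drop trailing comments
            if not line or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Strip surrounding single or double quotes from the value
            os.environ[key.strip()] = value.strip().strip("'\"")

# load_env() would then make the credentials available via os.environ
```

In practice, prefer `python-dotenv`'s `load_dotenv()`; this sketch just shows the mechanics.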

Step 2: Run the analyzer

1. Basic command

python main.py --subreddits MachineLearning

Fetches the default number of posts (50) from r/MachineLearning, with no summaries, sentiment scores, or charts.

2. Typical workflow

  • Collect and filter posts
python main.py --subreddits MachineLearning,StableDiffusion \
  --keywords "LLM,GPT-4,Diffusion" --limit 50

Grabs up to 50 posts from r/MachineLearning and r/StableDiffusion that contain any of the given keywords.
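Conceptually, the keyword filter keeps a post if any term appears in its title or body. A hedged sketch of that matching (the real logic may differ; the `post` dict fields mirror Reddit's title/selftext naming but are assumptions here):

```python
def matches_keywords(post, keywords):
    """Return True if any keyword appears (case-insensitively) in the post's title or body."""
    text = (post.get("title", "") + " " + post.get("selftext", "")).lower()
    return any(kw.lower() in text for kw in keywords)

posts = [
    {"title": "GPT-4 benchmark results", "selftext": "Details inside"},
    {"title": "Weekly discussion thread", "selftext": "Anything goes"},
]
filtered = [p for p in posts if matches_keywords(p, ["LLM", "GPT-4", "Diffusion"])]
print(len(filtered))  # 1: only the GPT-4 post survives
```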

  • Run sentiment analysis

python main.py --subreddits ChatGPT --limit 20 --sentiment_analysis

Adds sentiment scores for 20 posts from r/ChatGPT.
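The README does not say which sentiment model the tool uses. Purely to illustrate the idea of lexicon-based scoring, here is a toy scorer (the real analyzer is almost certainly more sophisticated; the word lists are invented for the example):

```python
POSITIVE = {"great", "love", "amazing", "impressive", "useful"}
NEGATIVE = {"bad", "hate", "broken", "disappointing", "useless"}

def toy_sentiment(text):
    """Classify text as positive, negative, or neutral via simple word counts."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(toy_sentiment("GPT-4 is amazing and useful"))   # positive
print(toy_sentiment("this update is disappointing"))  # negative
```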

  • Summarize with an LLM

Enable summaries via --summarize_posts, pick a backend with --llm_backend (openai or huggingface), and choose a model using --model_name.

OpenAI example:

python main.py --subreddits MachineLearning --limit 10 \
  --summarize_posts --llm_backend openai --model_name gpt-3.5-turbo \
  --summary_length 50

Hugging Face example:

python main.py --subreddits StableDiffusion --limit 10 \
  --summarize_posts --llm_backend huggingface \
  --model_name facebook/bart-large-cnn --summary_length 100

  • Auto-generate charts
python main.py --subreddits ChatGPT,MachineLearning --limit 100 \
  --visualize_data --output_dir my_results

The command collects data, creates charts with LIDA, and saves them into my_results.

  • Control output directory
python main.py --subreddits AITech --limit 30 --output_dir AI_Reports \
  --summarize_posts --visualize_data

CSV/JSON outputs and chart images will be saved under AI_Reports (created if missing).
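Result export presumably boils down to dumping the collected records into the output directory. A stdlib sketch of writing both CSV and JSON (the field names `title`, `score`, `sentiment` are illustrative, not the tool's actual schema):

```python
import csv
import json
from pathlib import Path

def export_results(posts, output_dir="results"):
    """Write posts to posts.csv and posts.json under output_dir (created if missing)."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    fields = ["title", "score", "sentiment"]
    with open(out / "posts.csv", "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fields)
        writer.writeheader()
        writer.writerows(posts)
    (out / "posts.json").write_text(json.dumps(posts, indent=2), encoding="utf-8")
```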


Command-line Arguments

  • --subreddits (required): Comma-separated subreddit names.
  • --keywords (optional): Comma-separated keyword list for filtering.
  • --limit (optional): Max number of posts to fetch (default: 50).
  • --llm_backend (optional): Choose openai or huggingface when --summarize_posts is enabled.
  • --model_name (optional): LLM name (e.g., gpt-3.5-turbo or facebook/bart-large-cnn).
  • --summary_length (optional): Target summary length in words (default: 100).
  • --output_dir (optional): Directory for results and charts (default: results).
  • --sentiment_analysis (optional): Enable sentiment scoring.
  • --summarize_posts (optional): Enable post summarization.
  • --visualize_data (optional): Generate data visualizations.

Mix and match parameters to fit your research or reporting needs.
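The flags above suggest an argparse-style interface. A hedged sketch of how such arguments might be declared, including splitting the comma-separated values (the names mirror the flags, but this is not the project's actual parser):

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="Reddit AI trend analyzer (illustrative)")
    # type=lambda splits "A,B,C" into ["A", "B", "C"] at parse time
    parser.add_argument("--subreddits", required=True,
                        type=lambda s: s.split(","), help="Comma-separated subreddit names")
    parser.add_argument("--keywords", type=lambda s: s.split(","), default=None)
    parser.add_argument("--limit", type=int, default=50)
    parser.add_argument("--output_dir", default="results")
    parser.add_argument("--sentiment_analysis", action="store_true")
    parser.add_argument("--summarize_posts", action="store_true")
    parser.add_argument("--visualize_data", action="store_true")
    return parser

args = build_parser().parse_args(["--subreddits", "MachineLearning,ChatGPT", "--limit", "20"])
print(args.subreddits)  # ['MachineLearning', 'ChatGPT']
print(args.limit)       # 20
```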


Use Cases

  1. AI researchers tracking hot topics: Monitor r/MachineLearning or r/ArtificialIntelligence to spot new findings, notable algorithms, and emerging trends.
  2. Market analysts measuring product sentiment: Watch r/ChatGPT or a product-specific community to understand reactions to releases, updates, and competitors.
  3. Content creators hunting for timely ideas: Identify trending themes and keywords to produce AI content that resonates with readers.
  4. Developers monitoring framework/tool feedback: Track framework/tool subreddits (e.g., TensorFlow, PyTorch, Stable Diffusion) to gather issues, feature requests, and user experiences.

FAQ

1) How do I get Reddit API credentials?
Go to https://www.reddit.com/prefs/apps/, create a script app, and fill in the required fields. After creation you’ll see your client_id and client_secret, and you should set a clear user_agent.

2) Why do I need an LLM API key?
The tool uses LLMs to summarize Reddit posts. If you plan to use OpenAI’s GPT models or Hugging Face models, provide the corresponding API Key or Token.

3) Which LLMs are supported for summarization?
With OpenAI, you can use gpt-3.5-turbo and other supported models. With Hugging Face, select a summarization-ready model such as facebook/bart-large-cnn, and pass its name via --model_name.

4) How do I specify multiple subreddits or keywords?
Use comma-separated values without spaces. Examples: --subreddits MachineLearning,ChatGPT or --keywords "LLM,Diffusion".

5) Can beginners use this tool?
It’s a Python CLI, so you’ll need basic Python setup skills, virtual environments, and comfort with command-line arguments. If you’re brand new, take a moment to learn those basics first.
