Reddit AI Trend Reports is an open-source Python project that helps you automatically monitor and analyze how the Reddit community talks about artificial intelligence (AI). Point it at the subreddits you care about, and it will fetch relevant posts, summarize them with LLMs, run sentiment analysis, highlight popular keywords, and export clear visualizations. It’s useful for AI researchers, market analysts, and anyone who wants a fast read on what’s trending in AI conversations on Reddit.
Feature Highlights
- Fetch Reddit posts: Pull the latest or top posts from chosen subreddits (e.g., r/MachineLearning, r/StableDiffusion, r/ChatGPT).
- Keyword filtering: Keep only posts that match your user-defined keyword list.
- Sentiment analysis: Score titles and bodies to gauge positive, negative, or neutral sentiment toward specific AI topics.
- LLM summaries: Generate concise post summaries with Large Language Models.
  - Supports the OpenAI API (e.g., GPT-3.5 models).
  - Supports Hugging Face models for open-source summarization.
- Hot keyword detection: Surface trending terms and themes across collected posts.
- Data visualization: Use the LIDA library to automatically create charts (bars, lines, scatter plots, word clouds, and more) for intuitive exploration.
- Result export: Save raw data, summaries, and sentiment scores as CSV/JSON; charts are saved as image files.
Getting Started
Reddit AI Trend Reports runs as a Python command-line tool. You’ll need basic familiarity with Python and the terminal.
Step 1: Environment setup & installation
1. Clone the repository
git clone https://github.com/liyedanpdx/reddit-ai-trends.git
cd reddit-ai-trends
2. Create and activate a virtual environment
python -m venv venv
macOS/Linux:
source venv/bin/activate
Windows:
.\venv\Scripts\activate
You should see `(venv)` in your prompt after activation.
3. Install dependencies
pip install -r requirements.txt
4. Obtain API credentials
- Reddit API (PRAW):
  - Visit the Reddit developer page.
  - Click “are you a developer? create an app…”.
  - Choose the app type `script`.
  - Enter a name (e.g., `RedditAITrendsTracker`) and a description.
  - Set the `redirect uri` to `http://localhost:8080` (a dummy value is fine, but the field is required).
  - Click “create app”.
  - Copy your `client_id` (shown under the app name) and `client_secret` (next to `secret`).
  - Prepare a `user_agent` string (e.g., `RedditAITrendsTracker by u/YourRedditUsername`).
- LLM API: The tool supports both OpenAI and Hugging Face.
  - OpenAI API Key for GPT-based summarization.
  - Hugging Face API Token for models hosted on Hugging Face.
5. Create a `.env` file in the project root and add your credentials (keep the file private):
# Reddit API
REDDIT_CLIENT_ID='your Reddit Client ID'
REDDIT_CLIENT_SECRET='your Reddit Client Secret'
REDDIT_USER_AGENT='your Reddit User Agent'
REDDIT_USERNAME='your Reddit username' # Optional if you only need read access
REDDIT_PASSWORD='your Reddit password' # Optional if you only need read access
# LLM API (use one or both)
OPENAI_API_KEY='your OpenAI API Key'
HUGGINGFACE_API_TOKEN='your Hugging Face API Token'
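The project most likely loads this file with a package such as python-dotenv. As an illustration of what that step does, here is a minimal stand-in parser using only the standard library (the `load_env` helper is our own, not part of the project):

```python
import os

def load_env(path=".env"):
    """Minimal .env parser: KEY='value' lines become environment variables.
    (The real project likely uses the python-dotenv package instead.)"""
    with open(path) as fh:
        for line in fh:
            line = line.split("#", 1)[0].strip()  # drop comments and blank lines
            if "=" not in line:
                continue
            key, value = line.split("=", 1)
            os.environ[key.strip()] = value.strip().strip("'\"")

# After load_env(), credentials are available to the rest of the program,
# e.g. os.environ["REDDIT_CLIENT_ID"]
```

Once the variables are in the environment, PRAW and the LLM clients can pick them up without any secrets appearing in the code.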
Step 2: Run the analyzer
1. Basic command
python main.py --subreddits MachineLearning
Fetches a default number of posts from r/MachineLearning, without summaries, sentiment scores, or charts.
2. Typical workflow
- Collect and filter posts
python main.py --subreddits MachineLearning,StableDiffusion \
--keywords "LLM,GPT-4,Diffusion" --limit 50
Grabs up to 50 posts from r/MachineLearning and r/StableDiffusion that contain any of the given keywords.
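Conceptually, the keyword filter boils down to a case-insensitive substring match over each post's title and body. A minimal sketch (the `title`/`body` field names and the `filter_posts` helper are assumptions, not the project's actual schema):

```python
def filter_posts(posts, keywords):
    """Keep posts whose title or body contains any keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    return [
        p for p in posts
        if any(k in (p["title"] + " " + p.get("body", "")).lower() for k in wanted)
    ]

posts = [
    {"title": "GPT-4 benchmarks", "body": "New results"},
    {"title": "Cat pictures", "body": "Nothing about AI"},
    {"title": "Diffusion tricks", "body": ""},
]
print(filter_posts(posts, ["LLM", "GPT-4", "Diffusion"]))
```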
- Run sentiment analysis
python main.py --subreddits ChatGPT --limit 20 --sentiment_analysis
Adds sentiment scores for 20 posts from r/ChatGPT.
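The README doesn't pin down which sentiment backend is used; a real pipeline would typically rely on a library such as NLTK's VADER. As a toy illustration of the idea, here is a tiny lexicon-based scorer (the word lists and `sentiment` helper are invented for this example):

```python
POSITIVE = {"great", "love", "amazing", "impressive", "useful"}
NEGATIVE = {"broken", "hate", "terrible", "worse", "useless"}

def sentiment(text):
    """Toy lexicon scorer: (positive - negative) word counts, mapped to a label.
    Illustration only -- a real pipeline would use something like NLTK's VADER."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("GPT-4 is amazing and useful"))  # positive
```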
- Summarize with an LLM
Enable summaries via `--summarize_posts`, pick a backend with `--llm_backend` (`openai` or `huggingface`), and choose a model using `--model_name`.
OpenAI example:
python main.py --subreddits MachineLearning --limit 10 \
--summarize_posts --llm_backend openai --model_name gpt-3.5-turbo \
--summary_length 50
Hugging Face example:
python main.py --subreddits StableDiffusion --limit 10 \
--summarize_posts --llm_backend huggingface \
--model_name facebook/bart-large-cnn --summary_length 100
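A `--llm_backend` switch like this usually maps onto a small dispatch table. Here is a hedged sketch of that design, with stub summarizers standing in for real OpenAI / Hugging Face calls (which would need the API credentials set up earlier; all names are illustrative, not the project's actual code):

```python
def summarize_openai(text, model_name, max_words):
    # A real implementation would call the OpenAI chat completions API here.
    return f"[openai:{model_name}] " + " ".join(text.split()[:max_words])

def summarize_huggingface(text, model_name, max_words):
    # A real implementation would call a Hugging Face summarization pipeline here.
    return f"[huggingface:{model_name}] " + " ".join(text.split()[:max_words])

BACKENDS = {"openai": summarize_openai, "huggingface": summarize_huggingface}

def summarize(text, backend="openai", model_name="gpt-3.5-turbo", max_words=50):
    """Dispatch to the chosen backend; unknown backends raise a clear error."""
    try:
        return BACKENDS[backend](text, model_name, max_words)
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}") from None
```

The benefit of this design is that adding a new backend is a one-line change to the `BACKENDS` table.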
- Auto-generate charts
python main.py --subreddits ChatGPT,MachineLearning --limit 100 \
--visualize_data --output_dir my_results
The command collects the data, creates charts with LIDA, and saves them into `my_results`.
- Control output directory
python main.py --subreddits AITech --limit 30 --output_dir AI_Reports \
--summarize_posts --visualize_data
CSV/JSON outputs and chart images will be saved under `AI_Reports` (created if missing).
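Result export likely amounts to writing the collected records as JSON and CSV into the output directory, creating it on demand. A sketch under that assumption (the `export_results` helper and the `title`/`score`/`sentiment` column names are illustrative, not the tool's real schema):

```python
import csv
import json
from pathlib import Path

def export_results(posts, output_dir="results"):
    """Write posts to both JSON and CSV under output_dir (created if missing).
    Column names are illustrative -- the tool's real schema may differ."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "posts.json").write_text(json.dumps(posts, indent=2))
    with open(out / "posts.csv", "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["title", "score", "sentiment"])
        writer.writeheader()
        writer.writerows(posts)
    return out
```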
Command-line Arguments
- `--subreddits` (required): Comma-separated subreddit names.
- `--keywords` (optional): Comma-separated keyword list for filtering.
- `--limit` (optional): Max number of posts to fetch (default: `50`).
- `--llm_backend` (optional): Choose `openai` or `huggingface` when `--summarize_posts` is enabled.
- `--model_name` (optional): LLM name (e.g., `gpt-3.5-turbo`, `facebook/bart-large-cnn`).
- `--summary_length` (optional): Target summary length in words (default: `100`).
- `--output_dir` (optional): Directory for results and charts (default: `results`).
- `--sentiment_analysis` (optional): Enable sentiment scoring.
- `--summarize_posts` (optional): Enable post summarization.
- `--visualize_data` (optional): Generate data visualizations.
Mix and match parameters to fit your research or reporting needs.
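These flags map naturally onto Python's argparse module. The following is a guess at how `main.py` might declare its parser, not a copy of the actual implementation:

```python
import argparse

def build_parser():
    """Declare the CLI flags listed above (a sketch, not the project's code)."""
    p = argparse.ArgumentParser(description="Reddit AI Trend Reports")
    p.add_argument("--subreddits", required=True,
                   help="Comma-separated subreddit names")
    p.add_argument("--keywords", default="", help="Comma-separated keywords")
    p.add_argument("--limit", type=int, default=50)
    p.add_argument("--llm_backend", choices=["openai", "huggingface"])
    p.add_argument("--model_name", default="gpt-3.5-turbo")
    p.add_argument("--summary_length", type=int, default=100)
    p.add_argument("--output_dir", default="results")
    p.add_argument("--sentiment_analysis", action="store_true")
    p.add_argument("--summarize_posts", action="store_true")
    p.add_argument("--visualize_data", action="store_true")
    return p

# Parse a sample command line; comma-separated values split into lists afterwards
args = build_parser().parse_args(
    ["--subreddits", "MachineLearning,ChatGPT", "--limit", "20", "--sentiment_analysis"]
)
print(args.subreddits.split(","))
```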
Use Cases
- AI researchers tracking hot topics: Monitor r/MachineLearning or r/ArtificialIntelligence to spot new findings, notable algorithms, and emerging trends.
- Market analysts measuring product sentiment: Watch r/ChatGPT or a product-specific community to understand reactions to releases, updates, and competitors.
- Content creators hunting for timely ideas: Identify trending themes and keywords to produce AI content that resonates with readers.
- Developers monitoring framework/tool feedback: Track framework/tool subreddits (e.g., TensorFlow, PyTorch, Stable Diffusion) to gather issues, feature requests, and user experiences.
FAQ
1) How do I get Reddit API credentials?
Go to https://www.reddit.com/prefs/apps/, create a `script` app, and fill in the required fields. After creation you’ll see your `client_id` and `client_secret`; also set a clear `user_agent` string.
2) Why do I need an LLM API key?
The tool uses LLMs to summarize Reddit posts. If you plan to use OpenAI’s GPT models or Hugging Face models, provide the corresponding API Key or Token.
3) Which LLMs are supported for summarization?
With OpenAI, you can use `gpt-3.5-turbo` and other supported models. With Hugging Face, select a summarization-ready model such as `facebook/bart-large-cnn` and pass its name via `--model_name`.
4) How do I specify multiple subreddits or keywords?
Use comma-separated values without spaces. Examples: `--subreddits MachineLearning,ChatGPT` or `--keywords "LLM,Diffusion"`.
5) Can beginners use this tool?
It’s a Python CLI, so you’ll need basic Python setup skills, virtual environments, and comfort with command-line arguments. If you’re brand new, take a moment to learn those basics first.