2026-02-02 21:45:41 +01:00
# Website Change Alert
A Python tool that monitors websites for content changes and sends email notifications.
## Features
- ✅ Monitor any website URL
- ✅ Hash-based change detection
- ✅ Optional XPath or CSS selector support to monitor specific page sections
- ✅ Email notifications via SMTP
- ✅ Environment-based configuration
- ✅ Designed for cron scheduling
## Setup
1. **Install dependencies:**
```bash
poetry install
```
2. **Configure environment:**
```bash
cp .env.example .env
# Edit .env with your settings
```
3. **Run manually to test:**
```bash
poetry run python alert.py
```
## Configuration
Edit `.env` with your settings:
### Required Settings
- `URL`: The website to monitor
- `SMTP_HOST`: SMTP server hostname (e.g., `smtp.gmail.com`)
- `SMTP_USER`: Your SMTP username
- `SMTP_PASSWORD`: Your SMTP password or app-specific password
- `FROM_EMAIL`: Sender email address
- `TO_EMAIL`: Recipient email address
### Optional Settings
- `SELECTOR`: XPath or CSS selector to monitor specific content (leave empty for full page)
- `SELECTOR_TYPE`: Type of selector - `xpath` (default) or `css`
- `CACHE_FILE`: Location to store hash cache (default: `.cache/hash.txt`)
- `SMTP_PORT`: SMTP port (default: `587`)
- `SMTP_USE_TLS`: Use TLS encryption (default: `true`)
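
The settings above could be read and validated along these lines. This is a minimal sketch, not the code in `alert.py`: the `load_settings` helper is hypothetical, and it assumes the values from `.env` are already present in the process environment (how the tool loads the file is not shown here).

```python
import os

# Setting names and defaults follow the lists above; the loader itself is hypothetical.
REQUIRED = ("URL", "SMTP_HOST", "SMTP_USER", "SMTP_PASSWORD", "FROM_EMAIL", "TO_EMAIL")
DEFAULTS = {
    "SELECTOR": "",
    "SELECTOR_TYPE": "xpath",
    "CACHE_FILE": ".cache/hash.txt",
    "SMTP_PORT": "587",
    "SMTP_USE_TLS": "true",
}


def load_settings(env=None):
    """Return a dict of settings, raising ValueError if a required one is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))

    settings = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        # An empty value is treated the same as an unset one.
        settings[name] = env.get(name) or default

    # Convert the string values to the types the rest of the code would want.
    settings["SMTP_PORT"] = int(settings["SMTP_PORT"])
    settings["SMTP_USE_TLS"] = settings["SMTP_USE_TLS"].strip().lower() in ("1", "true", "yes")
    return settings
```

Failing early on a missing required setting keeps a misconfigured cron job from running silently without ever being able to send mail.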
## Usage Examples
### Monitor entire page:
```env
URL=https://example.com/news
SELECTOR=
```
### Monitor specific section with XPath:
```env
URL=https://example.com/news
SELECTOR=//div[@class='main-content']
SELECTOR_TYPE=xpath
```
### Monitor specific section with CSS selector:
```env
URL=https://example.com/news
SELECTOR=.main-content
SELECTOR_TYPE=css
```
## Scheduling with Cron
Add to your crontab (`crontab -e`):
```cron
# Check every hour
0 * * * * cd /path/to/alert-website-change && /path/to/poetry run python alert.py
# Check every 15 minutes
*/15 * * * * cd /path/to/alert-website-change && /path/to/poetry run python alert.py
# Check daily at 9 AM
0 9 * * * cd /path/to/alert-website-change && /path/to/poetry run python alert.py
```
Cron runs jobs with a minimal `PATH`, so use the full path to poetry. To find it:
```bash
which poetry
```
## Email Provider Setup
### Gmail
1. Enable 2-factor authentication
2. Generate an app-specific password: https://myaccount.google.com/apppasswords
3. Use the app password in `SMTP_PASSWORD`
### Other Providers
Common SMTP settings:
- **Gmail**: `smtp.gmail.com:587` (TLS)
- **Outlook**: `smtp-mail.outlook.com:587` (TLS)
- **Yahoo**: `smtp.mail.yahoo.com:587` (TLS)
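
To check SMTP credentials independently of the tool, a short script using Python's standard `smtplib` may help. This is a sketch under assumptions: the function names, subject line, and message body are illustrative and are not taken from `alert.py`.

```python
import smtplib
from email.message import EmailMessage


def build_message(from_email, to_email, url):
    """Build a plain-text notification (subject and body wording are illustrative)."""
    msg = EmailMessage()
    msg["Subject"] = f"Website changed: {url}"
    msg["From"] = from_email
    msg["To"] = to_email
    msg.set_content(f"A change was detected at {url}.")
    return msg


def send_message(msg, host, port, user, password, use_tls=True):
    """Send msg through an SMTP server, upgrading to TLS first when use_tls is set."""
    with smtplib.SMTP(host, port, timeout=30) as server:
        if use_tls:
            server.starttls()  # port 587 expects STARTTLS before login
        server.login(user, password)
        server.send_message(msg)
```

For example, `send_message(build_message(FROM_EMAIL, TO_EMAIL, URL), SMTP_HOST, 587, SMTP_USER, SMTP_PASSWORD)` with your own values should deliver a test message or raise an `smtplib` exception describing what went wrong.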
## How It Works
1. **First run**: Fetches the page, computes a hash, and saves it to the cache
2. **Subsequent runs**:
   - Fetches the page content
   - Computes a hash of the current content
   - Compares it with the cached hash
   - If different: sends an email and updates the cache
   - If same: exits silently
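
The compare-and-update logic described above can be sketched as follows. The function names are hypothetical and SHA-256 is an assumption; the README does not say which hash `alert.py` uses.

```python
import hashlib
from pathlib import Path


def content_hash(content: bytes) -> str:
    """Hash the fetched content (SHA-256 is an assumption, not confirmed by the tool)."""
    return hashlib.sha256(content).hexdigest()


def check_for_change(content: bytes, cache_file: Path) -> bool:
    """Return True if content differs from the cached hash, updating the cache as needed."""
    new_hash = content_hash(content)

    if not cache_file.exists():
        # First run: store the baseline hash and report no change.
        cache_file.parent.mkdir(parents=True, exist_ok=True)
        cache_file.write_text(new_hash)
        return False

    old_hash = cache_file.read_text().strip()
    if old_hash == new_hash:
        return False  # unchanged: the caller exits silently

    cache_file.write_text(new_hash)
    return True  # changed: the caller sends the email
```

Because only the hash is stored, the cache stays small, but the tool can report that something changed, not what changed.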
## Troubleshooting
### Import errors during development
Run `poetry install` to install all dependencies.
### No email received
- Check spam folder
- Verify SMTP credentials
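- Run the script manually (`poetry run python alert.py`) and check its output for errors
- Check cron logs: `grep CRON /var/log/syslog` (Debian/Ubuntu; other Linux distributions may log to `/var/log/cron` or the systemd journal) or `log show --predicate 'process == "cron"' --last 1h` (macOS)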
### XPath/CSS selector returns nothing
- Test your selector in browser DevTools
- Append `//text()` to the XPath to select text nodes instead of elements
- Verify the selector matches elements on the page