basic functionality
132
README.md
Normal file
@@ -0,0 +1,132 @@
# Website Change Alert

A Python tool that monitors websites for content changes and sends email notifications.

## Features

- ✅ Monitor any website URL
- ✅ Hash-based change detection
- ✅ Optional XPath or CSS selector support to monitor specific page sections
- ✅ Email notifications via SMTP
- ✅ Environment-based configuration
- ✅ Designed for cron scheduling

## Setup

1. **Install dependencies:**
   ```bash
   poetry install
   ```

2. **Configure environment:**
   ```bash
   cp .env.example .env
   # Edit .env with your settings
   ```

3. **Run manually to test:**
   ```bash
   poetry run python alert.py
   ```

## Configuration

Edit `.env` with your settings:

### Required Settings

- `URL`: The website to monitor
- `SMTP_HOST`: SMTP server hostname (e.g., `smtp.gmail.com`)
- `SMTP_USER`: Your SMTP username
- `SMTP_PASSWORD`: Your SMTP password or app-specific password
- `FROM_EMAIL`: Sender email address
- `TO_EMAIL`: Recipient email address

### Optional Settings

- `SELECTOR`: XPath or CSS selector to monitor specific content (leave empty for full page)
- `SELECTOR_TYPE`: Type of selector - `xpath` (default) or `css`
- `CACHE_FILE`: Location to store hash cache (default: `.cache/hash.txt`)
- `SMTP_PORT`: SMTP port (default: `587`)
- `SMTP_USE_TLS`: Use TLS encryption (default: `true`)
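
As an illustration of how these settings and their defaults fit together, here is a minimal sketch of a loader. It is not taken from `alert.py`: the `Settings` class and `load_settings` function are hypothetical names, and the sketch assumes the values from `.env` have already been loaded into the process environment.

```python
import os
from dataclasses import dataclass

# Settings the README lists as required.
REQUIRED = ("URL", "SMTP_HOST", "SMTP_USER", "SMTP_PASSWORD", "FROM_EMAIL", "TO_EMAIL")


@dataclass(frozen=True)
class Settings:  # hypothetical container, not part of alert.py
    url: str
    smtp_host: str
    smtp_user: str
    smtp_password: str
    from_email: str
    to_email: str
    selector: str = ""
    selector_type: str = "xpath"
    cache_file: str = ".cache/hash.txt"
    smtp_port: int = 587
    smtp_use_tls: bool = True


def load_settings(env=None):
    """Build Settings from a mapping; falls back to os.environ."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))
    selector_type = (env.get("SELECTOR_TYPE") or "xpath").lower()
    if selector_type not in ("xpath", "css"):
        raise ValueError("SELECTOR_TYPE must be 'xpath' or 'css'")
    # Assumption: these spellings count as "true"; the README only shows `true`.
    use_tls = (env.get("SMTP_USE_TLS") or "true").strip().lower() in ("1", "true", "yes")
    return Settings(
        url=env["URL"],
        smtp_host=env["SMTP_HOST"],
        smtp_user=env["SMTP_USER"],
        smtp_password=env["SMTP_PASSWORD"],
        from_email=env["FROM_EMAIL"],
        to_email=env["TO_EMAIL"],
        selector=env.get("SELECTOR") or "",
        selector_type=selector_type,
        cache_file=env.get("CACHE_FILE") or ".cache/hash.txt",
        smtp_port=int(env.get("SMTP_PORT") or 587),
        smtp_use_tls=use_tls,
    )
```

Failing fast on a missing required value is a deliberate choice in this sketch: under cron, a clear error in the log is easier to diagnose than a silent SMTP failure.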

## Usage Examples

### Monitor entire page:

```env
URL=https://example.com/news
SELECTOR=
```

### Monitor specific section with XPath:

```env
URL=https://example.com/news
SELECTOR=//div[@class='main-content']
SELECTOR_TYPE=xpath
```

### Monitor specific section with CSS selector:

```env
URL=https://example.com/news
SELECTOR=.main-content
SELECTOR_TYPE=css
```
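
To make the effect of `SELECTOR` and `SELECTOR_TYPE` concrete, the sketch below narrows a document to the selected section before it would be hashed. This README does not say which parser `alert.py` uses, so treat everything here as an assumption: the sketch uses the standard library's `xml.etree.ElementTree`, which only understands well-formed markup and a limited XPath subset, and the helper names `css_class_to_xpath` and `select_text` are hypothetical. Real-world HTML usually needs a dedicated HTML parser.

```python
import xml.etree.ElementTree as ET


def css_class_to_xpath(selector):
    """Translate a bare '.class-name' CSS selector into a limited XPath."""
    if not selector.startswith(".") or " " in selector:
        raise ValueError("this sketch only handles a single '.class' selector")
    # Exact match on the class attribute; multi-class elements will not match.
    return f".//*[@class='{selector[1:]}']"


def select_text(markup, selector="", selector_type="xpath"):
    """Return the text of the selected nodes, or of the whole document."""
    root = ET.fromstring(markup)
    if not selector:
        nodes = [root]  # empty SELECTOR: use the full page
    else:
        path = css_class_to_xpath(selector) if selector_type == "css" else selector
        # ElementTree rejects absolute paths on an element, so make '//div' relative.
        if path.startswith("//"):
            path = "." + path
        nodes = root.findall(path)
    return "\n".join("".join(node.itertext()).strip() for node in nodes)


page = (
    "<html><body>"
    "<div class='sidebar'>Ads</div>"
    "<div class='main-content'><p>Headline</p></div>"
    "</body></html>"
)
print(select_text(page, "//div[@class='main-content']", "xpath"))
print(select_text(page, ".main-content", "css"))
```

Both calls select the same `div`, so a change to the sidebar would leave the selected text, and therefore its hash, unchanged.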

## Scheduling with Cron

Add to your crontab (`crontab -e`):

```cron
# Check every hour
0 * * * * cd /path/to/alert-website-change && /path/to/poetry run python alert.py

# Check every 15 minutes
*/15 * * * * cd /path/to/alert-website-change && /path/to/poetry run python alert.py

# Check daily at 9 AM
0 9 * * * cd /path/to/alert-website-change && /path/to/poetry run python alert.py
```

To get the full path to poetry:
```bash
which poetry
```

## Email Provider Setup

### Gmail
1. Enable 2-factor authentication
2. Generate an app-specific password: https://myaccount.google.com/apppasswords
3. Use the app password in `SMTP_PASSWORD`

### Other Providers
Common SMTP settings:
- **Gmail**: `smtp.gmail.com:587` (TLS)
- **Outlook**: `smtp-mail.outlook.com:587` (TLS)
- **Yahoo**: `smtp.mail.yahoo.com:587` (TLS)
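
For reference, port 587 with TLS normally means connecting in plain text and then upgrading with STARTTLS before logging in. The sketch below shows that sequence with Python's standard `smtplib`. It is an assumption about how a notification could be sent, not a copy of `alert.py`; the function names and the subject and body wording are hypothetical.

```python
import smtplib
from email.message import EmailMessage


def build_alert(url, from_email, to_email):
    """Compose the notification message (wording is illustrative)."""
    msg = EmailMessage()
    msg["Subject"] = f"Change detected: {url}"
    msg["From"] = from_email
    msg["To"] = to_email
    msg.set_content(f"The monitored content at {url} has changed.")
    return msg


def send_alert(msg, host, user, password, port=587, use_tls=True):
    """Send msg through an SMTP server, upgrading to TLS when requested."""
    with smtplib.SMTP(host, port, timeout=30) as server:
        if use_tls:
            server.starttls()  # upgrade the connection before sending credentials
        server.login(user, password)
        server.send_message(msg)
```

A call might look like `send_alert(build_alert(url, from_email, to_email), "smtp.gmail.com", user, app_password)`, with the Gmail app password from the steps above passed as `app_password`.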

## How It Works

1. **First run**: Fetches the page, computes a hash, and saves it to cache
2. **Subsequent runs**:
   - Fetches the page content
   - Computes hash of current content
   - Compares with cached hash
   - If different: sends email and updates cache
   - If same: exits silently
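
The steps above can be sketched as a small compare-and-update routine. This is a minimal illustration rather than the code in `alert.py`: the README only says "hash", so SHA-256 is an assumption, and `content_hash` and `check_for_change` are hypothetical names.

```python
import hashlib
from pathlib import Path


def content_hash(content):
    """Hex digest of the monitored content (SHA-256 is an assumption)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def check_for_change(content, cache_file):
    """Return True when content differs from the hash stored in cache_file."""
    cache = Path(cache_file)
    new_hash = content_hash(content)
    if not cache.exists():
        # First run: nothing to compare against, so just record the hash.
        cache.parent.mkdir(parents=True, exist_ok=True)
        cache.write_text(new_hash)
        return False
    if cache.read_text().strip() == new_hash:
        return False  # unchanged: the caller can exit silently
    cache.write_text(new_hash)
    return True  # changed: the caller should send the email
```

Note that this sketch updates the cache before the caller sends the email, so a failed send would not be retried on the next run. Writing the cache only after a successful send is an equally reasonable design.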

## Troubleshooting

### Import errors during development
Run `poetry install` to install all dependencies.

### No email received
- Check spam folder
- Verify SMTP credentials
- Run the script manually with `poetry run python alert.py` to rule out a cron problem
- Check cron logs: `grep CRON /var/log/syslog` (Linux) or `log show --predicate 'process == "cron"' --last 1h` (macOS)

### XPath/CSS selector returns nothing
- Test your selector in browser DevTools
- Append `//text()` to the XPath expression to select text content
- Verify the selector matches elements on the page