# Secure Web Analyzer

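A web application security and performance analyzer built with Django and Celery. It orchestrates Google Lighthouse, OWASP ZAP, Playwright, and an HTTP header scanner, and exposes the results through a web interface and a REST API.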

## Features

- **Performance Analysis**: Uses Google Lighthouse for Core Web Vitals and performance metrics
- **Security Scanning**: Integrates OWASP ZAP for vulnerability detection
- **Browser Analysis**: Playwright-based console error and network analysis
- **Header Security**: Checks HTTP security headers and TLS configuration
- **Async Processing**: Celery workers for background scan processing
- **REST API**: Full API access to all scanning functionality

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                        Frontend (Templates)                      │
│              Tailwind CSS + Alpine.js + Chart.js                │
└────────────────────────────┬────────────────────────────────────┘
                             │
┌────────────────────────────▼────────────────────────────────────┐
│                     Django REST Framework                        │
│                    /api/scans, /api/websites                    │
└────────────────────────────┬────────────────────────────────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
┌─────────▼─────────┐ ┌──────▼──────┐ ┌────────▼────────┐
│   PostgreSQL DB   │ │    Redis    │ │  Celery Worker  │
│  Scans, Issues,   │ │Message Queue│ │  Background     │
│     Metrics       │ │             │ │  Processing     │
└───────────────────┘ └─────────────┘ └────────┬────────┘
                                               │
        ┌──────────────────────────────────────┼──────────────────────┐
        │                                      │                      │
┌───────▼───────┐ ┌─────────────────┐ ┌────────▼────────┐ ┌──────────▼──────────┐
│  Lighthouse   │ │   OWASP ZAP     │ │   Playwright    │ │   Headers Scanner   │
│   (Node.js)   │ │   (Docker)      │ │   (Python)      │ │   (requests/ssl)    │
│   Port 3001   │ │   Port 8081     │ │                 │ │                     │
└───────────────┘ └─────────────────┘ └─────────────────┘ └─────────────────────┘
```

## Quick Start

### Prerequisites

- Docker & Docker Compose
- Git

### 1. Clone and Configure

```bash
git clone <repository-url>
cd secure-web

# Copy environment file
cp backend/.env.example backend/.env

# Edit .env with your settings (optional for development)
```

### 2. Start the Stack

```bash
# Build and start all services
docker-compose up --build -d

# View logs
docker-compose logs -f

# Check service status
docker-compose ps
```

### 3. Initialize Database

```bash
# Run migrations
docker-compose exec web python manage.py migrate

# Create superuser (optional)
docker-compose exec web python manage.py createsuperuser
```

### 4. Access the Application

- **Web Interface**: http://localhost:8000
- **Admin Panel**: http://localhost:8000/admin
- **API Documentation**: http://localhost:8000/api/

## Running a Scan

### Via Web Interface

1. Navigate to http://localhost:8000
2. Enter a URL in the input field (e.g., `https://example.com`)
3. Click "Scan Website"
4. Wait for the scan to complete (typically 1-3 minutes)
5. View results including scores, metrics, and issues

### Via API

```bash
# Create a new scan
curl -X POST http://localhost:8000/api/scans/ \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

# Response:
# {
#   "id": "uuid-here",
#   "url": "https://example.com",
#   "status": "pending",
#   ...
# }

# Check scan status
curl http://localhost:8000/api/scans/{scan-id}/

# List all scans
curl http://localhost:8000/api/scans/

# Get issues for a scan
curl "http://localhost:8000/api/issues/?scan={scan-id}"
```
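
The same flow can be scripted. The sketch below uses only the Python standard library to create a scan and poll it until it finishes. The function names, the polling interval, and the `running` status are assumptions for illustration; this README only documents `pending`, so check the API for the actual status values.

```python
import json
import time
import urllib.request

API_BASE = "http://localhost:8000/api"  # matches the Quick Start setup


def create_scan(url, api_base=API_BASE):
    """POST /api/scans/ and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{api_base}/scans/",
        data=json.dumps({"url": url}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def get_scan(scan_id, api_base=API_BASE):
    """GET /api/scans/{id}/ and return the decoded JSON response."""
    with urllib.request.urlopen(f"{api_base}/scans/{scan_id}/", timeout=30) as resp:
        return json.load(resp)


def wait_for_scan(scan_id, fetch=get_scan, in_progress=("pending", "running"),
                  interval=5.0, max_wait=600.0, sleep=time.sleep):
    """Poll until the scan status is no longer in `in_progress`.

    Only "pending" appears in this README; "running" is an assumed
    intermediate status.
    """
    waited = 0.0
    while True:
        scan = fetch(scan_id)
        if scan.get("status") not in in_progress:
            return scan
        if waited >= max_wait:
            raise TimeoutError(f"scan {scan_id} is still {scan.get('status')!r}")
        sleep(interval)
        waited += interval
```

A typical call would be `wait_for_scan(create_scan("https://example.com")["id"])`. The `fetch` and `sleep` parameters exist so the loop can be exercised without a running server.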

## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/scans/` | List all scans |
| POST | `/api/scans/` | Create new scan |
| GET | `/api/scans/{id}/` | Get scan details |
| GET | `/api/websites/` | List all websites |
| GET | `/api/issues/` | List all issues |
| GET | `/api/issues/?scan={id}` | Issues for specific scan |
| GET | `/api/issues/?severity=high` | Filter by severity |

## Scanner Integration

### Lighthouse (Performance)

The Lighthouse scanner runs as a separate Node.js service and provides:
- **Performance Score**: Overall performance rating
- **Core Web Vitals and lab metrics**: LCP and CLS (Core Web Vitals), plus FCP, TTI, and TBT
- **Resource Analysis**: Unused JS, render-blocking resources
- **Best Practices**: Modern web development compliance

```text
# Internal service call
POST http://lighthouse:3001/scan
{
    "url": "https://example.com",
    "options": {
        "preset": "desktop"
    }
}
```
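
For reference, here is a minimal Python sketch of how a client could issue the request shown above using only the standard library. The function names are hypothetical, and the response schema of the Lighthouse service is not documented in this README, so the decoded JSON is returned untouched.

```python
import json
import urllib.request


def build_lighthouse_request(target_url, preset="desktop",
                             service_url="http://lighthouse:3001"):
    """Build the POST /scan request without sending it."""
    payload = {"url": target_url, "options": {"preset": preset}}
    return urllib.request.Request(
        f"{service_url}/scan",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_lighthouse(target_url, timeout=120, **kwargs):
    """Send the request and return the decoded JSON body as-is."""
    req = build_lighthouse_request(target_url, **kwargs)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Separating request construction from sending keeps the payload easy to inspect in tests, where the `lighthouse` hostname does not resolve.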

### OWASP ZAP (Security)

ZAP crawls the target and passively scans the responses it collects:
- **Spider Crawling**: Discovers URLs and entry points
- **Passive Scanning**: Analyzes responses for vulnerabilities
- **Alert Detection**: XSS, injection, misconfigurations

```text
# ZAP API endpoints used
GET http://zap:8081/JSON/spider/action/scan/
GET http://zap:8081/JSON/pscan/view/recordsToScan/
GET http://zap:8081/JSON/core/view/alerts/
```
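
The sketch below shows one way these three endpoints could be addressed from Python. It relies on ZAP accepting the API key as an `apikey` query parameter and on the `url` and `baseurl` parameter names from the ZAP API. The helper names are hypothetical, and how `zap_scanner.py` actually sequences the calls is not shown in this README.

```python
from urllib.parse import urlencode


def zap_url(path, base="http://zap:8081", api_key="changeme", **params):
    """Build a ZAP JSON API URL with the API key as a query parameter."""
    query = urlencode({**params, "apikey": api_key})
    return f"{base}/JSON/{path}/?{query}"


def zap_scan_steps(target):
    """Return the three calls in the order a scan would plausibly issue them."""
    return [
        zap_url("spider/action/scan", url=target),    # start crawling
        zap_url("pscan/view/recordsToScan"),          # poll until the queue is empty
        zap_url("core/view/alerts", baseurl=target),  # collect findings
    ]
```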

### Playwright (Browser Analysis)

Playwright performs real browser analysis:
- **Console Errors**: JavaScript errors and warnings
- **Network Metrics**: Response times, failed requests
- **Memory Metrics**: JS heap size monitoring
- **Resource Loading**: Images, scripts, stylesheets
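
A minimal sketch of the console and network checks, using Playwright's synchronous Python API. It assumes Playwright and a Chromium build are installed. The function names and the returned dictionary shape are illustrative and are not taken from `playwright_scanner.py`.

```python
def is_console_problem(message_type):
    """Treat console 'error' and 'warning' messages as reportable."""
    return message_type in ("error", "warning")


def collect_page_events(url, timeout_s=60, viewport=None):
    """Load `url` in headless Chromium; return console problems and failed requests."""
    # Imported lazily so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    viewport = viewport or {"width": 1920, "height": 1080}
    console_problems, failed_requests = [], []

    def on_console(msg):
        if is_console_problem(msg.type):
            console_problems.append({"type": msg.type, "text": msg.text})

    def on_request_failed(request):
        failed_requests.append({"url": request.url, "error": request.failure})

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport=viewport)
        page.on("console", on_console)
        page.on("requestfailed", on_request_failed)
        page.goto(url, timeout=timeout_s * 1000, wait_until="load")  # timeout is in ms
        browser.close()

    return {"console": console_problems, "failed_requests": failed_requests}
```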

### Headers Scanner (HTTP Security)

Checks security headers and TLS configuration:
- **Security Headers**: CSP, HSTS, X-Frame-Options, etc.
- **Cookie Security**: Secure, HttpOnly, SameSite flags
- **TLS Certificate**: Validity, expiration, issuer
- **Information Disclosure**: Server version headers
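
The header and cookie checks can be illustrated with two small pure functions. This is a sketch, not the code in `headers_scanner.py`: the header list extends the three named above with `X-Content-Type-Options` and `Referrer-Policy` as assumed examples of the "etc.", and the cookie parser handles a single `Set-Cookie` value only.

```python
RECOMMENDED_HEADERS = (
    "Content-Security-Policy",
    "Strict-Transport-Security",
    "X-Frame-Options",
    "X-Content-Type-Options",  # assumed; not named in this README
    "Referrer-Policy",         # assumed; not named in this README
)


def missing_security_headers(response_headers):
    """Return the recommended headers absent from a response (case-insensitive)."""
    present = {name.lower() for name in response_headers}
    return [h for h in RECOMMENDED_HEADERS if h.lower() not in present]


def insecure_cookie_flags(set_cookie_value):
    """Return which of Secure, HttpOnly, SameSite are missing from one Set-Cookie value."""
    attributes = {
        part.split("=", 1)[0].strip().lower()
        for part in set_cookie_value.split(";")[1:]  # skip the name=value pair
    }
    return [flag for flag in ("Secure", "HttpOnly", "SameSite")
            if flag.lower() not in attributes]
```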

## Configuration

### Environment Variables

```bash
# Django
SECRET_KEY=your-secret-key
DEBUG=True
ALLOWED_HOSTS=localhost,127.0.0.1

# Database
DATABASE_URL=postgres://user:pass@db:5432/secure_web

# Redis
REDIS_URL=redis://redis:6379/0
CELERY_BROKER_URL=redis://redis:6379/0

# Scanner Services
LIGHTHOUSE_URL=http://lighthouse:3001
ZAP_API_URL=http://zap:8081
ZAP_API_KEY=changeme

# Scanner Timeouts
LIGHTHOUSE_TIMEOUT=120
ZAP_TIMEOUT=300
PLAYWRIGHT_TIMEOUT=60
```

### Scanner Configuration

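Scanner defaults live in `backend/core/settings.py`. URLs, timeouts, and the ZAP API key are read from the environment variables above; the remaining options are set directly in code: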

```python
import os  # needed for os.getenv below

SCANNER_CONFIG = {
    'lighthouse': {
        'url': os.getenv('LIGHTHOUSE_URL', 'http://lighthouse:3001'),
        'timeout': int(os.getenv('LIGHTHOUSE_TIMEOUT', '120')),
        'preset': 'desktop',  # or 'mobile'
    },
    'zap': {
        'url': os.getenv('ZAP_API_URL', 'http://zap:8081'),
        'api_key': os.getenv('ZAP_API_KEY', 'changeme'),
        'timeout': int(os.getenv('ZAP_TIMEOUT', '300')),
        'spider_max_depth': 3,
    },
    'playwright': {
        'timeout': int(os.getenv('PLAYWRIGHT_TIMEOUT', '60')),
        'viewport': {'width': 1920, 'height': 1080},
    },
    'headers': {
        'timeout': 30,
        'verify_ssl': True,
    },
}
```
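
Note that `int(os.getenv(...))` raises a bare `ValueError` at import time if a timeout variable is set to a non-numeric value. If clearer failures are wanted, a hypothetical helper along these lines could replace the inline conversions. `env_int` is not part of the project.

```python
import os


def env_int(name, default, minimum=1):
    """Read an integer environment variable, naming the variable on failure."""
    raw = os.getenv(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None
    if value < minimum:
        raise ValueError(f"{name} must be >= {minimum}, got {value}")
    return value
```

For example, `'timeout': env_int('ZAP_TIMEOUT', 300)` would behave like the existing line for valid values.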

## Development

### Running Locally (without Docker)

```bash
# Backend setup
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

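# Set environment
export DATABASE_URL=postgres://user:pass@localhost:5432/secure_web
export REDIS_URL=redis://localhost:6379/0
export CELERY_BROKER_URL=redis://localhost:6379/0
# Scanner URLs default to the Docker hostnames (lighthouse, zap);
# set LIGHTHOUSE_URL and ZAP_API_URL if those services run elsewhere.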

# Run migrations
python manage.py migrate

# Start Django
python manage.py runserver

# Start Celery (separate terminal)
celery -A core worker -l INFO

# Start Celery Beat (separate terminal)
celery -A core beat -l INFO
```

### Running Tests

```bash
# Run all tests
docker-compose exec web pytest

# Run specific test file
docker-compose exec web pytest tests/test_validators.py -v

# Run with coverage
docker-compose exec web pytest --cov=. --cov-report=html

# Local testing
cd backend
pytest tests/ -v
```

### Code Structure

```
secure-web/
├── backend/
│   ├── core/                  # Django project settings
│   │   ├── settings.py
│   │   ├── urls.py
│   │   ├── celery.py
│   │   └── wsgi.py
│   ├── websites/              # Main app - models
│   │   ├── models.py          # Website, Scan, Issue, Metric
│   │   └── admin.py
│   ├── api/                   # DRF API
│   │   ├── views.py
│   │   ├── serializers.py
│   │   └── urls.py
│   ├── scanner/               # Scanner modules
│   │   ├── base.py            # BaseScanner ABC
│   │   ├── validators.py      # URL validation, SSRF protection
│   │   ├── headers_scanner.py
│   │   ├── lighthouse_scanner.py
│   │   ├── playwright_scanner.py
│   │   ├── zap_scanner.py
│   │   ├── runner.py          # Orchestrator
│   │   └── tasks.py           # Celery tasks
│   ├── templates/             # Frontend templates
│   │   ├── base.html
│   │   ├── index.html
│   │   └── scan_detail.html
│   └── tests/                 # Unit tests
│       ├── test_validators.py
│       ├── test_scans.py
│       └── test_scanner_parsing.py
├── lighthouse/                # Lighthouse Node.js service
│   ├── server.js
│   ├── package.json
│   └── Dockerfile
└── docker-compose.yml
```
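
To make the layout concrete, here is a hypothetical sketch of how an abstract base class and an orchestrator could fit together. The real interfaces live in `scanner/base.py` and `scanner/runner.py` and may differ. The method name `scan`, the issue dictionary shape, and `run_all` are all assumptions.

```python
from abc import ABC, abstractmethod


class BaseScanner(ABC):
    """Hypothetical scanner contract: one scan() method returning issue dicts."""

    name = "base"

    def __init__(self, config=None):
        self.config = config or {}

    @abstractmethod
    def scan(self, url):
        """Return a list of issue dicts for `url`."""


class EchoScanner(BaseScanner):
    """Toy subclass used only to show the contract."""

    name = "echo"

    def scan(self, url):
        return [{"category": "headers", "severity": "info", "title": f"Scanned {url}"}]


def run_all(scanners, url):
    """Run each scanner in turn, recording failures instead of aborting."""
    results = {}
    for scanner in scanners:
        try:
            results[scanner.name] = {"issues": scanner.scan(url), "error": None}
        except Exception as exc:  # one failing scanner should not sink the others
            results[scanner.name] = {"issues": [], "error": str(exc)}
    return results
```

Isolating each scanner's failure matters here because the four tools have very different failure modes, and a ZAP timeout should not discard Lighthouse results.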

## Issue Categories

| Category | Source | Description |
|----------|--------|-------------|
| `performance` | Lighthouse | Speed, loading, rendering issues |
| `security` | ZAP, Headers | Vulnerabilities, misconfigurations |
| `accessibility` | Lighthouse | WCAG compliance issues |
| `seo` | Lighthouse | Search optimization issues |
| `best_practices` | Lighthouse | Modern web standards |
| `console_errors` | Playwright | JavaScript runtime errors |
| `network` | Playwright | Failed requests, slow responses |
| `headers` | Headers | Missing security headers |
| `tls` | Headers | Certificate issues |
| `cookies` | Headers | Insecure cookie settings |

## Issue Severities

| Level | Color | Description |
|-------|-------|-------------|
| `critical` | Red | Immediate action required |
| `high` | Orange | Significant security/performance risk |
| `medium` | Yellow | Should be addressed |
| `low` | Blue | Minor improvement |
| `info` | Gray | Informational only |
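
When consuming issues from the API, the five levels can be ranked client-side. The sketch below assumes each issue is a dictionary with a `severity` key holding one of the values in the table; unknown values sort last.

```python
SEVERITY_ORDER = ["critical", "high", "medium", "low", "info"]
_RANK = {level: index for index, level in enumerate(SEVERITY_ORDER)}


def sort_by_severity(issues):
    """Return issues ordered from most to least severe; unknown levels go last."""
    return sorted(
        issues,
        key=lambda issue: _RANK.get(issue.get("severity"), len(SEVERITY_ORDER)),
    )


def count_by_severity(issues):
    """Count issues per known severity level."""
    counts = {level: 0 for level in SEVERITY_ORDER}
    for issue in issues:
        level = issue.get("severity")
        if level in counts:
            counts[level] += 1
    return counts
```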

## Troubleshooting

### Common Issues

**Services not starting:**
```bash
# Check logs
docker-compose logs web
docker-compose logs celery_worker
docker-compose logs lighthouse
docker-compose logs zap

# Restart services
docker-compose restart
```

**Database connection errors:**
```bash
# Wait for DB to be ready (wait_for_db is a project-specific command, not a Django built-in)
docker-compose exec web python manage.py wait_for_db

# Check DB status
docker-compose exec db psql -U secure_web -c "\l"
```

**ZAP not responding:**
```bash
# ZAP can take 30-60 seconds to start; check its logs first
docker-compose logs zap

# Check ZAP status
curl http://localhost:8081/JSON/core/view/version/
```

**Scan stuck in pending:**
```bash
# Check Celery worker
docker-compose logs celery_worker

# Restart worker
docker-compose restart celery_worker
```

### Performance Tips

- For production, use a dedicated ZAP instance
- Consider caching Lighthouse results for repeated scans
- Adjust timeouts based on target website complexity
- Use Redis persistence for task queue durability

## Security Considerations

- URL validation includes SSRF protection (blocks private IPs)
- ZAP API key should be changed in production
- Consider rate limiting scan endpoints
- Validate and sanitize all user input, especially submitted scan URLs
- Run containers with minimal privileges
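
To illustrate the SSRF point, the sketch below rejects URLs whose host resolves to a private, loopback, link-local, or otherwise non-public address. It is an independent example using the standard `ipaddress` and `socket` modules, not the code in `scanner/validators.py`, and the function names are hypothetical.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_ip(address):
    """Return True only for addresses that are safe to scan from a server."""
    ip = ipaddress.ip_address(address.split("%")[0])  # drop any IPv6 scope id
    return not (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_multicast or ip.is_reserved or ip.is_unspecified)


def validate_scan_url(url, resolve=socket.getaddrinfo):
    """Raise ValueError unless `url` is http(s) and resolves only to public IPs."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("only http(s) URLs with a host are allowed")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    infos = resolve(parts.hostname, port, proto=socket.IPPROTO_TCP)
    addresses = {info[4][0] for info in infos}
    blocked = sorted(a for a in addresses if not is_public_ip(a))
    if blocked:
        raise ValueError(f"{parts.hostname} resolves to non-public address(es): {blocked}")
    return url
```

A check like this only covers the moment of validation. A hostname can resolve differently when the scanner later connects, so pinning the resolved address or restricting egress at the network level is a useful complement.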

## License

MIT License - See LICENSE file for details.

## Contributing

1. Fork the repository
2. Create a feature branch
3. Write tests for new functionality
4. Submit a pull request

## Support

For issues and feature requests, please use the GitHub issue tracker.