Secure Web Analyzer

A web application security and performance analyzer built with Django and Celery. It combines Google Lighthouse, OWASP ZAP, Playwright, and an HTTP header/TLS scanner.

Features

Performance Analysis: Uses Google Lighthouse for Core Web Vitals and performance metrics
Security Scanning: Integrates OWASP ZAP for vulnerability detection
Browser Analysis: Playwright-based console error and network analysis
Header Security: Checks HTTP security headers and TLS configuration
Async Processing: Celery workers for background scan processing
REST API: Full API access to all scanning functionality

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Frontend (Templates)                      │
│              Tailwind CSS + Alpine.js + Chart.js                │
└────────────────────────────┬────────────────────────────────────┘
                             │
┌────────────────────────────▼────────────────────────────────────┐
│                     Django REST Framework                        │
│                    /api/scans, /api/websites                    │
└────────────────────────────┬────────────────────────────────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
┌─────────▼─────────┐ ┌──────▼──────┐ ┌────────▼────────┐
│   PostgreSQL DB   │ │    Redis    │ │  Celery Worker  │
│  Scans, Issues,   │ │Message Queue│ │  Background     │
│     Metrics       │ │             │ │  Processing     │
└───────────────────┘ └─────────────┘ └────────┬────────┘
                                               │
        ┌──────────────────────────────────────┼──────────────────────┐
        │                                      │                      │
┌───────▼───────┐ ┌─────────────────┐ ┌────────▼────────┐ ┌──────────▼──────────┐
│  Lighthouse   │ │   OWASP ZAP     │ │   Playwright    │ │   Headers Scanner   │
│   (Node.js)   │ │   (Docker)      │ │   (Python)      │ │   (requests/ssl)    │
│   Port 3001   │ │   Port 8081     │ │                 │ │                     │
└───────────────┘ └─────────────────┘ └─────────────────┘ └─────────────────────┘

Quick Start

Prerequisites

Docker & Docker Compose
Git

1. Clone and Configure

git clone <repository-url>
cd secure-web

# Copy environment file
cp backend/.env.example backend/.env

# Edit .env with your settings (optional for development)

2. Start the Stack

# Build and start all services
docker-compose up --build -d

# View logs
docker-compose logs -f

# Check service status
docker-compose ps

3. Initialize Database

# Run migrations
docker-compose exec web python manage.py migrate

# Create superuser (optional)
docker-compose exec web python manage.py createsuperuser

4. Access the Application

Web Interface: http://localhost:8000
Admin Panel: http://localhost:8000/admin
API Documentation: http://localhost:8000/api/

Running a Scan

Via Web Interface

1. Navigate to http://localhost:8000
2. Enter a URL in the input field (e.g., https://example.com)
3. Click "Scan Website"
4. Wait for the scan to complete (typically 1-3 minutes)
5. View results including scores, metrics, and issues

Via API

# Create a new scan
curl -X POST http://localhost:8000/api/scans/ \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

# Response:
# {
#   "id": "uuid-here",
#   "url": "https://example.com",
#   "status": "pending",
#   ...
# }

# Check scan status
curl http://localhost:8000/api/scans/{scan-id}/

# List all scans
curl http://localhost:8000/api/scans/

# Get issues for a scan
curl "http://localhost:8000/api/issues/?scan={scan-id}"
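
The same flow can be scripted. The following is a minimal Python sketch, not part of the project: the function names are hypothetical, and it assumes that "pending" and "running" are the in-progress statuses (only "pending" appears in the response above).

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000"  # development address from Quick Start


def build_scan_request(target_url, base_url=BASE_URL):
    """Build the POST /api/scans/ request without sending it."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/scans/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(request_or_url):
    """Send a request (or GET a URL) and decode the JSON response body."""
    with urllib.request.urlopen(request_or_url, timeout=30) as resp:
        return json.load(resp)


def wait_for_scan(scan_id, fetch=fetch_json, base_url=BASE_URL,
                  interval=5.0, max_wait=300.0, sleep=time.sleep):
    """Poll GET /api/scans/{id}/ until the scan leaves its in-progress state.

    Assumption: "pending" and "running" mean the scan is still in progress.
    """
    waited = 0.0
    while True:
        scan = fetch(f"{base_url}/api/scans/{scan_id}/")
        if scan.get("status") not in ("pending", "running"):
            return scan
        if waited >= max_wait:
            raise TimeoutError(f"scan {scan_id} is still {scan.get('status')}")
        sleep(interval)
        waited += interval
```

Typical use would be `scan = fetch_json(build_scan_request("https://example.com"))` followed by `wait_for_scan(scan["id"])`.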

API Endpoints

Method	Endpoint	Description
GET	`/api/scans/`	List all scans
POST	`/api/scans/`	Create new scan
GET	`/api/scans/{id}/`	Get scan details
GET	`/api/websites/`	List all websites
GET	`/api/issues/`	List all issues
GET	`/api/issues/?scan={id}`	Issues for specific scan
GET	`/api/issues/?severity=high`	Filter by severity
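
As a small illustration of the filter parameters in the table, the helper below builds an issues URL from keyword filters. It is a sketch with a hypothetical function name; whether `scan` and `severity` can be combined in one request is an assumption, not something the table states.

```python
from urllib.parse import urlencode


def issues_url(base_url="http://localhost:8000", **filters):
    """Return the /api/issues/ URL with any non-empty filters as query parameters."""
    params = {key: value for key, value in filters.items() if value is not None}
    query = urlencode(params)
    return f"{base_url}/api/issues/" + (f"?{query}" if query else "")
```

For example, `issues_url(severity="high")` yields the URL from the last table row, prefixed with the development host.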

Scanner Integration

Lighthouse (Performance)

The Lighthouse scanner runs as a separate Node.js service and provides:

Performance Score: Overall performance rating
Key Metrics: FCP, LCP, CLS, TTI, TBT (of these, LCP and CLS are Core Web Vitals)
Resource Analysis: Unused JS, render-blocking resources
Best Practices: Modern web development compliance

# Internal service call
POST http://lighthouse:3001/scan
{
    "url": "https://example.com",
    "options": {
        "preset": "desktop"
    }
}
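
From Python, the internal call could be issued roughly as follows. This is a sketch only: the function names are hypothetical, the JSON content type is assumed, and the shape of the service's response is not documented here, so it is returned undecoded beyond JSON parsing.

```python
import json
import urllib.request

VALID_PRESETS = ("desktop", "mobile")  # the two presets named in the settings


def lighthouse_request(target_url, preset="desktop",
                       service_url="http://lighthouse:3001"):
    """Build the POST /scan request for the Lighthouse service."""
    if preset not in VALID_PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    payload = {"url": target_url, "options": {"preset": preset}}
    return urllib.request.Request(
        f"{service_url}/scan",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


def run_lighthouse(target_url, timeout=120, **kwargs):
    """Send the request; 120 seconds mirrors the LIGHTHOUSE_TIMEOUT default."""
    request = lighthouse_request(target_url, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```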

OWASP ZAP (Security)

ZAP crawls the target and passively scans the responses:

Spider Crawling: Discovers URLs and entry points
Passive Scanning: Analyzes responses for vulnerabilities
Alert Detection: XSS, injection, misconfigurations

# ZAP API endpoints used
GET http://zap:8081/JSON/spider/action/scan/
GET http://zap:8081/JSON/pscan/view/recordsToScan/
GET http://zap:8081/JSON/core/view/alerts/
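
The three endpoints are typically chained: start the spider, wait for the passive scan queue to drain, then read the alerts. The sketch below shows that sequence under a few assumptions: the function names are hypothetical, the API key is passed as the `apikey` query parameter, and progress is polled through ZAP's `spider/view/status` endpoint, which is not listed above.

```python
import json
import time
import urllib.parse
import urllib.request

ZAP_URL = "http://zap:8081"
API_KEY = "changeme"  # development default; change in production


def zap_get(path, zap_url=ZAP_URL, api_key=API_KEY, **params):
    """Call one ZAP JSON API endpoint and return the decoded response."""
    query = urllib.parse.urlencode({**params, "apikey": api_key})
    with urllib.request.urlopen(f"{zap_url}{path}?{query}", timeout=30) as resp:
        return json.load(resp)


def _wait_until(done, timeout, poll, sleep):
    """Call done() every `poll` seconds until it is true or `timeout` elapses."""
    waited = 0.0
    while not done():
        if waited >= timeout:
            raise TimeoutError("ZAP did not finish in time")
        sleep(poll)
        waited += poll


def spider_and_collect_alerts(target_url, get=zap_get, timeout=300.0,
                              poll=2.0, sleep=time.sleep):
    """Spider the target, wait for passive scanning, and return the alerts."""
    scan_id = get("/JSON/spider/action/scan/", url=target_url)["scan"]
    # The spider reports progress as a percentage string, "0" to "100".
    _wait_until(
        lambda: int(get("/JSON/spider/view/status/", scanId=scan_id)["status"]) >= 100,
        timeout, poll, sleep)
    # The passive scanner is done once no records are left in its queue.
    _wait_until(
        lambda: int(get("/JSON/pscan/view/recordsToScan/")["recordsToScan"]) == 0,
        timeout, poll, sleep)
    return get("/JSON/core/view/alerts/", baseurl=target_url)["alerts"]
```

The 300-second default mirrors the ZAP_TIMEOUT value, applied here to each waiting phase separately.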

Playwright (Browser Analysis)

Playwright performs real browser analysis:

Console Errors: JavaScript errors and warnings
Network Metrics: Response times, failed requests
Memory Metrics: JS heap size monitoring
Resource Loading: Images, scripts, stylesheets
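
Console errors and failed requests can be collected with Playwright's page events. The following is a minimal sketch using the synchronous API, not the project's scanner: `PageEventLog` and `analyze` are hypothetical names, and the viewport and 60-second timeout mirror the configuration defaults.

```python
class PageEventLog:
    """Collects console errors/warnings and failed requests from a page."""

    def __init__(self):
        self.console_errors = []
        self.failed_requests = []

    def on_console(self, msg):
        # Playwright's ConsoleMessage exposes .type and .text as properties.
        if msg.type in ("error", "warning"):
            self.console_errors.append({"level": msg.type, "text": msg.text})

    def on_request_failed(self, request):
        # Request.failure holds the error text, or None when unavailable.
        self.failed_requests.append({"url": request.url, "error": request.failure})


def analyze(url, timeout_s=60):
    """Load the URL in headless Chromium and return the collected events."""
    # Imported lazily so the collector above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    log = PageEventLog()
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport={"width": 1920, "height": 1080})
        page.on("console", log.on_console)
        page.on("requestfailed", log.on_request_failed)
        page.goto(url, wait_until="load", timeout=timeout_s * 1000)
        browser.close()
    return log
```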

Headers Scanner (HTTP Security)

Checks security headers and TLS configuration:

Security Headers: CSP, HSTS, X-Frame-Options, etc.
Cookie Security: Secure, HttpOnly, SameSite flags
TLS Certificate: Validity, expiration, issuer
Information Disclosure: Server version headers
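
The header and cookie checks reduce to simple lookups. The sketch below illustrates the idea with hypothetical function names; the header list is an assumption (X-Content-Type-Options stands in for the "etc." above), not the scanner's actual rule set.

```python
# Assumed baseline; the project's scanner may check more or different headers.
REQUIRED_HEADERS = (
    "Content-Security-Policy",
    "Strict-Transport-Security",
    "X-Frame-Options",
    "X-Content-Type-Options",
)


def missing_security_headers(headers):
    """Return required headers absent from a response (names are case-insensitive)."""
    present = {name.lower() for name in headers}
    return [name for name in REQUIRED_HEADERS if name.lower() not in present]


def cookie_flag_issues(set_cookie):
    """Return the security flags missing from one Set-Cookie header value."""
    # Skip the first "name=value" pair; the remaining parts are attributes.
    attributes = {
        part.strip().split("=", 1)[0].lower()
        for part in set_cookie.split(";")[1:]
    }
    return [flag for flag in ("Secure", "HttpOnly", "SameSite")
            if flag.lower() not in attributes]
```

For instance, `cookie_flag_issues("sid=abc; Path=/; Secure; SameSite=Lax")` reports that only `HttpOnly` is missing.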

Configuration

Environment Variables

# Django
SECRET_KEY=your-secret-key
DEBUG=True
ALLOWED_HOSTS=localhost,127.0.0.1

# Database
DATABASE_URL=postgres://user:pass@db:5432/secure_web

# Redis
REDIS_URL=redis://redis:6379/0
CELERY_BROKER_URL=redis://redis:6379/0

# Scanner Services
LIGHTHOUSE_URL=http://lighthouse:3001
ZAP_API_URL=http://zap:8081
ZAP_API_KEY=changeme

# Scanner Timeouts
LIGHTHOUSE_TIMEOUT=120
ZAP_TIMEOUT=300
PLAYWRIGHT_TIMEOUT=60

Scanner Configuration

Modify backend/core/settings.py:

import os  # needed for os.getenv below, if settings.py does not already import it

SCANNER_CONFIG = {
    'lighthouse': {
        'url': os.getenv('LIGHTHOUSE_URL', 'http://lighthouse:3001'),
        'timeout': int(os.getenv('LIGHTHOUSE_TIMEOUT', '120')),
        'preset': 'desktop',  # or 'mobile'
    },
    'zap': {
        'url': os.getenv('ZAP_API_URL', 'http://zap:8081'),
        'api_key': os.getenv('ZAP_API_KEY', 'changeme'),
        'timeout': int(os.getenv('ZAP_TIMEOUT', '300')),
        'spider_max_depth': 3,
    },
    'playwright': {
        'timeout': int(os.getenv('PLAYWRIGHT_TIMEOUT', '60')),
        'viewport': {'width': 1920, 'height': 1080},
    },
    'headers': {
        'timeout': 30,
        'verify_ssl': True,
    },
}

Development

Running Locally (without Docker)

# Backend setup
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Set environment
export DATABASE_URL=postgres://user:pass@localhost:5432/secure_web
export REDIS_URL=redis://localhost:6379/0

# Run migrations
python manage.py migrate

# Start Django
python manage.py runserver

# Start Celery (separate terminal)
celery -A core worker -l INFO

# Start Celery Beat (separate terminal)
celery -A core beat -l INFO

Running Tests

# Run all tests
docker-compose exec web pytest

# Run specific test file
docker-compose exec web pytest tests/test_validators.py -v

# Run with coverage
docker-compose exec web pytest --cov=. --cov-report=html

# Local testing
cd backend
pytest tests/ -v

Code Structure

secure-web/
├── backend/
│   ├── core/                  # Django project settings
│   │   ├── settings.py
│   │   ├── urls.py
│   │   ├── celery.py
│   │   └── wsgi.py
│   ├── websites/              # Main app - models
│   │   ├── models.py          # Website, Scan, Issue, Metric
│   │   └── admin.py
│   ├── api/                   # DRF API
│   │   ├── views.py
│   │   ├── serializers.py
│   │   └── urls.py
│   ├── scanner/               # Scanner modules
│   │   ├── base.py            # BaseScanner ABC
│   │   ├── validators.py      # URL validation, SSRF protection
│   │   ├── headers_scanner.py
│   │   ├── lighthouse_scanner.py
│   │   ├── playwright_scanner.py
│   │   ├── zap_scanner.py
│   │   ├── runner.py          # Orchestrator
│   │   └── tasks.py           # Celery tasks
│   ├── templates/             # Frontend templates
│   │   ├── base.html
│   │   ├── index.html
│   │   └── scan_detail.html
│   └── tests/                 # Unit tests
│       ├── test_validators.py
│       ├── test_scans.py
│       └── test_scanner_parsing.py
├── lighthouse/                # Lighthouse Node.js service
│   ├── server.js
│   ├── package.json
│   └── Dockerfile
└── docker-compose.yml
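
To show how base.py and runner.py might relate, here is a minimal sketch of an abstract scanner interface and an orchestrator loop. The method name `scan`, the `name` attribute, and `run_all` are hypothetical; the real BaseScanner may look different.

```python
from abc import ABC, abstractmethod


class BaseScanner(ABC):
    """Common interface each scanner module would implement (assumed shape)."""

    name = "base"

    @abstractmethod
    def scan(self, url):
        """Return a list of issue dicts found for the URL."""


def run_all(scanners, url):
    """Run every scanner; one failing scanner does not stop the others."""
    results, errors = {}, {}
    for scanner in scanners:
        try:
            results[scanner.name] = scanner.scan(url)
        except Exception as exc:  # record the failure and keep going
            errors[scanner.name] = str(exc)
    return results, errors
```

Isolating failures per scanner means a slow or unreachable service, such as ZAP during startup, still leaves the other results usable.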

Issue Categories

Category	Source	Description
`performance`	Lighthouse	Speed, loading, rendering issues
`security`	ZAP, Headers	Vulnerabilities, misconfigurations
`accessibility`	Lighthouse	WCAG compliance issues
`seo`	Lighthouse	Search optimization issues
`best_practices`	Lighthouse	Modern web standards
`console_errors`	Playwright	JavaScript runtime errors
`network`	Playwright	Failed requests, slow responses
`headers`	Headers	Missing security headers
`tls`	Headers	Certificate issues
`cookies`	Headers	Insecure cookie settings

Issue Severities

Level	Color	Description
`critical`	Red	Immediate action required
`high`	Orange	Significant security/performance risk
`medium`	Yellow	Should be addressed
`low`	Blue	Minor improvement
`info`	Gray	Informational only
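
When processing API results, the levels above can serve as a sort key. This is a small sketch with hypothetical names that assumes each issue is a dict with a `severity` field, as the `?severity=` filter suggests.

```python
SEVERITY_ORDER = ("critical", "high", "medium", "low", "info")
_RANK = {level: rank for rank, level in enumerate(SEVERITY_ORDER)}


def sort_by_severity(issues):
    """Return issues ordered from most to least severe; unknown levels go last."""
    return sorted(issues, key=lambda issue: _RANK.get(issue["severity"], len(_RANK)))
```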

Troubleshooting

Common Issues

Services not starting:

# Check logs
docker-compose logs web
docker-compose logs celery_worker
docker-compose logs lighthouse
docker-compose logs zap

# Restart services
docker-compose restart

Database connection errors:

# Wait for DB to be ready
docker-compose exec web python manage.py wait_for_db

# Check DB status
docker-compose exec db psql -U secure_web -c "\l"

ZAP not responding:

# ZAP takes time to start; wait 30-60 seconds, then check its logs
docker-compose logs zap

# Check ZAP status
curl http://localhost:8081/JSON/core/view/version/

Scan stuck in pending:

# Check Celery worker
docker-compose logs celery_worker

# Restart worker
docker-compose restart celery_worker

Performance Tips

For production, use a dedicated ZAP instance
Consider caching Lighthouse results for repeated scans
Adjust timeouts based on target website complexity
Use Redis persistence for task queue durability

Security Considerations

URL validation includes SSRF protection (blocks private IPs)
ZAP API key should be changed in production
Consider rate limiting scan endpoints
Validate and sanitize all user inputs
Run containers with minimal privileges
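
As an illustration of the SSRF check in the first point, the sketch below rejects URLs that point at private, loopback, or link-local addresses. It is not the code in scanner/validators.py; `is_safe_target` is a hypothetical name and the set of blocked ranges is an assumption.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def _is_blocked(ip):
    """True for addresses a scanner should never be pointed at."""
    return (ip.is_private or ip.is_loopback or ip.is_link_local
            or ip.is_reserved or ip.is_multicast or ip.is_unspecified)


def is_safe_target(url):
    """Allow only http(s) URLs whose host resolves to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname
    try:
        # IP literals are checked directly, without a DNS lookup.
        addresses = [ipaddress.ip_address(host)]
    except ValueError:
        try:
            infos = socket.getaddrinfo(host, None)
        except socket.gaierror:
            return False  # an unresolvable host cannot be scanned
        # Drop any IPv6 scope suffix ("%eth0") before parsing.
        addresses = [ipaddress.ip_address(info[4][0].split("%")[0]) for info in infos]
    return not any(_is_blocked(ip) for ip in addresses)
```

A check like this runs at validation time only. A hostname can resolve differently when the scanner later connects, so network-level egress restrictions are still worth having.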

License

MIT License - See LICENSE file for details.

Contributing

Fork the repository
Create a feature branch
Write tests for new functionality
Submit a pull request

Support

For issues and feature requests, please use the GitHub issue tracker.
