Sensitive Information Filter Hook - Study Notes: Hooks Case Study

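Core idea: a sensitive information filter Hook places checkpoints at two stages of an AI conversation, pre-processing and post-processing, to automatically detect and intercept sensitive data that might leak. The before Hook inspects and masks user input before it is sent to the AI; the after Hook reviews the AI's response for sensitive content that slipped in unintentionally. This two-layer defense reduces the risk of API keys, passwords, personally identifiable information (PII), and similar data leaking through the AI conversation channel.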

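1. Designing the Sensitive Information Filter Hook

As AI-assisted programming tools become widespread, developers paste more and more code snippets, configuration files, and business data into AI conversations. This content may contain API keys, database passwords, personally identifiable information, and other sensitive data. Sent to the AI unfiltered, it can end up exposed to a third-party service.

The design of a sensitive information filter Hook needs to address two core stages: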

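Pre-filtering (Before Hook): runs detection before user input is sent to the AI. This is the primary line of defense. It can warn the user before anything is sent, or automatically mask the sensitive content.
Post-review (After Hook): checks the AI's response after it arrives, to catch sensitive information that appears unintentionally in generated code or text (for example, the AI repeating a key that appeared earlier in the conversation).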

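Design principle: the filter Hook should follow a "block by default, allow on request" security policy. Every input that could contain sensitive information should be inspected, and it should pass only after the user explicitly confirms it is safe. Masking can be either reversible (the mapping from placeholder to original value is recorded locally) or irreversible (a hash is used when the original text is never needed again); which one to choose depends on the use case.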
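
To make the reversible versus irreversible distinction concrete, here is a minimal sketch for Node.js. The names ReversibleMasker, hashMask, and the `<SECRET_n>` / `<HASH_…>` placeholder formats are assumptions made up for illustration, not part of any library. Note that an unsalted hash of a short or guessable secret can still be brute-forced, so treat this as a sketch rather than a hardened design.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: reversible masking keeps a placeholder -> original map locally.
class ReversibleMasker {
    constructor() {
        this.tokenToOriginal = new Map(); // stays on the local machine
        this.originalToToken = new Map(); // one placeholder per distinct value
    }

    // Replace a sensitive value with a stable placeholder.
    mask(value) {
        if (this.originalToToken.has(value)) {
            return this.originalToToken.get(value);
        }
        const token = `<SECRET_${this.tokenToOriginal.size + 1}>`;
        this.tokenToOriginal.set(token, value);
        this.originalToToken.set(value, token);
        return token;
    }

    // Restore placeholders in text that came back from the AI.
    unmask(text) {
        return text.replace(/<SECRET_\d+>/g, (token) =>
            this.tokenToOriginal.has(token) ? this.tokenToOriginal.get(token) : token);
    }
}

// Irreversible masking: a truncated SHA-256 digest stands in for the value.
// Equal inputs give equal placeholders, but the original cannot be read back.
function hashMask(value) {
    const digest = createHash('sha256').update(value, 'utf8').digest('hex');
    return `<HASH_${digest.slice(0, 12)}>`;
}
```

The reversible variant suits cases where the AI's answer must be mapped back to real values locally; the hashed variant suits logs and audit trails where only "same value or not" matters.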

Basic skeleton of the Hook

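// Basic skeleton of the sensitive information filter Hook
class SensitiveInfoFilter {
    constructor(options = {}) {
        this.patterns = options.patterns || this.getDefaultPatterns();
        // 'warn' | 'block' | 'mask'; defaults to 'block' to match the block-by-default principle
        this.mode = options.mode || 'block';
        this.logger = options.logger || console;
        this.maskChar = options.maskChar || '*';
        this.enableBeforeHook = options.enableBeforeHook !== false;
        this.enableAfterHook = options.enableAfterHook !== false;
    }

    // Minimal built-in patterns so the skeleton runs without configuration
    getDefaultPatterns() {
        return {
            'OpenAI API Key': /sk-[A-Za-z0-9]{20,}/,
            'AWS Access Key': /AKIA[0-9A-Z]{16}/,
        };
    }

    // Pre-filter: runs before the input is sent to the AI
    async beforeHook(input) {
        if (!this.enableBeforeHook) return input;
        const findings = this.detect(input);
        if (findings.length === 0) return input;
        return this.handleFindings(input, findings);
    }

    // Post-review: runs after the AI response is received
    async afterHook(output) {
        if (!this.enableAfterHook) return output;
        const findings = this.detect(output);
        if (findings.length === 0) return output;
        this.logger.warn('[Post-review] Sensitive information detected in AI output', this.describe(findings));
        return this.maskContent(output, findings);
    }

    // Detect sensitive information; returns the type, position, and matched text
    detect(text) {
        const results = [];
        for (const [type, pattern] of Object.entries(this.patterns)) {
            // Keep a RegExp's own flags and add 'g'; plain strings are compiled as 'gi'
            const regex = pattern instanceof RegExp
                ? new RegExp(pattern.source, pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g')
                : new RegExp(pattern, 'gi');
            let match;
            while ((match = regex.exec(text)) !== null) {
                if (match[0] === '') { regex.lastIndex++; continue; } // avoid looping on empty matches
                results.push({ type, index: match.index, matched: match[0] });
            }
        }
        return results;
    }

    // Log-safe view of the findings: never write the matched secret itself to logs
    describe(findings) {
        return findings.map(f => ({ type: f.type, index: f.index, length: f.matched.length }));
    }

    // Act on the findings according to the configured mode
    handleFindings(input, findings) {
        switch (this.mode) {
            case 'block':
                this.logger.error('[Block] Input contains sensitive information; sending was blocked', this.describe(findings));
                return null; // block sending
            case 'mask': {
                const masked = this.maskContent(input, findings);
                this.logger.warn('[Mask] Sensitive information was masked automatically', this.describe(findings));
                return masked;
            }
            case 'warn':
            default:
                this.logger.warn('[Warn] Input contains sensitive information; please confirm', this.describe(findings));
                return input; // warn only, do not block
        }
    }

    maskContent(text, findings) {
        let result = text;
        // Replace from back to front; copy the array so the caller's findings are not reordered
        const sorted = [...findings].sort((a, b) => b.index - a.index);
        for (const f of sorted) {
            // Keep the first and last character; very short matches are masked entirely
            const masked = f.matched.length <= 2
                ? this.maskChar.repeat(f.matched.length)
                : f.matched.charAt(0) +
                  this.maskChar.repeat(f.matched.length - 2) +
                  f.matched.charAt(f.matched.length - 1);
            result = result.slice(0, f.index) + masked +
                result.slice(f.index + f.matched.length);
        }
        return result;
    }
}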

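2. API Key / Token / Password Detection Hook (before)

API keys and tokens are the sensitive data developers most easily leak in AI conversations. While writing or debugging code, a developer may paste a configuration snippet that contains a real key straight into the conversation. The API key/token detection Hook uses regular expressions to match common key formats, and detects and intercepts them before the user input is sent to the AI.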

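Supported key formats

OpenAI API Key: a key starting with sk-, followed by at least 20 letters or digits
GitHub Personal Access Token: tokens starting with ghp_, gho_, ghu_, ghs_, or ghr_
AWS Access Key: an access key ID starting with AKIA, followed by 16 uppercase letters or digits
Generic passwords and connection strings: text containing keywords such as password=, secret=, or pwd= (handled by the code configuration filter in section 4, not by the detector below)
JWT Token: a JSON Web Token made of three Base64URL-encoded parts separated by dots
SSH private key: a key block marked by -----BEGIN ... PRIVATE KEY-----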

// API key/token detection Hook implementation
class ApiKeyDetector {
    constructor() {
        this.keyPatterns = {
            'OpenAI API Key': /sk-[A-Za-z0-9]{20,}/,
            'GitHub Token': /gh[opsur]_[A-Za-z0-9_]{36,}/,
            'AWS Access Key': /AKIA[0-9A-Z]{16}/,
            'JWT Token': /eyJ[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}/,
            'SSH Private Key': /-----BEGIN\s+(RSA|DSA|EC|OPENSSH)\s+PRIVATE\s+KEY-----/,
        };
    }

    // Extension point for user-defined patterns
    addCustomPattern(name, regex) {
        if (regex instanceof RegExp) {
            this.keyPatterns[name] = regex;
        } else {
            this.keyPatterns[name] = new RegExp(regex, 'i');
        }
    }

    // Detect and handle
    async beforeHook(input) {
        const detections = [];
        for (const [name, pattern] of Object.entries(this.keyPatterns)) {
            // Count every occurrence: a non-global match() would return one match
            // plus its capture groups, so its length is not an occurrence count
            const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
            const matches = [...input.matchAll(new RegExp(pattern.source, flags))];
            if (matches.length > 0) {
                detections.push({ type: name, found: matches.length });
            }
        }
        if (detections.length > 0) {
            const summary = detections.map(d =>
                `${d.type} (${d.found} found)`).join(', ');
            // Default behavior: warn and ask whether to continue
            const userConfirmed = await this.askUserConfirm(
                `Sensitive key material detected: ${summary}\nSend anyway? (y/n)`
            );
            if (!userConfirmed) {
                throw new Error('Blocked a message that contains key material');
            }
        }
        return input;
    }

    // Command-line confirmation
    async askUserConfirm(message) {
        // In a real CLI, use readline or a dialog here
        console.warn(`[Security warning] ${message}`);
        // For safety, this stub always returns false (block)
        return false;
    }
}

// Usage example
const keyDetector = new ApiKeyDetector();
// Add custom key patterns
keyDetector.addCustomPattern('WeChat Secret', /wx[0-9a-f]{32}/i);
keyDetector.addCustomPattern('Stripe Key', /sk_live_[A-Za-z0-9]{30,}/);

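Important: key detection should be as comprehensive as possible while avoiding false positives. For example, the common placeholder your-api-key-here, or a demo key in sample code, should not be mistaken for a real key. Consider maintaining an allowlist of patterns that exempts content that is obviously an example. In addition, when a key is detected, show the user the surrounding context (with the key itself partially masked) so they can judge whether it is a false positive.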
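
The allowlist and the masked context preview described above could look roughly like this. This is a sketch under assumptions: PLACEHOLDER_PATTERNS, isLikelyPlaceholder, partiallyMask, and previewFinding are hypothetical names, and the allowlist entries are illustrative rather than a vetted list.

```javascript
// Hypothetical allowlist of values that are obviously examples, not real secrets.
const PLACEHOLDER_PATTERNS = [
    /your[-_]?api[-_]?key/i,
    /^sk-x{8,}$/i,                          // e.g. sk-xxxxxxxxxxxx
    /example|dummy|placeholder|changeme/i,
    /<[^>]+>/,                              // e.g. <YOUR_TOKEN>
];

function isLikelyPlaceholder(candidate) {
    return PLACEHOLDER_PATTERNS.some((p) => p.test(candidate));
}

// Keep the first and last 4 characters; mask everything for short values.
function partiallyMask(secret) {
    if (secret.length <= 8) return '*'.repeat(secret.length);
    return secret.slice(0, 4) + '*'.repeat(secret.length - 8) + secret.slice(-4);
}

// Show up to `radius` characters of context on each side of the finding,
// with the matched key itself partially masked.
function previewFinding(text, index, matched, radius = 20) {
    const start = Math.max(0, index - radius);
    const end = Math.min(text.length, index + matched.length + radius);
    return text.slice(start, index) + partiallyMask(matched) +
        text.slice(index + matched.length, end);
}
```

A detector would call isLikelyPlaceholder on each match before raising a warning, and pass previewFinding's result to the confirmation prompt instead of the raw match.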

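3. Personal Information (PII) Detection and Masking Hook

Personally Identifiable Information (PII) covers ID card numbers, mobile numbers, email addresses, bank card numbers, home addresses, and any other information that can directly or indirectly identify a specific natural person. In AI conversations, users may include this information unintentionally when pasting business data or logs. The PII detection and masking Hook identifies and replaces such information automatically before sending.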

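Supported PII types

PII type	Matching rule	Masked example
Chinese resident ID number	18 characters: 17 digits plus a final digit or X	110101********1234
Mainland China mobile number	11 digits starting with 1	138****1234
Email address	Standard email format	u***@example.com
Bank card number	16 to 19 digits	6222 **** **** 1234
IP address	IPv4 address format	192.168.*.*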
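
The matching rules in the table are purely format-based, so any 16 to 19 digit number looks like a bank card. A checksum pass can reduce such false positives. The sketch below is an optional addition, not part of the original Hook: passesLuhn applies the Luhn checksum used by most payment card numbers, and passesIdChecksum verifies the check character of an 18-character resident ID (ISO 7064 MOD 11-2 weights). Both function names are assumptions.

```javascript
// Luhn checksum: double every second digit from the right, sum the digits,
// and require the total to be a multiple of 10.
function passesLuhn(digits) {
    if (!/^\d+$/.test(digits)) return false;
    let sum = 0;
    let double = false;
    for (let i = digits.length - 1; i >= 0; i--) {
        let d = digits.charCodeAt(i) - 48; // '0' is char code 48
        if (double) {
            d *= 2;
            if (d > 9) d -= 9;
        }
        sum += d;
        double = !double;
    }
    return sum % 10 === 0;
}

// Check character of an 18-character resident ID number:
// weighted sum of the first 17 digits, modulo 11, mapped to a check character.
function passesIdChecksum(id) {
    if (!/^\d{17}[\dXx]$/.test(id)) return false;
    const weights = [7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2];
    const checkChars = '10X98765432';
    let sum = 0;
    for (let i = 0; i < 17; i++) {
        sum += Number(id[i]) * weights[i];
    }
    return checkChars[sum % 11] === id[17].toUpperCase();
}
```

A candidate that matches the regex but fails its checksum is more likely an order number or a timestamp than real PII, so it could be downgraded to a warning instead of being masked.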

// PII detection and masking Hook implementation
class PIIDetector {
    constructor() {
        this.piiPatterns = {
            // Digit lookarounds keep these from matching inside longer digit runs
            'ID card number': /(?<!\d)[1-9]\d{5}(?:18|19|20)\d{2}(?:0[1-9]|1[0-2])(?:0[1-9]|[12]\d|3[01])\d{3}[\dXx](?!\d)/g,
            'Mobile number': /(?<!\d)1[3-9]\d{9}(?!\d)/g,
            'Email': /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,
            'Bank card number': /\b(?:62|60|58|56|55|54|53|52|51|50|49|48|47|46|45|44|43|42|41|40)\d{14,17}\b/g,
            'IPv4 address': /\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b/g,
        };
        this.maskLog = []; // masking log
    }

    // Mask one PII type with its own strategy
    maskPII(text, type) {
        const pattern = this.piiPatterns[type];
        if (!pattern) return text;

        return text.replace(pattern, (match) => {
            const masked = this.applyMasking(match, type);
            // Record the masking event
            this.maskLog.push({
                type,
                timestamp: new Date().toISOString(),
                originalLength: match.length,
                maskedValue: masked,
                // Note: never record the original value; only metadata is kept for auditing
            });
            return masked;
        });
    }

    // Apply a type-specific masking strategy
    applyMasking(value, type) {
        switch (type) {
            case 'Mobile number':
                // Keep the first 3 and last 4 digits; replace the middle with ****
                return value.slice(0, 3) + '****' + value.slice(-4);

            case 'Email': {
                // Keep the first character of the local part and the domain
                const [local, domain] = value.split('@');
                return local.charAt(0) + '***@' + domain;
            }

            case 'ID card number':
                // Keep the first 6 and last 4 characters
                return value.slice(0, 6) + '********' + value.slice(-4);

            case 'Bank card number':
                // Keep the first 4 and last 4 digits
                return value.slice(0, 4) + ' **** **** ' + value.slice(-4);

            case 'IPv4 address': {
                // Keep only the first two octets
                const parts = value.split('.');
                return `${parts[0]}.${parts[1]}.*.*`;
            }

            default:
                // Generic masking: keep the first and last character
                // (Math.max guards against a negative repeat count for short values)
                return value.charAt(0) + '*'.repeat(Math.max(value.length - 2, 0)) + value.charAt(value.length - 1);
        }
    }

    // Mask every supported type in turn
    maskAll(text) {
        let result = text;
        for (const type of Object.keys(this.piiPatterns)) {
            result = this.maskPII(result, type);
        }
        return result;
    }

    // Before Hook entry point
    async beforeHook(input) {
        const logSizeBefore = this.maskLog.length;
        const result = this.maskAll(input);
        // Count only what this call masked; maskLog itself accumulates across calls
        const maskedCount = this.maskLog.length - logSizeBefore;
        if (maskedCount > 0) {
            console.info(`[PII masking] Masked ${maskedCount} sensitive item(s) automatically`);
            console.info(`[PII masking] Cumulative type distribution:`, this.getMaskSummary());
        }
        return result;
    }

    getMaskSummary() {
        const summary = {};
        for (const entry of this.maskLog) {
            summary[entry.type] = (summary[entry.type] || 0) + 1;
        }
        return summary;
    }

    // Export the audit log
    exportAuditLog() {
        return [...this.maskLog];
    }

    // Clear the log
    clearAuditLog() {
        this.maskLog = [];
    }
}

// Usage example (top-level await requires an ES module)
const piiFilter = new PIIDetector();
const userInput = `My phone is 13812345678 and my email is test@example.com`;
const filtered = await piiFilter.beforeHook(userInput);
// Result: "My phone is 138****5678 and my email is t***@example.com"

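Best practice: PII masking should happen locally, so that the original PII never leaves the user's device. The masking log may record when and which type of PII was masked, but it should not record the original value. Where audit compliance is required, the number and type distribution of masking operations can be recorded to support audits under regulations such as GDPR and China's Personal Information Protection Law.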

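4. Code Sensitive-Configuration Filter Hook

In AI-assisted programming, developers often paste source files or code fragments into the conversation and ask the AI to improve them. That code may contain hard-coded database connection strings, passwords, keys, and other sensitive configuration. The code sensitive-configuration filter Hook focuses on detecting such items in code that is about to be sent to the AI, and the same checks can be applied to file names and Git diffs before content is written or committed.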

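Detection scenarios

Database connection strings: URLs such as mysql://, postgres://, or mongodb:// that embed a password
Keys in configuration files: sensitive fields in files such as application.yml, .env, or config.json
Hard-coded passwords: assignments in code such as password = "xxx" or pwd: 'xxx'
Keys in Git commits: newly added keys or sensitive configuration found by scanning a Git diff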

// Code sensitive-configuration filter Hook
class CodeConfigFilter {
    constructor(options = {}) {
        this.sensitiveKeys = options.sensitiveKeys || [
            'password', 'passwd', 'pwd', 'secret',
            'api_key', 'apikey', 'api-secret',
            'access_key', 'accesskey',
            'private_key', 'privatekey',
            'token', 'auth_token',
            'connection_string', 'connstr',
            'jwt_secret', 'session_secret',
            'db_password', 'db_passwd',
            'redis_password', 'rabbitmq_password',
            'aws_secret_access_key',
            'slack_token', 'slack_webhook',
            'stripe_sk', 'stripe_secret',
        ];
        this.sensitiveFilePatterns = options.sensitiveFilePatterns || [
            /\.env(\.\w+)?$/,
            /\-secret\.(yaml|yml|json)$/,
            /credentials\.(json|ini|conf)$/,
            /\.npmrc$/,
            /\.netrc$/,
            /config\.(php|rb|py)$/i,
        ];
        // Sensitive key assignment. Groups: 1 = key, 2 = separator, 3 = optional quote.
        // The value must be closed by the same quote that opened it (\3).
        this.envVarPattern = new RegExp(
            `(${this.sensitiveKeys.join('|')})(\\s*[=:]\\s*)(['"]?)[^'"\\s]{4,}\\3`,
            'gi'
        );
        // Connection string with credentials: scheme://user:password@host
        // Groups: 1 = "scheme://user:", 2 = password, 3 = "@host[:port]"
        this.connStrPattern = /(\w+:\/\/[^:\s\/@]+:)([^@\s]+)(@[^\/\s]+)/gi;
    }

    // Detect sensitive configuration in code
    detectInCode(code, filename = '') {
        const findings = [];

        // Check whether the file name matches a sensitive file pattern
        if (filename) {
            for (const pattern of this.sensitiveFilePatterns) {
                if (pattern.test(filename)) {
                    findings.push({
                        type: 'Sensitive configuration file',
                        detail: `File name "${filename}" matches sensitive file pattern ${pattern}`,
                        severity: 'high',
                    });
                }
            }
        }

        // Check assignments to sensitive keys (matchAll leaves lastIndex untouched)
        for (const match of code.matchAll(this.envVarPattern)) {
            const keyName = match[1];
            findings.push({
                type: 'Sensitive configuration item',
                key: keyName,
                detail: `Assignment to sensitive configuration item "${keyName}" detected`,
                // The pattern is case-insensitive, so normalize before comparing
                severity: keyName.toLowerCase().includes('password') ? 'critical' : 'high',
            });
        }

        // Check database connection strings
        for (const _match of code.matchAll(this.connStrPattern)) {
            findings.push({
                type: 'Database connection string',
                detail: 'Connection string with an embedded password detected',
                severity: 'critical',
            });
        }

        return findings;
    }

    // Replace sensitive values with a placeholder
    sanitizeCode(code) {
        let result = code;
        // Replace only the value; keep the key, separator, and quotes so the code stays valid
        result = result.replace(this.envVarPattern, (match, key, sep, quote) => {
            return `${key}${sep}${quote}******${quote}`;
        });
        // Replace only the password part of connection strings
        result = result.replace(this.connStrPattern, '$1******$3');
        return result;
    }

    // Before Hook: detect and sanitize a code snippet
    async beforeHook(input) {
        const findings = this.detectInCode(input);
        if (findings.length > 0) {
            console.warn('[Code security] Sensitive configuration detected:');
            findings.forEach(f => console.warn(`  [${f.severity}] ${f.detail}`));
            // Mask automatically
            return this.sanitizeCode(input);
        }
        return input;
    }

    // Git diff scan: detect keys that are about to be committed
    scanGitDiff(diffText) {
        // Only added lines ("+" prefix) matter; skip the "+++ b/file" headers
        const addedLines = diffText
            .split('\n')
            .filter(line => line.startsWith('+') && !line.startsWith('+++'))
            .join('\n');
        const findings = this.detectInCode(addedLines);
        if (findings.length > 0) {
            console.error(
                '[Git security] Warning: newly added sensitive configuration detected; use environment variables instead!'
            );
            findings.forEach(f => console.error(`  Finding: ${f.detail}`));
        }
        return findings;
    }
}

// Usage example (top-level await requires an ES module)
const configFilter = new CodeConfigFilter();

// Check a code snippet
const codeSnippet = `
const db = mysql.createConnection({
    host: 'localhost',
    user: 'root',
    password: 'MyRealPassword123!',
    database: 'production'
});
`;
const safeCode = await configFilter.beforeHook(codeSnippet);
// The password value has been replaced with ******

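Beware of Git leaks: once a hard-coded key is committed to a Git repository, it stays in the Git history even if a later commit deletes it, unless the history is rewritten; a key that has been pushed should be treated as compromised and rotated. Consider installing tools such as git-secrets or talisman as pre-commit hooks so that newly added keys are scanned before each commit, and running the same scan in the CI/CD pipeline as a backstop. In addition, .gitignore should always include .env, and configuration file templates that are committed should contain placeholders rather than real values in their sensitive fields.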
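
For projects that cannot adopt a dedicated tool right away, the core of a pre-commit check can be sketched in a few lines. This is an illustration only: scanAddedLines and SECRET_PATTERNS are hypothetical names, and the two patterns are examples rather than a complete rule set. The function takes the diff text as input, so it can be tried without touching a repository.

```javascript
// Assumed patterns for illustration; in practice reuse the detector's own patterns.
const SECRET_PATTERNS = {
    'AWS Access Key': /AKIA[0-9A-Z]{16}/,
    'Hardcoded password': /\b(?:password|passwd|pwd|secret)\s*[=:]\s*['"][^'"\s]{4,}['"]/i,
};

// Scan only the lines a diff adds ("+" prefix), tracking which file they belong to.
function scanAddedLines(diffText) {
    const findings = [];
    let currentFile = null;
    for (const line of diffText.split('\n')) {
        if (line.startsWith('+++ ')) {
            // Header such as "+++ b/config.js": remember the file, do not scan it
            currentFile = line.slice(4).replace(/^b\//, '');
            continue;
        }
        if (!line.startsWith('+')) continue; // context and removed lines are ignored
        for (const [type, pattern] of Object.entries(SECRET_PATTERNS)) {
            if (pattern.test(line)) {
                findings.push({ file: currentFile, type });
            }
        }
    }
    return findings;
}

// One possible wiring inside a Node-based .git/hooks/pre-commit script:
//   const { execFileSync } = require('node:child_process');
//   const diff = execFileSync('git', ['diff', '--cached', '--unified=0'], { encoding: 'utf8' });
//   if (scanAddedLines(diff).length > 0) process.exit(1); // non-zero exit aborts the commit
```

The findings deliberately carry only the file name and the rule that fired, not the matched text, so the hook's own output does not repeat the secret.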

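5. Post-Review Hook for AI Output (after)

The before Hook mainly keeps sensitive information in user input from being sent to the AI, but one scenario is easy to overlook: the AI may "recall" sensitive information from earlier in the conversation and repeat it in a reply (especially with long context windows), or a generated code example may have its key placeholders filled in with real values. The post-review Hook deals with these cases.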

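Typical post-review scenarios

Context leakage: the AI extracts a key the user supplied earlier in the conversation and outputs it again
Unintended inclusion in generated code: a configuration file generated by the AI is filled in with sensitive values
Leakage through logging: logging code suggested by the AI may write sensitive information to log files
Sensitive URL parameters: links or API call URLs generated by the AI may contain tokens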

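// Post-review Hook for AI output
class OutputReviewHook {
    constructor(options = {}) {
        this.reviewPatterns = options.reviewPatterns || {
            // Reuse the API key detection patterns
            ...new ApiKeyDetector().keyPatterns,
            // Additional patterns used only for output review
            'Sensitive file path': /\/home\/[^\/]+\/\.(ssh|aws|config|docker)\//,
            // Full private IPv4 addresses in the 10/8, 172.16/12, and 192.168/16 ranges
            'Internal network address': /\b(?:10\.\d{1,3}|172\.(?:1[6-9]|2\d|3[01])|192\.168)\.\d{1,3}\.\d{1,3}\b/,
            'Debug token': /(?:Bearer|token)\s+[A-Za-z0-9_\-]{20,}/i,
        };
        this.replacementStrategy = options.replacementStrategy || 'warn';
        // 'warn': warn only, do not modify
        // 'replace': replace automatically with a masked value
        // 'block': do not show the output at all
    }

    // Global copy of a pattern that keeps its own flags (for example, case sensitivity)
    toGlobal(pattern) {
        if (!(pattern instanceof RegExp)) return new RegExp(pattern, 'gi');
        const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
        return new RegExp(pattern.source, flags);
    }

    // Review the AI output
    async afterHook(output) {
        const findings = [];
        for (const [name, pattern] of Object.entries(this.reviewPatterns)) {
            const matches = output.match(this.toGlobal(pattern));
            if (matches) {
                findings.push({
                    type: name,
                    count: matches.length,
                    severity: this.judgeSeverity(name, matches),
                });
            }
        }

        if (findings.length === 0) return output;

        console.warn('[Output review] Potentially sensitive information detected in the AI reply:');
        findings.forEach(f =>
            console.warn(`  [${f.severity}] ${f.type} (${f.count} found)`)
        );

        switch (this.replacementStrategy) {
            case 'replace':
                return this.applyReplacements(output);
            case 'block':
                throw new Error(
                    'The AI reply was blocked by post-review: potentially sensitive information detected'
                );
            case 'warn':
            default:
                // Append a security notice; the original content is not modified
                return output + `

---
⚠️ Security notice: potentially sensitive information was detected in the reply above. Mask it before copying.
Detected: ${findings.map(f => `${f.type} (${f.count})`).join(', ')}`;
        }
    }

    judgeSeverity(type, matches) {
        // Severity depends on the pattern type, then on the match length
        const criticalTypes = [
            'OpenAI API Key', 'AWS Access Key', 'SSH Private Key'
        ];
        if (criticalTypes.includes(type)) return 'critical';
        if (matches.some(m => m.length > 30)) return 'high';
        return 'medium';
    }

    // Mask every match of every review pattern
    applyReplacements(output) {
        let result = output;
        for (const pattern of Object.values(this.reviewPatterns)) {
            result = result.replace(this.toGlobal(pattern), (match) => {
                if (match.length > 8) {
                    return match.slice(0, 4) + '****' + match.slice(-4);
                }
                return '******';
            });
        }
        return result;
    }
}

// Usage example (top-level await requires an ES module)
const outputReview = new OutputReviewHook({ replacementStrategy: 'replace' });
const aiResponse = `Use the following configuration to connect to the database:
postgres://admin:MyRealPassword@prod-db.example.com:5432/mydb
API key: sk-ABC123DEF456GHI789JKLMNO`;

const safeResponse = await outputReview.afterHook(aiResponse);
// The API key is masked (sk-A****LMNO). The default reviewPatterns contain no
// connection-string pattern, so the database password is NOT masked here;
// pass a custom reviewPatterns object to cover it.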

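Limitations of post-review: post-review cannot undo the fact that sensitive information has already been sent over the network to the AI provider, so it is no substitute for pre-filtering. It works better as a safety net that catches what pre-filtering missed, or that detects sensitive content newly produced in the AI output. For high-security scenarios, enable both pre-filtering and post-review to form a two-layer defense.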
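
The context-leakage scenario in this section can also be handled without pattern matching on the output: remember the exact secrets seen in user input and look for verbatim echoes in the reply. The sketch below assumes a hypothetical EchoGuard class; the remembered values live only in local memory, and it only catches exact repeats, not paraphrased or re-encoded secrets.

```javascript
// Hypothetical helper: remembers secrets seen in input and redacts verbatim echoes.
class EchoGuard {
    constructor(patterns) {
        this.patterns = patterns;     // array of RegExp describing secrets
        this.seenSecrets = new Set(); // raw values, kept in local memory only
    }

    // Before hook: remember every secret-looking value in the input.
    rememberFrom(input) {
        for (const pattern of this.patterns) {
            const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
            for (const match of input.matchAll(new RegExp(pattern.source, flags))) {
                this.seenSecrets.add(match[0]);
            }
        }
        return input;
    }

    // After hook: replace any remembered value the AI repeats verbatim.
    redactEchoes(output) {
        let result = output;
        let echoed = 0;
        for (const secret of this.seenSecrets) {
            if (result.includes(secret)) {
                echoed += 1;
                result = result.split(secret).join('[REDACTED]');
            }
        }
        return { output: result, echoed };
    }
}

// Usage
const guard = new EchoGuard([/sk-[A-Za-z0-9]{20,}/]);
guard.rememberFrom('my key is sk-abcdefghijklmnopqrstuvwx');
const reviewed = guard.redactEchoes('You sent sk-abcdefghijklmnopqrstuvwx earlier.');
// reviewed.output is 'You sent [REDACTED] earlier.' and reviewed.echoed is 1
```

This is most useful in 'warn' mode, where the user may have chosen to send a message that contains a key and the reply should still not display it again.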

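6. Combining the Layers into One Filtering Strategy

Combining all of the Hooks above into a single filter chain gives broad coverage. The following code shows how to chain them into one unified filtering pipeline.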

// Complete multi-layer filter chain
class SensitiveInfoFilterChain {
    constructor() {
        this.beforeHooks = [];
        this.afterHooks = [];
        this.initDefaultHooks();
    }

    initDefaultHooks() {
        // 1. API key/token detection (blocks unless the user confirms)
        this.beforeHooks.push(new ApiKeyDetector());
        // 2. PII masking (automatic)
        this.beforeHooks.push(new PIIDetector());
        // 3. Code sensitive-configuration filtering (automatic masking)
        this.beforeHooks.push(new CodeConfigFilter());
        // 4. Post-review of AI output (replace mode)
        this.afterHooks.push(new OutputReviewHook({
            replacementStrategy: 'replace'
        }));
    }

    // Register a custom before Hook
    registerBeforeHook(hook) {
        this.beforeHooks.push(hook);
    }

    // Register a custom after Hook
    registerAfterHook(hook) {
        this.afterHooks.push(hook);
    }

    // Run the complete filtering flow
    async execute(userInput) {
        console.info('[Filter chain] Starting pre-filtering...');

        let processedInput = userInput;
        for (const hook of this.beforeHooks) {
            try {
                processedInput = await hook.beforeHook(processedInput);
                if (processedInput === null) {
                    console.error('[Filter chain] Input was blocked');
                    return { blocked: true, reason: 'Blocked by the pre-filter chain' };
                }
            } catch (error) {
                console.error(`[Filter chain] Hook error: ${error.message}`);
                return { blocked: true, reason: error.message };
            }
        }

        console.info('[Filter chain] Pre-filtering done, sending to the AI...');

        // Simulated AI processing
        const aiResponse = await this.callAI(processedInput);

        console.info('[Filter chain] Starting post-review...');

        let processedOutput = aiResponse;
        for (const hook of this.afterHooks) {
            try {
                processedOutput = await hook.afterHook(processedOutput);
            } catch (error) {
                console.error(`[Filter chain] Blocked by post-review: ${error.message}`);
                return {
                    blocked: true,
                    reason: error.message,
                    partialOutput: processedOutput,
                };
            }
        }

        console.info('[Filter chain] Filtering flow complete');
        return { blocked: false, output: processedOutput };
    }

    // Simulated AI call
    async callAI(input) {
        // The real AI API call goes here
        return `Mock AI reply: ${input}`;
    }

    // Summary of the current filter configuration
    getConfigSummary() {
        return {
            beforeHooks: this.beforeHooks.map(h => h.constructor.name),
            afterHooks: this.afterHooks.map(h => h.constructor.name),
            totalHooks: this.beforeHooks.length + this.afterHooks.length,
        };
    }
}

// Usage example (top-level await requires an ES module)
const filterChain = new SensitiveInfoFilterChain();
console.info('Current filter chain configuration:', filterChain.getConfigSummary());

// The detectors are pattern-based: a key=value assignment is caught, whereas a
// free-form sentence such as "my password is ..." would pass through unmasked
const userMsg = 'My config has db_password=MyDB@2024! and my email is user@example.com';
const result = await filterChain.execute(userMsg);
if (!result.blocked) {
    console.log('Safe output:', result.output);
}

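Architecture summary: a sensitive information filtering system should use a pipes-and-filters architecture, where each Hook handles one specific kind of detection and the Hooks are composed into a complete filter chain. This makes the system easy to extend: adding a new detection type only requires creating a new Hook and registering it with the chain, with no changes to existing code. Each Hook's handling strategy (warn/block/mask) should also be configurable independently, to meet different security levels.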
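
The extension point can be shown in isolation. In the sketch below, runBeforeHooks is a stripped-down stand-in for the chain's pre-filter loop, and ssnMaskHook and privateKeyBlockHook are hypothetical Hooks; the SSN pattern is a simple illustrative format check, not a validated rule. The only contract is an async beforeHook(input) that returns the (possibly modified) input, or null to block.

```javascript
// Minimal stand-in for the pre-filter loop of a filter chain.
async function runBeforeHooks(hooks, input) {
    let current = input;
    for (const hook of hooks) {
        current = await hook.beforeHook(current);
        if (current === null) {
            return { blocked: true, by: hook.name };
        }
    }
    return { blocked: false, input: current };
}

// A new detection type is just another object with the same interface.
const ssnMaskHook = {
    name: 'SsnMaskHook',
    async beforeHook(input) {
        // Illustrative US SSN shape: 3-2-4 digits
        return input.replace(/\b\d{3}-\d{2}-\d{4}\b/g, '***-**-****');
    },
};

// A Hook with a different strategy: block instead of mask.
const privateKeyBlockHook = {
    name: 'PrivateKeyBlockHook',
    async beforeHook(input) {
        return /-----BEGIN [A-Z ]*PRIVATE KEY-----/.test(input) ? null : input;
    },
};

// Usage: the loop itself never changes when a Hook is added or removed.
runBeforeHooks([ssnMaskHook, privateKeyBlockHook], 'SSN 123-45-6789')
    .then((result) => console.log(result));
```

Because each Hook decides its own strategy, masking and blocking Hooks can be mixed in one chain, which is what independent per-Hook configuration amounts to in practice.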

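7. Key Takeaways

Hook type	When it runs	Main function	Suggested strategy
API key/token detection	Before	Detect key formats and prevent leaks	block
PII detection and masking	Before	Identify and mask personally identifiable information	mask (automatic)
Code sensitive-configuration filtering	Before	Detect hard-coded configuration items	mask + warn
AI output post-review	After	Review AI replies for sensitive content	replace

Core principle: trust no input, and trust no output. Sensitive information filtering is not optional; it is a necessary safeguard when using AI programming tools. Even on personal projects, good filtering habits avoid a lot of unnecessary risk.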

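8. Further Considerations

Performance: on large blocks of text or code, regular-expression matching can add noticeable overhead. One option is a multi-pattern algorithm such as Aho-Corasick as a prefilter on fixed prefixes (sk-, AKIA, ghp_), since it matches literal strings rather than regular expressions; another is to run detection asynchronously in a Web Worker (or worker_threads in Node.js) so it does not block the main thread.
False positives: a filter Hook has to balance security against user experience. Too many false positives lead users to turn filtering off. Consider assigning each finding a confidence score, and block or warn only above a threshold.
Persistent configuration: users may want to choose which patterns are filtered and which are exempt. A per-user configuration file (for example .sensitive-filter-config.json) makes the filtering rules more flexible.
Audit and compliance: in enterprise environments, sensitive information filtering should be integrated with the audit system, recording every detection event (but never the original sensitive value) to support security compliance requirements such as SOC2 and ISO 27001.
Locale support: beyond the PII formats covered here, the Hook can be extended with identity formats from other countries and regions (such as the US SSN, the UK NINO, or Japan's My Number) for international use.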
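
As one possible reading of the confidence-score idea above, the sketch below combines three signals: a known key prefix, length, and Shannon entropy (random keys tend to have higher per-character entropy than words or placeholders). The function names, weights, and thresholds are all assumptions chosen for illustration and have not been tuned on real data.

```javascript
// Shannon entropy in bits per character.
function shannonEntropy(text) {
    const chars = [...text];
    if (chars.length === 0) return 0;
    const counts = new Map();
    for (const ch of chars) {
        counts.set(ch, (counts.get(ch) || 0) + 1);
    }
    let entropy = 0;
    for (const count of counts.values()) {
        const p = count / chars.length;
        entropy -= p * Math.log2(p);
    }
    return entropy;
}

// Hypothetical scoring in [0, 1]: the weights are illustrative, not tuned.
function scoreFinding(matched, { knownPrefix = false } = {}) {
    let score = 0;
    if (knownPrefix) score += 0.4;            // e.g. matched a vendor-specific prefix
    if (matched.length >= 20) score += 0.2;   // long values are more key-like
    score += Math.min(shannonEntropy(matched) / 5, 1) * 0.4;
    return Math.min(score, 1);
}

// Map a score to an action; thresholds are assumptions as well.
function decide(score, { warnAt = 0.5, blockAt = 0.8 } = {}) {
    if (score >= blockAt) return 'block';
    if (score >= warnAt) return 'warn';
    return 'allow';
}

// Usage
console.log(decide(scoreFinding('your-api-key-here')));
console.log(decide(scoreFinding('sk-x7Kq9LmZ2pR4tVw8YbN3cD6f', { knownPrefix: true })));
```

With these example weights the placeholder falls below the warning threshold while the random-looking key with a known prefix lands in the blocking range; real deployments would calibrate both against their own false-positive data.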