Garbled filenames in the template directory on Linux

Source: original to this site. Published: November 12, 2025

Problem: On a Linux system, some filenames in the template directory are displayed as garbled characters.

Solution:

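Linux itself does not restrict the encoding of filenames; it stores them as raw bytes. Terminals and shells, however, interpret those bytes as UTF-8 by default. If a filename was originally created in another encoding (such as GBK), it is displayed as garbled characters in a UTF-8 terminal.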
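To make the mismatch concrete, here is a minimal sketch of how the same name looks as bytes under GBK and under UTF-8. The sample name 中文.cshtml is an illustrative assumption, not a file from the site.

```python
# Illustrative only: the sample name below is made up for this sketch.
name = "\u4e2d\u6587.cshtml"        # "中文.cshtml"

gbk_bytes = name.encode("gbk")      # bytes a GBK-based system would store
utf8_bytes = name.encode("utf-8")   # bytes a UTF-8 system would store


def is_valid_utf8(raw):
    """Return True if the raw bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(utf8_bytes))    # True
print(is_valid_utf8(gbk_bytes))     # False
print(gbk_bytes)                    # the raw GBK bytes of the name

# Python represents undecodable filename bytes as lone surrogates
# (the "surrogateescape" error handler). The mapping is reversible,
# so the original bytes can always be recovered from such a string.
escaped = gbk_bytes.decode("utf-8", errors="surrogateescape")
recovered = escaped.encode("utf-8", errors="surrogateescape")
print(recovered == gbk_bytes)       # True

# Decoding the recovered bytes with the encoding they were created in
# yields the intended name again.
print(recovered.decode("gbk") == name)  # True
```

The round trip through surrogateescape is what makes a scripted repair possible: the bytes on disk are intact, only their interpretation is wrong.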

1: List all files and identify garbled names manually

Check a single directory for garbled filenames

ls -la /data/new/webfuture/Views/jyjy/Pentition

Check the output for garbled filenames.
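As a cross-check on the visual inspection, the following is a minimal sketch that lists one directory in bytes mode and reports the names that are not valid UTF-8. The function name find_non_utf8_names is hypothetical, and the directory is the one from the ls example.

```python
import os


def find_non_utf8_names(directory):
    """Return the raw (bytes) names in `directory` that are not valid UTF-8."""
    bad = []
    # Passing a bytes path makes os.listdir return the names as raw bytes.
    for raw in os.listdir(os.fsencode(directory)):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(raw)
    return sorted(bad)


if __name__ == "__main__":
    # Same directory as in the ls example; adjust as needed.
    target = "/data/new/webfuture/Views/jyjy/Pentition"
    if os.path.isdir(target):
        for raw in find_non_utf8_names(target):
            print(raw)  # printed as a bytes literal, e.g. b'\xd6\xd0...'
```

Only the entries of that one directory are checked; subdirectories are not descended into.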

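Batch-check the entire website directory for garbled filenames

Using the /data/www/webfuture directory as an example, save the script below as, for instance, /data/find.sh, make it executable (chmod +x /data/find.sh), then run it in a terminal: /data/find.sh > /data/find.txt. Finally, inspect /data/find.txt.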

#!/bin/bash

TARGET_DIR="/data/www/webfuture"

# Make sure the directory exists
if [ ! -d "$TARGET_DIR" ]; then
    echo "Error: Directory $TARGET_DIR does not exist."
    exit 1
fi

echo "Scanning for invalid UTF-8 filenames under $TARGET_DIR ..."

export LC_ALL=C  # make sure the raw bytes are not affected by the locale

find "$TARGET_DIR" -type f -print0 | \
while IFS= read -r -d '' file; do
    # Check whether the file path is valid UTF-8
    if ! printf "%s" "$file" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
        echo "INVALID UTF-8 FILENAME: $file"
    fi
done
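The shell script above reports only regular files (find -type f). Before any renaming, it can help to see directories as well, together with what each name would become if it really is GBK. The following is a rough sketch under that assumption; scan_tree is a hypothetical helper, and it changes nothing on disk.

```python
import os


def scan_tree(root):
    """Yield (raw_path, gbk_guess) for every entry whose name is not valid UTF-8.

    gbk_guess is the name decoded as GBK, or None if it is not valid GBK either.
    """
    # Walking a bytes path yields directory and file names as raw bytes.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for raw in dirnames + filenames:
            try:
                raw.decode("utf-8")
                continue  # valid UTF-8, nothing to report
            except UnicodeDecodeError:
                pass
            try:
                guess = raw.decode("gbk")
            except UnicodeDecodeError:
                guess = None  # neither UTF-8 nor GBK; needs manual review
            yield os.path.join(dirpath, raw), guess


if __name__ == "__main__":
    target = "/data/www/webfuture"  # same directory as the shell script
    if os.path.isdir(target):
        for raw_path, guess in scan_tree(target):
            print(raw_path, "->", guess)
```

Entries whose guess is None are not GBK and would not be handled by a GBK-based repair.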

2: Fix all files with a script

#!/usr/bin/env python3
import os
import sys

def safe_print(s):
    """Safely print a string that may contain surrogates."""
    try:
        print(s)
    except UnicodeEncodeError:
        # Replace the surrogates with '?' and print again
        cleaned = s.encode('utf-8', errors='replace').decode('utf-8', errors='replace')
        print(cleaned)

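def fix_gbk_surrogate_filenames(root_dir):
    # Walk bottom-up with os.walk (subdirectories before their parents) so that
    # renaming a directory does not invalidate paths that are still to be visited
    for dirpath, dirnames, filenames in os.walk(root_dir, topdown=False):
        # Handle files first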
        for name in filenames:
            if not name.endswith('.cshtml'):
                continue
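            try:
                # Try a normal encode: if it succeeds, the name is valid UTF-8, so skip it
                name.encode('utf-8')
                continue
            except UnicodeEncodeError:
                # The name contains surrogates and needs fixing
                try:
                    # Recover the original bytes (still GBK-encoded)
                    raw_bytes = name.encode('utf-8', errors='surrogateescape')
                    # Decode them as GBK to get the correct Chinese name
                    fixed_name = raw_bytes.decode('gbk')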
                    old_path = os.path.join(dirpath, name)
                    new_path = os.path.join(dirpath, fixed_name)
                    safe_print(f"Fixing file: {old_path} → {new_path}")
                    os.rename(old_path, new_path)
                except (UnicodeDecodeError, OSError) as e:
                    safe_print(f"Failed to fix file: {name} | Error: {e}")

        # Then handle directories (same approach)
        for name in dirnames:
            try:
                name.encode('utf-8')
                continue
            except UnicodeEncodeError:
                try:
                    raw_bytes = name.encode('utf-8', errors='surrogateescape')
                    fixed_name = raw_bytes.decode('gbk')
                    old_path = os.path.join(dirpath, name)
                    new_path = os.path.join(dirpath, fixed_name)
                    safe_print(f"Fixing DIR : {old_path} → {new_path}")
                    os.rename(old_path, new_path)
                except (UnicodeDecodeError, OSError) as e:
                    safe_print(f"Failed to fix DIR: {name} | Error: {e}")

if __name__ == "__main__":
    fix_gbk_surrogate_filenames("/data/www/webfuture")

Save the script above as fix_names.py, make it executable, then run python3 fix_names.py to convert the names.
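One caveat with the script above: on Linux, os.rename silently replaces an existing file at the target path. The following is a hedged sketch of a more cautious per-entry rename that supports a dry run and refuses to overwrite. The helper rename_gbk_name and its dry_run parameter are hypothetical additions, not part of the original script, and it works on bytes paths.

```python
import os


def rename_gbk_name(dirpath, raw_name, dry_run=True):
    """Re-encode one GBK-encoded name (bytes) as UTF-8.

    Returns the new raw path, or None if the entry is skipped.
    """
    try:
        raw_name.decode("utf-8")
        return None  # already valid UTF-8, leave it alone
    except UnicodeDecodeError:
        pass
    try:
        fixed = raw_name.decode("gbk").encode("utf-8")
    except UnicodeDecodeError:
        return None  # not GBK either, leave it for manual review
    old_path = os.path.join(dirpath, raw_name)
    new_path = os.path.join(dirpath, fixed)
    if os.path.lexists(new_path):
        return None  # target already exists, do not overwrite it
    if not dry_run:
        os.rename(old_path, new_path)
    return new_path
```

Calling it with dry_run=True first shows what would be renamed; a second pass with dry_run=False performs the rename. Taking a backup of the directory before the real pass is a sensible precaution.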


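# Check whether any filenames with surrogates remain
python3 -c "
import os
for root, dirs, files in os.walk('/data/www/webfuture'):
    for f in files:
        if f.endswith('.cshtml'):
            try:
                f.encode('utf-8')
            except UnicodeEncodeError:
                # ascii() keeps print from failing on the surrogates
                print('Still broken:', ascii(os.path.join(root, f)))
"

# Normally there should be no output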

Attachment (the scripts used): 所用脚本.rar