SLTrans Quick Tutorial

git lfs install # make sure git-lfs is installed
git clone https://hf-mirror.com/datasets/UKPLab/SLTrans
cd SLTrans
git lfs pull
git checkout main
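Before going further, it can help to confirm that `git lfs pull` really fetched the data. The following is a minimal sketch, not part of the original tutorial: it assumes the Parquet files in this repository are tracked by Git LFS, and it relies on the fact that an un-downloaded LFS file is a small text pointer whose first line starts with `version https://git-lfs.github.com/spec/v1`. The helper names `is_lfs_pointer` and `find_unpulled_parquet` are made up for illustration.

```python
from pathlib import Path

# First line of every Git LFS pointer file, as defined by the Git LFS spec.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file is still an un-downloaded Git LFS pointer."""
    with open(path, "rb") as fh:
        return fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_unpulled_parquet(directory="."):
    """List .parquet files under `directory` that are still LFS pointers."""
    return sorted(
        str(p)
        for p in Path(directory).rglob("*.parquet")
        if p.is_file() and is_lfs_pointer(p)
    )


if __name__ == "__main__":
    # Hypothetical usage: run from inside the cloned SLTrans directory.
    missing = find_unpulled_parquet(".")
    if missing:
        print("Still LFS pointers (try `git lfs pull` again):")
        for name in missing:
            print("  " + name)
    else:
        print("No LFS pointer files found among the Parquet files.")
```

If the check lists any files, those files would fail to load as Parquet, so it is worth re-running `git lfs pull` before continuing.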
Then put the following script into that directory:
import os
import pandas as pd
def list_parquet_files(directory):
    """Recursively list all Parquet files in a directory and print their paths."""
    print(f"Searching directory: {directory}")
    parquet_files = []
    for root, _, files in os.walk(directory):
        for file in files:
            if file.endswith(".parquet"):
                parquet_files.append(os.path.join(root, ...