SPEC2017
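SPEC2017 Testing Guide v1.0
Quick Start and Installation
Download SPEC2017
https://pan.baidu.com/s/1kMoMJ5Ufg5oZql4HjyacAg&pwd=5thr
Upload
Upload cpu2017-1.0.5.iso via sftp to a repository path of your choice on the test machine.
Create the installation directory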
mkdir -p /home/speccpu2017
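Mount the image
mount cpu2017-1.0.5.iso /mnt/
Install from the image into the home directory
cd /mnt/ && ./install.sh
When prompted for the installation directory, enter /home/speccpu2017, then enter yes to confirm.
Configure SPEC2017 (the cfg file is the configuration file; a manually built LLVM is the test target here, see Appendix B)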
cd /home/speccpu2017/ && source shrc && cd config && cp Example-clang-llvm-linux-x86.cfg x86-llvm.cfg
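Edit the relevant parameters
vim x86-llvm.cfg
Check every item marked EDIT (the edited file is shown in Appendix B). Some values can be read from the output of the lscpu command. Note that the OpenMP and Fortran settings must be configured!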
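The `cpucores`, `cputhreads`, and `numanodes` values marked EDIT in the config can be derived from `lscpu` instead of being read off by eye. The sketch below is one possible way to do that, not part of SPEC: the helper name `cfg_sizes_from_lscpu` is hypothetical, and it assumes the English field labels that util-linux `lscpu` prints under `LC_ALL=C`.

```shell
#!/bin/bash
# Hypothetical helper: read `lscpu` text on stdin and print the three
# %define lines used in the "How Many CPUs?" section of the config file.
cfg_sizes_from_lscpu() {
    awk -F: '
        {
            # Newer lscpu versions indent some labels, so trim both parts.
            key = $1; val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
        }
        key == "CPU(s)"             { threads = val }
        key == "Core(s) per socket" { cores_per_socket = val }
        key == "Socket(s)"          { sockets = val }
        key == "NUMA node(s)"       { nodes = val }
        END {
            printf "%%define cpucores %d\n",   cores_per_socket * sockets
            printf "%%define cputhreads %d\n", threads
            printf "%%define numanodes %d\n",  nodes
        }'
}

# On the test machine:
#   LC_ALL=C lscpu | cfg_sizes_from_lscpu
```

The printed lines can be pasted over the corresponding `%define` lines in the cfg file; the memory and affinity settings still need to be checked by hand.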
Basic Usage
Dependencies
numactl, numactl-libs, perl, gcc, gfortran
Outdated source code
Some outdated source files need to be patched:
${SPEC2017_source}/benchspec/CPU/510.parest_r/src/source/base/parameter_handler.cc
Line 750 should read: AssertThrow ((s.c_str()[0] != '\0') || (*endptr == '\0'),
${SPEC2017_source}/benchspec/CPU/527.cam4_r/src/mpi.c
Open the file in vim and run:
:%s/FORT_NAME/int FORT_NAME/g
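The same substitution can be applied non-interactively with `sed`. This is a sketch under assumptions: the function name `patch_fort_name` is hypothetical, and the guard that skips already-patched files is an addition so that a second run does not produce `int int FORT_NAME`.

```shell
#!/bin/bash
# Hypothetical helper: prefix every FORT_NAME occurrence with "int",
# mirroring the vim command :%s/FORT_NAME/int FORT_NAME/g.
patch_fort_name() {
    local file=$1
    # Skip files that already contain the patched form.
    if grep -q 'int FORT_NAME' "$file"; then
        echo "already patched: $file"
        return 0
    fi
    # Edit in place and keep a backup copy with the suffix .orig.
    sed -i.orig 's/FORT_NAME/int FORT_NAME/g' "$file"
}

# Usage (path taken from the section above):
#   patch_fort_name "${SPEC2017_source}/benchspec/CPU/527.cam4_r/src/mpi.c"
```

The `.orig` backup makes it easy to diff the change or restore the original file.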
Build the selected benchmarks
runcpu --config=x86-llvm.cfg --action=build all
The target can be fprate, fpspeed, intrate, intspeed, or all.
Run the performance test
runcpu --config=x86-llvm.cfg -s specspeed_int_base
Build and run all benchmarks
runcpu --config=x86-llvm-yz-basic.cfg --size=ref --tune=base --action=run all
Build and run a single benchmark
runcpu --config=x86-llvm-yz-basic.cfg --tune=base 628
(628 can also be written as the full benchmark name.)
Unattended full run with base tuning
nohup time -p runcpu --tune=base --rebuild --config=x86-llvm-yz-basic.cfg all >log 2>&1 &
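* Note: base is the baseline tuning and peak is the aggressive tuning; peak builds currently still hit compile bugs.
Advanced Usage
Combining with perf (note the Perl version: runcpu is written in Perl)
perf stat -e instructions,OTHER_EVENTS runcpu --config=x86-llvm-yz.cfg --size=test --iterations=1 --fake --tune=base 619.lbm_s
Replace OTHER_EVENTS with any further event names. This approach also counts the build process in the measurement (building first with --action=build, as above, mitigates this).
For per-benchmark measurements with perf, the following script is recommended.
How the commands were found: the run commands appear in the output of runcpu --config=x86-llvm-yz-basic.cfg --size=ref --tune=base --action=run all > 1 (that is, in the file named 1).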
#!/bin/bash
# Remove all existing results and build artifacts
rm -f /data/speccpu2017/result/*
cd /data/speccpu2017 && rm -f nohup.out perf_result.txt clean.log build.log
# find ./benchspec/CPU -type d -name "run_base*mytest-m64.*" -exec rm -rf {} +
runcpu --action=clean --tune=base --config=x86-llvm-yz-basic.cfg all > /data/speccpu2017/clean.log 2>&1
# Build all benchmarks
runcpu --action=build --tune=base --rebuild --config=x86-llvm-yz-basic.cfg all > /data/speccpu2017/build.log 2>&1
# 557.xz_r has a checksum problem and cannot be extracted and run on its own
output_file="/data/speccpu2017/perf_result.txt"
# Run a benchmark under perf stat, pinned with numactl.
# 0<&- comes before the command so that stdin is closed by default, while a
# "< file" redirection inside the command (bwaves, roms) still takes effect.
run_with_perf_numactl() {
    local benchmark_name=$1
    local command=$2
    echo "Running $benchmark_name..." >> "$output_file"
    perf stat -e cycles,instructions --append --output="$output_file" numactl -m 0 --physcpubind=39 bash -c "0<&- $command > /dev/null 2>> /data/speccpu2017/error.log"
    echo "" >> "$output_file"
}
# Run a benchmark under perf stat without numactl.
run_with_perf_direct() {
    local benchmark_name=$1
    local command=$2
    echo "Running $benchmark_name..." >> "$output_file"
    perf stat -e cycles,instructions --append --output="$output_file" bash -c "0<&- $command > /dev/null 2>> /dev/null"
    echo "" >> "$output_file"
}
######### SPECrate2017 Integer #########
# 500.perlbench_r
cd /data/speccpu2017/benchspec/CPU/500.perlbench_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "500.perlbench_r" "../run_base_refrate_mytest-m64.0000/perlbench_r_base.mytest-m64 -I./lib checkspam.pl 2500 5 25 11 150 1 1 1 1"
# 502.gcc_r
cd /data/speccpu2017/benchspec/CPU/502.gcc_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "502.gcc_r" "../run_base_refrate_mytest-m64.0000/cpugcc_r_base.mytest-m64 gcc-pp.c -O3 -finline-limit=0 -fif-conversion -fif-conversion2 -o gcc-pp.opts-O3_-finline-limit_0_-fif-conversion_-fif-conversion2.s"
# 505.mcf_r
cd /data/speccpu2017/benchspec/CPU/505.mcf_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "505.mcf_r" "../run_base_refrate_mytest-m64.0000/mcf_r_base.mytest-m64 inp.in"
# 520.omnetpp_r
cd /data/speccpu2017/benchspec/CPU/520.omnetpp_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "520.omnetpp_r" "../run_base_refrate_mytest-m64.0000/omnetpp_r_base.mytest-m64 -c General -r 0"
# 523.xalancbmk_r
cd /data/speccpu2017/benchspec/CPU/523.xalancbmk_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "523.xalancbmk_r" "../run_base_refrate_mytest-m64.0000/cpuxalan_r_base.mytest-m64 -v t5.xml xalanc.xsl"
# 525.x264_r
cd /data/speccpu2017/benchspec/CPU/525.x264_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "525.x264_r" "../run_base_refrate_mytest-m64.0000/x264_r_base.mytest-m64 --pass 1 --stats x264_stats.log --bitrate 1000 --frames 1000 -o BuckBunny_New.264 BuckBunny.yuv 1280x720"
# 531.deepsjeng_r
cd /data/speccpu2017/benchspec/CPU/531.deepsjeng_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "531.deepsjeng_r" "../run_base_refrate_mytest-m64.0000/deepsjeng_r_base.mytest-m64 ref.txt"
# 541.leela_r
cd /data/speccpu2017/benchspec/CPU/541.leela_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "541.leela_r" "../run_base_refrate_mytest-m64.0000/leela_r_base.mytest-m64 ref.sgf"
# 548.exchange2_r
cd /data/speccpu2017/benchspec/CPU/548.exchange2_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "548.exchange2_r" "../run_base_refrate_mytest-m64.0000/exchange2_r_base.mytest-m64 6"
######### SPECrate2017 Floating Point #########
# 503.bwaves_r
cd /data/speccpu2017/benchspec/CPU/503.bwaves_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "503.bwaves_r (bwaves_1)" "../run_base_refrate_mytest-m64.0000/bwaves_r_base.mytest-m64 bwaves_1 < bwaves_1.in"
run_with_perf_numactl "503.bwaves_r (bwaves_2)" "../run_base_refrate_mytest-m64.0000/bwaves_r_base.mytest-m64 bwaves_2 < bwaves_2.in"
run_with_perf_numactl "503.bwaves_r (bwaves_3)" "../run_base_refrate_mytest-m64.0000/bwaves_r_base.mytest-m64 bwaves_3 < bwaves_3.in"
run_with_perf_numactl "503.bwaves_r (bwaves_4)" "../run_base_refrate_mytest-m64.0000/bwaves_r_base.mytest-m64 bwaves_4 < bwaves_4.in"
# 507.cactuBSSN_r
cd /data/speccpu2017/benchspec/CPU/507.cactuBSSN_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "507.cactuBSSN_r" "../run_base_refrate_mytest-m64.0000/cactusBSSN_r_base.mytest-m64 spec_ref.par"
# 508.namd_r
cd /data/speccpu2017/benchspec/CPU/508.namd_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "508.namd_r" "../run_base_refrate_mytest-m64.0000/namd_r_base.mytest-m64 --input apoa1.input --output apoa1.ref.output --iterations 65"
# 510.parest_r
cd /data/speccpu2017/benchspec/CPU/510.parest_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "510.parest_r" "../run_base_refrate_mytest-m64.0000/parest_r_base.mytest-m64 ref.prm"
# 511.povray_r
cd /data/speccpu2017/benchspec/CPU/511.povray_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "511.povray_r" "../run_base_refrate_mytest-m64.0000/povray_r_base.mytest-m64 SPEC-benchmark-ref.ini"
# 519.lbm_r
cd /data/speccpu2017/benchspec/CPU/519.lbm_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "519.lbm_r" "../run_base_refrate_mytest-m64.0000/lbm_r_base.mytest-m64 3000 reference.dat 0 0 100_100_130_ldc.of"
# 521.wrf_r
cd /data/speccpu2017/benchspec/CPU/521.wrf_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "521.wrf_r" "../run_base_refrate_mytest-m64.0000/wrf_r_base.mytest-m64"
# 526.blender_r
cd /data/speccpu2017/benchspec/CPU/526.blender_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "526.blender_r" "../run_base_refrate_mytest-m64.0000/blender_r_base.mytest-m64 sh3_no_char.blend --render-output sh3_no_char_ --threads 1 -b -F RAWTGA -s 849 -e 849 -a"
# 527.cam4_r
cd /data/speccpu2017/benchspec/CPU/527.cam4_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "527.cam4_r" "../run_base_refrate_mytest-m64.0000/cam4_r_base.mytest-m64"
# 538.imagick_r
cd /data/speccpu2017/benchspec/CPU/538.imagick_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "538.imagick_r" "../run_base_refrate_mytest-m64.0000/imagick_r_base.mytest-m64 -limit disk 0 refrate_input.tga -edge 41 -resample 181% -emboss 31 -colorspace YUV -mean-shift 19x19+15% -resize 30% refrate_output.tga"
# 544.nab_r
cd /data/speccpu2017/benchspec/CPU/544.nab_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "544.nab_r" "../run_base_refrate_mytest-m64.0000/nab_r_base.mytest-m64 1am0 1122214447 122"
# 549.fotonik3d_r
cd /data/speccpu2017/benchspec/CPU/549.fotonik3d_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "549.fotonik3d_r" "../run_base_refrate_mytest-m64.0000/fotonik3d_r_base.mytest-m64"
# 554.roms_r
cd /data/speccpu2017/benchspec/CPU/554.roms_r/run/run_base_refrate_mytest-m64.0000
run_with_perf_numactl "554.roms_r" "../run_base_refrate_mytest-m64.0000/roms_r_base.mytest-m64 < ocean_benchmark2.in.x"
######### SPECspeed2017 Integer #########
# 600.perlbench_s
cd /data/speccpu2017/benchspec/CPU/600.perlbench_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "600.perlbench_s (checkspam)" "../run_base_refspeed_mytest-m64.0000/perlbench_s_base.mytest-m64 -I./lib checkspam.pl 2500 5 25 11 150 1 1 1 1"
run_with_perf_direct "600.perlbench_s (diffmail)" "../run_base_refspeed_mytest-m64.0000/perlbench_s_base.mytest-m64 -I./lib diffmail.pl 4 800 10 17 19 300"
run_with_perf_direct "600.perlbench_s (splitmail)" "../run_base_refspeed_mytest-m64.0000/perlbench_s_base.mytest-m64 -I./lib splitmail.pl 6400 12 26 16 100 0"
# 602.gcc_s
cd /data/speccpu2017/benchspec/CPU/602.gcc_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "602.gcc_s (opts-O5_-fipa-pta)" "../run_base_refspeed_mytest-m64.0000/sgcc_base.mytest-m64 gcc-pp.c -O5 -fipa-pta -o gcc-pp.opts-O5_-fipa-pta.s"
run_with_perf_direct "602.gcc_s (opts-O5_-finline-limit_1000_-fselective-scheduling_-fselective-scheduling2)" "../run_base_refspeed_mytest-m64.0000/sgcc_base.mytest-m64 gcc-pp.c -O5 -finline-limit=1000 -fselective-scheduling -fselective-scheduling2 -o gcc-pp.opts-O5_-finline-limit_1000_-fselective-scheduling_-fselective-scheduling2.s"
run_with_perf_direct "602.gcc_s (opts-O5_-finline-limit_24000_-fgcse_-fgcse-las_-fgcse-lm_-fgcse-sm)" "../run_base_refspeed_mytest-m64.0000/sgcc_base.mytest-m64 gcc-pp.c -O5 -finline-limit=24000 -fgcse -fgcse-las -fgcse-lm -fgcse-sm -o gcc-pp.opts-O5_-finline-limit_24000_-fgcse_-fgcse-las_-fgcse-lm_-fgcse-sm.s"
# 605.mcf_s
cd /data/speccpu2017/benchspec/CPU/605.mcf_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "605.mcf_s" "../run_base_refspeed_mytest-m64.0000/mcf_s_base.mytest-m64 inp.in"
# 620.omnetpp_s
cd /data/speccpu2017/benchspec/CPU/620.omnetpp_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "620.omnetpp_s" "../run_base_refspeed_mytest-m64.0000/omnetpp_s_base.mytest-m64 -c General -r 0"
# 623.xalancbmk_s
cd /data/speccpu2017/benchspec/CPU/623.xalancbmk_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "623.xalancbmk_s" "../run_base_refspeed_mytest-m64.0000/xalancbmk_s_base.mytest-m64 -v t5.xml xalanc.xsl"
# 625.x264_s
cd /data/speccpu2017/benchspec/CPU/625.x264_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "625.x264_s (pass 1)" "../run_base_refspeed_mytest-m64.0000/x264_s_base.mytest-m64 --pass 1 --stats x264_stats.log --bitrate 1000 --frames 1000 -o BuckBunny_New.264 BuckBunny.yuv 1280x720"
run_with_perf_direct "625.x264_s (pass 2)" "../run_base_refspeed_mytest-m64.0000/x264_s_base.mytest-m64 --pass 2 --stats x264_stats.log --bitrate 1000 --dumpyuv 200 --frames 1000 -o BuckBunny_New.264 BuckBunny.yuv 1280x720"
run_with_perf_direct "625.x264_s (seek)" "../run_base_refspeed_mytest-m64.0000/x264_s_base.mytest-m64 --seek 500 --dumpyuv 200 --frames 1250 -o BuckBunny_New.264 BuckBunny.yuv 1280x720"
# 631.deepsjeng_s
cd /data/speccpu2017/benchspec/CPU/631.deepsjeng_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "631.deepsjeng_s" "../run_base_refspeed_mytest-m64.0000/deepsjeng_s_base.mytest-m64 ref.txt"
# 641.leela_s
cd /data/speccpu2017/benchspec/CPU/641.leela_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "641.leela_s" "../run_base_refspeed_mytest-m64.0000/leela_s_base.mytest-m64 ref.sgf"
# 648.exchange2_s
cd /data/speccpu2017/benchspec/CPU/648.exchange2_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "648.exchange2_s" "../run_base_refspeed_mytest-m64.0000/exchange2_s_base.mytest-m64 6"
######### SPECspeed2017 Floating Point #########
# 603.bwaves_s
cd /data/speccpu2017/benchspec/CPU/603.bwaves_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "603.bwaves_s (bwaves_1)" "../run_base_refspeed_mytest-m64.0000/speed_bwaves_base.mytest-m64 bwaves_1 < bwaves_1.in"
run_with_perf_direct "603.bwaves_s (bwaves_2)" "../run_base_refspeed_mytest-m64.0000/speed_bwaves_base.mytest-m64 bwaves_2 < bwaves_2.in"
# 607.cactuBSSN_s
cd /data/speccpu2017/benchspec/CPU/607.cactuBSSN_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "607.cactuBSSN_s" "../run_base_refspeed_mytest-m64.0000/cactuBSSN_s_base.mytest-m64 spec_ref.par"
# 619.lbm_s
cd /data/speccpu2017/benchspec/CPU/619.lbm_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "619.lbm_s" "../run_base_refspeed_mytest-m64.0000/lbm_s_base.mytest-m64 2000 reference.dat 0 0 200_200_260_ldc.of"
# 621.wrf_s
cd /data/speccpu2017/benchspec/CPU/621.wrf_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "621.wrf_s" "../run_base_refspeed_mytest-m64.0000/wrf_s_base.mytest-m64"
# 627.cam4_s
cd /data/speccpu2017/benchspec/CPU/627.cam4_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "627.cam4_s" "../run_base_refspeed_mytest-m64.0000/cam4_s_base.mytest-m64"
# 628.pop2_s
cd /data/speccpu2017/benchspec/CPU/628.pop2_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "628.pop2_s" "../run_base_refspeed_mytest-m64.0000/speed_pop2_base.mytest-m64"
# 638.imagick_s
cd /data/speccpu2017/benchspec/CPU/638.imagick_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "638.imagick_s" "../run_base_refspeed_mytest-m64.0000/imagick_s_base.mytest-m64 -limit disk 0 refspeed_input.tga -resize 817% -rotate -2.76 -shave 540x375 -alpha remove -auto-level -contrast-stretch 1x1% -colorspace Lab -channel R -equalize +channel -colorspace sRGB -define histogram:unique-colors=false -adaptive-blur 0x5 -despeckle -auto-gamma -adaptive-sharpen 55 -enhance -brightness-contrast 10x10 -resize 30% refspeed_output.tga"
# 644.nab_s
cd /data/speccpu2017/benchspec/CPU/644.nab_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "644.nab_s" "../run_base_refspeed_mytest-m64.0000/nab_s_base.mytest-m64 3j1n 20140317 220"
# 649.fotonik3d_s
cd /data/speccpu2017/benchspec/CPU/649.fotonik3d_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "649.fotonik3d_s" "../run_base_refspeed_mytest-m64.0000/fotonik3d_s_base.mytest-m64"
# 654.roms_s
cd /data/speccpu2017/benchspec/CPU/654.roms_s/run/run_base_refspeed_mytest-m64.0000
run_with_perf_direct "654.roms_s" "../run_base_refspeed_mytest-m64.0000/sroms_base.mytest-m64 < ocean_benchmark3.in"
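Once the script above has filled perf_result.txt, the per-benchmark counters can be condensed into one line each. The following is a sketch, not part of the original workflow: the function name `summarize_perf_results` is hypothetical, and it assumes the default `perf stat` text layout (counter value in the first column, event name in the second, thousands separators as commas, and a final "seconds time elapsed" line per command).

```shell
#!/bin/bash
# Hypothetical helper: read a perf_result.txt on stdin and print one CSV row
# per measured command: name,instructions,cycles,ipc,seconds
summarize_perf_results() {
    awk '
        /^Running / {
            # Label written by the script, e.g. "Running 505.mcf_r..."
            name = $0
            sub(/^Running /, "", name)
            sub(/\.\.\.$/, "", name)
        }
        $2 ~ /^cycles/       { cycles = $1; gsub(/,/, "", cycles) }
        $2 ~ /^instructions/ { insns  = $1; gsub(/,/, "", insns) }
        /seconds time elapsed/ {
            ipc = (cycles + 0 > 0) ? insns / cycles : 0
            printf "%s,%s,%s,%.2f,%s\n", name, insns, cycles, ipc, $1
            cycles = 0; insns = 0
        }'
}

# Usage:
#   summarize_perf_results < /data/speccpu2017/perf_result.txt
```

Event names can differ on some machines (for example on hybrid CPUs), in which case the two event patterns need adjusting.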
619.lbm_s run script with perf and automatic averaging (the number of runs is passed to the script)
#/home/yangz//llvm/build/bin/clang -m64 -c -o lbm.o -DSPEC -DNDEBUG -DLARGE_WORKLOAD -O2 -mavx -mllvm --misched-topdown -DSPEC_OPENMP -fopenmp -Wno-return-type -DUSE_OPENMP -I /usr/lib/gcc/x86_64-redhat-linux/8/include/ -DSPEC_LP64 lbm.c
#/home/yangz//llvm/build/bin/clang -m64 -c -o main.o -DSPEC -DNDEBUG -DLARGE_WORKLOAD -O2 -mavx -mllvm --misched-topdown -DSPEC_OPENMP -fopenmp -Wno-return-type -DUSE_OPENMP -I /usr/lib/gcc/x86_64-redhat-linux/8/include/ -DSPEC_LP64 main.c
#!/bin/bash
# Check arguments
if [ "$#" -ne 1 ]; then
    echo "Usage: $0 <number of runs>"
    exit 1
fi
# Argument: number of runs
NUM_RUNS=$1
# Enter the SPEC CPU run directory (stop if it does not exist)
cd /home/speccpu2017/benchspec/CPU/619.lbm_s/run/run_base_test_mytest-m64.0000 || exit 1
# Initialize accumulators
total_instructions=0
total_cycles=0
total_time=0
# Write the header line
echo "Run Number, Instructions, CPU Cycles, Time Elapsed (seconds)" > perf_results.txt
# Run perf stat repeatedly and accumulate the results
for (( i=1; i<=NUM_RUNS; i++ ))
do
    perf stat -e instructions,cpu-cycles ./lbm_s_base.mytest-m64 20 reference.dat 0 1 200_200_260_ldc.of 0<&- > lbm.out 2>perf_output.tmp
    # Extract instruction count, CPU cycle count, and elapsed time
    instructions=$(grep 'instructions' perf_output.tmp | awk '{print $1}' | tr -d ',')
    cpu_cycles=$(grep 'cpu-cycles' perf_output.tmp | awk '{print $1}' | tr -d ',')
    time_elapsed=$(grep 'seconds time elapsed' perf_output.tmp | awk '{print $1}')
    # Accumulate
    total_instructions=$(echo "$total_instructions + $instructions" | bc)
    total_cycles=$(echo "$total_cycles + $cpu_cycles" | bc)
    total_time=$(echo "$total_time + $time_elapsed" | bc)
    # Append the result of this run to the file
    echo "$i, $instructions, $cpu_cycles, $time_elapsed" >> perf_results.txt
done
# Compute the averages
avg_instructions=$(echo "scale=0; $total_instructions / $NUM_RUNS" | bc)
avg_cycles=$(echo "scale=0; $total_cycles / $NUM_RUNS" | bc)
avg_time=$(echo "scale=2; $total_time / $NUM_RUNS" | bc)
# Append the averages to the file
echo "Average, $avg_instructions, $avg_cycles, $avg_time" >> perf_results.txt
# Remove the temporary file
rm -f perf_output.tmp
# Print the final results
cat perf_results.txt
#/home/yangz//llvm/build/bin/clang -m64 -O2 -mavx -mllvm --misched-topdown -z muldefs -DSPEC_OPENMP -fopenmp -Wno-return-type -DUSE_OPENMP -I /usr/lib/gcc/x86_64-redhat-linux/8/include/ lbm.o main.o -lm -fopenmp=libomp -L/usr/lib/gcc/x86_64-redhat-linux/8/include/ -lomp -o lbm_s
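Toolchain usage
runcpu option | Description |
---|---|
--reportable | Produce a reportable SPEC result |
--fakereport | Generate a report without actually building or running anything |
--threads N | Set the number of threads used at run time |
--tune=TUNE[,TUNE…] | Select the tuning level, e.g. base or peak, which affects optimization and run configuration |
--output_format=FORMAT[,…] | Set the output format, e.g. html, pdf, text |
--debug LEVEL | Set the verbosity of debug information |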
Appendix A: References
- https://www.spec.org/cpu2017/Docs/index.html (benchmark documentation)
- https://www.spec.org/cpu2017/Docs/runrules.html (run and reporting rules)
- https://www.spec.org/cpu2017/Docs/config.html#fieldname (config file reference)
- https://www.spec.org/cpu2017/Docs/overview.html#benchmarks (SPEC2017 overview)
Appendix B: Sample configuration file
#------------------------------------------------------------------------------
# SPEC CPU2017 config file for: LLVM / Linux / AMD64
#------------------------------------------------------------------------------
#
# Usage: (1) Copy this to a new name
# cd $SPEC/config
# cp Example-x.cfg myname.cfg
# (2) Change items that are marked 'EDIT' (search for it)
#
# SPEC tested this config file with:
# Compiler version(s): LLVM/3.9.0
# Operating system(s): Linux
# Hardware: AMD64
#
# If your system differs, this config file might not work.
# You might find a better config file at http://www.spec.org/cpu2017/results
#
# Compiler issues: Contact your compiler vendor, not SPEC.
# For SPEC help: http://www.spec.org/cpu2017/Docs/techsupport.html
#------------------------------------------------------------------------------
#--------- Label --------------------------------------------------------------
# Arbitrary string to tag binaries (no spaces allowed)
# Two Suggestions: # (1) EDIT this label as you try new ideas.
%define label mytest # (2) Use a label meaningful to *you*.
#--------- Preprocessor -------------------------------------------------------
%ifndef %{bits} # EDIT to control 32 or 64 bit compilation. Or,
% define bits 64 # you can set it on the command line using:
%endif # 'runcpu --define bits=nn'
%ifndef %{build_ncpus} # EDIT to adjust number of simultaneous compiles.
% define build_ncpus 36 # Or, you can set it on the command line:
%endif # 'runcpu --define build_ncpus=nn'
# Don't change this part.
%define os LINUX
%if %{bits} == 64
% define model -m64
%elif %{bits} == 32
% define model -m32
%else
% error Please define number of bits - see instructions in config file
%endif
%if %{label} =~ m/ /
% error Your label "%{label}" contains spaces. Please try underscores instead.
%endif
%if %{label} !~ m/^[a-zA-Z0-9._-]+$/
% error Illegal character in label "%{label}". Please use only alphanumerics, underscore, hyphen, and period.
%endif
#--------- Global Settings ----------------------------------------------------
# For info, see:
# https://www.spec.org/cpu2017/Docs/config.html#fieldname
# Example: https://www.spec.org/cpu2017/Docs/config.html#tune
#backup_config = 0 # Uncomment for cleaner config/ directory
flagsurl01 = $[top]/config/flags/gcc.xml
flagsurl02 = $[top]/config/flags/clang.xml
ignore_errors = 1
iterations = 1
label = %{label}-m%{bits}
line_width = 1020
log_line_width = 1020
makeflags = --jobs=%{build_ncpus}
mean_anyway = 1
output_format = txt,html,cfg,pdf,csv
preenv = $ENV{'PERF_COMMAND'} = "perf stat -e cycles,instructions,cache-references,cache-misses,branches,branch-misses -o /data/myrepo/perf_output.txt"
tune = base,peak
#--------- How Many CPUs? -----------------------------------------------------
# Both SPECrate and SPECspeed can test multiple chips / cores / hw threads
# - For SPECrate, you set the number of copies.
# - For SPECspeed, you set the number of threads.
# See: https://www.spec.org/cpu2017/Docs/system-requirements.html#MultipleCPUs
#
# q. How many should I set?
# a. Unknown, you will have to try it and see!
#
# To get you started:
#
# copies - This config file sets 1 copy per core (after you set the
# 'cpucores' variable, just below).
# Please be sure you have enough memory; if you do not, you might
# need to run a smaller number of copies. See:
# https://www.spec.org/cpu2017/Docs/system-requirements.html#memory
#
# threads - This config file sets a starting point. You can try adjusting it.
# Higher thread counts are much more likely to be useful for
# fpspeed than for intspeed.
#
#
# To do so, please adjust these; also adjust the 'numactl' lines, below
# EDIT to define system sizes
%define cpucores 20 # number of physical cores
%define cputhreads 40 # number of logical cores
%define numanodes 2 # number of NUMA nodes for affinity
intrate,fprate:
copies = 1 #%{cpucores}
intspeed,fpspeed:
threads = 1 #%{cputhreads}
#-------- CPU binding for rate -----------------------------------------------
# When you run multiple copies for SPECrate mode, performance
# is improved if you bind the copies to specific processors. EDIT the numactl stuff below.
intrate,fprate:
submit = echo "$command" > run.sh ; $BIND bash run.sh
# Affinity settings: EDIT this section
# Please adjust these values for your
# particular system as these settings are
# for an 8 core, one NUMA node (-m 0) system.
bind0 = numactl -m 0 --physcpubind=0
# NUMA node 0
bind0 = numactl -m 0 --physcpubind=0
bind1 = numactl -m 0 --physcpubind=1
bind2 = numactl -m 0 --physcpubind=2
bind3 = numactl -m 0 --physcpubind=3
bind4 = numactl -m 0 --physcpubind=4
bind5 = numactl -m 0 --physcpubind=5
bind6 = numactl -m 0 --physcpubind=6
bind7 = numactl -m 0 --physcpubind=7
bind8 = numactl -m 0 --physcpubind=8
bind9 = numactl -m 0 --physcpubind=9
bind10 = numactl -m 0 --physcpubind=20
bind11 = numactl -m 0 --physcpubind=21
bind12 = numactl -m 0 --physcpubind=22
bind13 = numactl -m 0 --physcpubind=23
bind14 = numactl -m 0 --physcpubind=24
bind15 = numactl -m 0 --physcpubind=25
bind16 = numactl -m 0 --physcpubind=26
bind17 = numactl -m 0 --physcpubind=27
bind18 = numactl -m 0 --physcpubind=28
bind19 = numactl -m 0 --physcpubind=29
# NUMA node 1
bind20 = numactl -m 1 --physcpubind=10
bind21 = numactl -m 1 --physcpubind=11
bind22 = numactl -m 1 --physcpubind=12
bind23 = numactl -m 1 --physcpubind=13
bind24 = numactl -m 1 --physcpubind=14
bind25 = numactl -m 1 --physcpubind=15
bind26 = numactl -m 1 --physcpubind=16
bind27 = numactl -m 1 --physcpubind=17
bind28 = numactl -m 1 --physcpubind=18
bind29 = numactl -m 1 --physcpubind=19
bind30 = numactl -m 1 --physcpubind=30
bind31 = numactl -m 1 --physcpubind=31
bind32 = numactl -m 1 --physcpubind=32
bind33 = numactl -m 1 --physcpubind=33
bind34 = numactl -m 1 --physcpubind=34
bind35 = numactl -m 1 --physcpubind=35
bind36 = numactl -m 1 --physcpubind=36
bind37 = numactl -m 1 --physcpubind=37
bind38 = numactl -m 1 --physcpubind=38
bind39 = numactl -m 1 --physcpubind=39
#------- Compilers ------------------------------------------------------------
default:
# EDIT paths to LLVM and libraries:
BASE_DIR = /data/yangz
# LLVM_PATH specifies the directory path containing required LLVM files and
# potentially multiple LLVM versions.
LLVM_PATH = $[BASE_DIR]/llvm-project
# LLVM_ROOT_PATH specifies the directory path to the LLVM version to be
# used. EDIT: Change llvm-v390 to appropriate directory name.
LLVM_ROOT_PATH = $[LLVM_PATH]/build
LLVM_BIN_PATH = $[LLVM_ROOT_PATH]/bin
LLVM_LIB_PATH = $[LLVM_ROOT_PATH]/lib
LLVM_INCLUDE_PATH = $[LLVM_ROOT_PATH]/include
DRAGONEGG_PATH = #$[LLVM_PATH]/dragonegg
DRAGONEGG_SPECS = #$[DRAGONEGG_PATH]/integrated-as.specs
# DragonEgg version 3.5.2 requires GCC version 4.8.2.
# EDIT LLVM_GCC_DIR to reflect the GCC path.
LLVM_GCC_DIR = /home/yangz/llvm/build/bin
GFORTRAN_DIR = /usr/bin
# Specify Intel OpenMP library path.
OPENMP_DIR = /usr/lib/gcc/x86_64-redhat-linux/8/include/
preENV_PATH = $[LLVM_BIN_PATH]:%{ENV_PATH}
CC = $(LLVM_BIN_PATH)/clang %{model}
CXX = $(LLVM_BIN_PATH)/clang++ %{model}
FORTRAN_COMP = $(GFORTRAN_DIR)/gfortran
FC = $(FORTRAN_COMP) %{model}
CLD = $(LLVM_BIN_PATH)/clang %{model}
FLD = $(FORTRAN_COMP) %{model}
# How to say "Show me your version, please"
CC_VERSION_OPTION = -v
CXX_VERSION_OPTION = -v
FC_VERSION_OPTION = -v
default:
%if %{bits} == 64
sw_base_ptrsize = 64-bit
sw_peak_ptrsize = 64-bit
%else
sw_base_ptrsize = 32-bit
sw_peak_ptrsize = 32-bit
%endif
intrate,intspeed: # 502.gcc_r and 602.gcc_s may need the
%if %{bits} == 32 # flags from this section. For 'base',
EXTRA_COPTIMIZE = -fgnu89-inline # all benchmarks must use the same
%else # options, so we add them to all of
LDCFLAGS = -z muldefs # integer rate and integer speed. See:
%endif # www.spec.org/cpu2017/Docs/benchmarks/502.gcc_r.html
#--------- Portability --------------------------------------------------------
default:# data model applies to all benchmarks
%if %{bits} == 32
# Strongly recommended because at run-time, operations using modern file
# systems may fail spectacularly and frequently (or, worse, quietly and
# randomly) if a program does not accommodate 64-bit metadata.
EXTRA_PORTABILITY = -D_FILE_OFFSET_BITS=64
%else
EXTRA_PORTABILITY = -DSPEC_LP64
%endif
# Benchmark-specific portability (ordered by last 2 digits of bmark number)
500.perlbench_r,600.perlbench_s: #lang='C'
%if %{bits} == 32
% define suffix IA32
%else
% define suffix X64
%endif
PORTABILITY = -DSPEC_%{os}_%{suffix}
521.wrf_r,621.wrf_s: #lang='F,C'
CPORTABILITY = -DSPEC_CASE_FLAG
FPORTABILITY = -fconvert=big-endian
523.xalancbmk_r,623.xalancbmk_s: #lang='CXX'
PORTABILITY = -DSPEC_%{os}
526.blender_r: #lang='CXX,C'
CPORTABILITY = -funsigned-char
CXXPORTABILITY = -D__BOOL_DEFINED
527.cam4_r,627.cam4_s: #lang='F,C'
PORTABILITY = -DSPEC_CASE_FLAG
628.pop2_s: #lang='F,C'
CPORTABILITY = -DSPEC_CASE_FLAG
FPORTABILITY = -fconvert=big-endian
#-------- Baseline Tuning Flags ----------------------------------------------
default=base:
strict_rundir_verify = 0
COPTIMIZE = -O2 -mavx -mllvm --regalloc=basic
CXXOPTIMIZE = -O2 -mavx -std=c++14
FOPTIMIZE = -O2 -mavx -funroll-loops
#EXTRA_FFLAGS = -fplugin=$(DRAGONEGG_PATH)/dragonegg.so -Wno-register -std=c++14
EXTRA_FFLAGS = -Wno-register -std=c++14
EXTRA_FLIBS = -L/usr/lib64 -lgfortran -lm
LDOPTIMIZE = -z muldefs -no-pie
intrate,fprate:
preENV_LIBRARY_PATH = $[LLVM_LIB_PATH]
preENV_LD_LIBRARY_PATH = $[LLVM_LIB_PATH]
#preENV_LIBRARY_PATH = $[LLVM_LIB_PATH]:%{ENV_LIBRARY_PATH}
#preENV_LD_LIBRARY_PATH = $[LLVM_LIB_PATH]:%{ENV_LD_LIBRARY_PATH}
#
# Speed (OpenMP and Autopar allowed)
#
%if %{bits} == 32
intspeed,fpspeed:
#
# Many of the speed benchmarks (6nn.benchmark_s) do not fit in 32 bits
# If you wish to run SPECint2017_speed or SPECfp2017_speed, please use
#
# runcpu --define bits=64
#
fail_build = 1
%else
intspeed,fpspeed:
OPENMP_LIB_PATH = $[OPENMP_DIR]
EXTRA_OPTIMIZE = -DSPEC_OPENMP -fopenmp -Wno-return-type -DUSE_OPENMP -I $(OPENMP_DIR)
EXTRA_LIBS = -fopenmp -L$(OPENMP_LIB_PATH) -lomp
EXTRA_FLIBS = -fopenmp -L/usr/lib64 -lgfortran -lm
preENV_LIBRARY_PATH = $[LLVM_LIB_PATH]:$[OPENMP_LIB_PATH]
preENV_LD_LIBRARY_PATH = $[LLVM_LIB_PATH]:$[OPENMP_LIB_PATH]
#preENV_LIBRARY_PATH = $[LLVM_LIB_PATH]:$[OPENMP_LIB_PATH]:%{ENV_LIBRARY_PATH}
#preENV_LD_LIBRARY_PATH = $[LLVM_LIB_PATH]:$[OPENMP_LIB_PATH]:%{ENV_LD_LIBRARY_PATH}
preENV_OMP_THREAD_LIMIT = %{cputhreads}
preENV_OMP_STACKSIZE = 128M
preENV_GOMP_CPU_AFFINITY = 0-%{cputhreads}
%endif
#-------- Peak Tuning Flags ----------------------------------------------
default=peak:
strict_rundir_verify = 0
COPTIMIZE = -Ofast -mavx
CXXOPTIMIZE = -O3 -mavx -std=c++14
EXTRA_FLIBS = -lgfortran -lm
FOPTIMIZE = -Ofast -mavx -funroll-loops -fno-stack-arrays
#EXTRA_FFLAGS = -fplugin=$(DRAGONEGG_PATH)/dragonegg.so -Wno-register -std=c++14
EXTRA_FFLAGS = -Wno-register -std=c++14
502.gcc_r,602.gcc_s=peak: #lang='C'
LDOPTIMIZE = -z muldefs
521.wrf_r,621.wrf_s=peak: #lang='F,C'
COPTIMIZE = -O3 -freciprocal-math -ffp-contract=fast -mavx
EXTRA_FLIBS = -lgfortran -lm
FOPTIMIZE = -O3 -freciprocal-math -ffp-contract=fast -mavx -funroll-loops
#EXTRA_FFLAGS = -fplugin=$(DRAGONEGG_PATH)/dragonegg.so
523.xalancbmk_r,623.xalancbmk_s=peak: #lang='CXX'
CXXOPTIMIZE = -O3 -mavx
#------------------------------------------------------------------------------
# Tester and System Descriptions - EDIT all sections below this point
#------------------------------------------------------------------------------
# For info about any field, see
# https://www.spec.org/cpu2017/Docs/config.html#fieldname
# Example: https://www.spec.org/cpu2017/Docs/config.html#hw_memory
#-------------------------------------------------------------------------------
#--------- EDIT to match your version -----------------------------------------
default:
sw_compiler001 = C/C++: Version 17.0.6 of Clang, the
sw_compiler002 = LLVM Compiler Infrastructure
sw_compiler003 = Fortran: Version 8.5.0 of GCC, the
sw_compiler004 = GNU Compiler Collection
sw_compiler005 = DragonEgg: Version 3.5.2, the
sw_compiler006 = LLVM Compiler Infrastructure
#--------- EDIT info about you ------------------------------------------------
# To understand the difference between hw_vendor/sponsor/tester, see:
# https://www.spec.org/cpu2017/Docs/config.html#test_sponsor
intrate,intspeed,fprate,fpspeed: # Important: keep this line
hw_vendor = YLQH
tester = My Corporation
test_sponsor = My Corporation
license_num = nnn (Your SPEC license number)
# prepared_by = # Ima Pseudonym # Whatever you like: is never output
#--------- EDIT system availability dates -------------------------------------
intrate,intspeed,fprate,fpspeed: # Important: keep this line
# Example # Brief info about field
hw_avail = # Nov-2099 # Date of LAST hardware component to ship
sw_avail = # Nov-2099 # Date of LAST software component to ship
#--------- EDIT system information --------------------------------------------
intrate,intspeed,fprate,fpspeed: # Important: keep this line
# Example # Brief info about field
# hw_cpu_name = # Intel Xeon E9-9999 v9 # chip name
hw_cpu_nominal_mhz = # 9999 # Nominal chip frequency, in MHz
hw_cpu_max_mhz = # 9999 # Max chip frequency, in MHz
# hw_disk = # 9 x 9 TB SATA III 9999 RPM # Size, type, other perf-relevant info
hw_model = # TurboBlaster 3000 # system model name
# hw_nchips = # 99 # number chips enabled
hw_ncores = # 9999 # number cores enabled
hw_ncpuorder = # 1-9 chips # Ordering options
hw_nthreadspercore = # 9 # number threads enabled per core
hw_other = # TurboNUMA Router 10 Gb # Other perf-relevant hw, or "None"
# hw_memory001 = # 999 GB (99 x 9 GB 2Rx4 PC4-2133P-R, # The 'PCn-etc' is from the JEDEC
# hw_memory002 = # running at 1600 MHz) # label on the DIMM.
hw_pcache = # 99 KB I + 99 KB D on chip per core # Primary cache size, type, location
hw_scache = # 99 KB I+D on chip per 9 cores # Second cache or "None"
hw_tcache = # 9 MB I+D on chip per chip # Third cache or "None"
hw_ocache = # 9 GB I+D off chip per system board # Other cache or "None"
fw_bios = # American Megatrends 39030100 02/29/2016 # Firmware information
# sw_file = # ext99 # File system
# sw_os001 = # Linux Sailboat # Operating system
# sw_os002 = # Distribution 7.2 SP1 # and version
sw_other = # TurboHeap Library V8.1 # Other perf-relevant sw, or "None"
# sw_state = # Run level 99 # Software state.
# Note: Some commented-out fields above are automatically set to preliminary
# values by sysinfo
# https://www.spec.org/cpu2017/Docs/config.html#sysinfo
# Uncomment lines for which you already know a better answer than sysinfo
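The `bindN` lines in the affinity section of the config above follow a regular pattern (one line per logical CPU, grouped by NUMA node), so they can be generated rather than typed. A sketch, assuming the `cpu,node` pairs printed by util-linux `lscpu -p=CPU,NODE` (comment lines starting with `#`, then one pair per line); the function name `gen_bind_lines` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read "cpu,node" pairs on stdin and print one
# "bindN = numactl -m NODE --physcpubind=CPU" line per logical CPU,
# ordered by NUMA node first and CPU number second.
gen_bind_lines() {
    grep -v '^#' |
        sort -t, -k2,2n -k1,1n |
        awk -F, '{ printf "bind%d = numactl -m %d --physcpubind=%d\n", NR - 1, $2, $1 }'
}

# On the test machine:
#   lscpu -p=CPU,NODE | gen_bind_lines
```

The output is meant to be pasted into the "Affinity settings" section; whether memory should be bound to the same node as the CPU (`-m`) is a choice the tester still has to confirm for the system at hand.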
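Appendix C: Q&A
How long does a run take?
Metric | Config tested | Individual benchmarks | Full run (reportable) |
---|---|---|---|
SPECrate 2017 Integer | 1 copy | 6 to 10 minutes | 2.5 hours |
SPECrate 2017 Floating Point | 1 copy | 5 to 36 minutes | 4.8 hours |
SPECspeed 2017 Integer | 4 threads | 6 to 15 minutes | 3.1 hours |
SPECspeed 2017 Floating Point | 16 threads | 6 to 75 minutes | 4.7 hours |
An arbitrary example using a 2016 system: 2 iterations, base only, no peak. Compile time is not included.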
Overview of the suites
Type | Suite | Contents | Metrics | How many copies? What do higher scores mean? |
---|---|---|---|---|
intspeed | SPECspeed® 2017 Integer | 10 integer benchmarks | SPECspeed®2017_int_base, SPECspeed®2017_int_peak, SPECspeed®2017_int_energy_base, SPECspeed®2017_int_energy_peak | SPECspeed suites always run one copy of each benchmark. Higher scores mean less time is needed. |
fpspeed | SPECspeed® 2017 Floating Point | 10 floating point benchmarks | SPECspeed®2017_fp_base, SPECspeed®2017_fp_peak, SPECspeed®2017_fp_energy_base, SPECspeed®2017_fp_energy_peak | Same as intspeed. |
intrate | SPECrate® 2017 Integer | 10 integer benchmarks | SPECrate®2017_int_base, SPECrate®2017_int_peak, SPECrate®2017_int_energy_base, SPECrate®2017_int_energy_peak | SPECrate suites run multiple concurrent copies of each benchmark; the tester chooses how many. Higher scores mean more throughput (work per unit of time). |
fprate | SPECrate® 2017 Floating Point | 13 floating point benchmarks | SPECrate®2017_fp_base, SPECrate®2017_fp_peak, SPECrate®2017_fp_energy_base, SPECrate®2017_fp_energy_peak | Same as intrate. |
Benchmark overview
SPECrate®2017 Integer | SPECspeed®2017 Integer | Language[1] | KLOC[2] | Application Area |
---|---|---|---|---|
500.perlbench_r | 600.perlbench_s | C | 362 | Perl interpreter |
502.gcc_r | 602.gcc_s | C | 1,304 | GNU C compiler |
505.mcf_r | 605.mcf_s | C | 3 | Route planning |
520.omnetpp_r | 620.omnetpp_s | C++ | 134 | Discrete Event simulation - computer network |
523.xalancbmk_r | 623.xalancbmk_s | C++ | 520 | XML to HTML conversion via XSLT |
525.x264_r | 625.x264_s | C | 96 | Video compression |
531.deepsjeng_r | 631.deepsjeng_s | C++ | 10 | Artificial Intelligence: alpha-beta tree search (Chess) |
541.leela_r | 641.leela_s | C++ | 21 | Artificial Intelligence: Monte Carlo tree search (Go) |
548.exchange2_r | 648.exchange2_s | Fortran | 1 | Artificial Intelligence: recursive solution generator (Sudoku) |
557.xz_r | 657.xz_s | C | 33 | General data compression |
SPECrate®2017 FP | SPECspeed®2017 FP | Language[1] | KLOC[2] | Application Area |
---|---|---|---|---|
503.bwaves_r | 603.bwaves_s | Fortran | 1 | Explosion modeling |
507.cactuBSSN_r | 607.cactuBSSN_s | C++, C, Fortran | 257 | Physics: relativity |
508.namd_r | | C++ | 8 | Molecular dynamics |
510.parest_r | | C++ | 427 | Biomedical imaging: optical tomography with finite elements |
511.povray_r | | C++, C | 170 | Ray tracing |
519.lbm_r | 619.lbm_s | C | 1 | Fluid dynamics |
521.wrf_r | 621.wrf_s | Fortran, C | 991 | Weather forecasting |
526.blender_r | | C++, C | 1,577 | 3D rendering and animation |
527.cam4_r | 627.cam4_s | Fortran, C | 407 | Atmosphere modeling |
 | 628.pop2_s | Fortran, C | 338 | Wide-scale ocean modeling (climate level) |
538.imagick_r | 638.imagick_s | C | 259 | Image manipulation |
544.nab_r | 644.nab_s | C | 24 | Molecular dynamics |
549.fotonik3d_r | 649.fotonik3d_s | Fortran | 14 | Computational Electromagnetics |
554.roms_r | 654.roms_s | Fortran | 210 | Regional ocean modeling |
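- [1] For multi-language benchmarks, the first language listed determines the library and link options.
- [2] KLOC = number of lines (including comments and whitespace) in the source files used in a build, divided by 1000, i.e. the code size.
What is the difference between 5nn.benchmark and 6nn.benchmark in the source tree?
- 5nn.benchmark_r is the SPECrate version and 6nn.benchmark_s is the SPECspeed version.
- Differences include workload size, compile flags, and run rules.
- In detail:
① The workloads usually differ. For SPECrate you can compile for 32 or 64 bits; for SPECspeed you usually need 64 bits (-m64).
② OpenMP directives are never used when building SPECrate benchmarks; they are optional for all SPECspeed 2017 floating point benchmarks and for one SPECspeed 2017 integer benchmark, 657.xz_s.
③ For SPECrate, compiler parallelization is forbidden, including both OpenMP and compiler auto-parallelization.
④ Other: for some benchmark pairs, the compile flags enable different source code.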
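SPECspeed and SPECrate metrics
- Time: for example, the seconds needed to complete a workload (SPECspeed).
- Throughput: work completed per unit of time, for example jobs per hour (SPECrate). [See the official site for details]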
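To make the two metrics concrete: SPEC computes a per-benchmark ratio as reference time divided by measured time (multiplied by the number of copies for SPECrate), and the suite score is the geometric mean of those ratios. The sketch below illustrates the arithmetic only; the function names `spec_ratio` and `geomean` and all numbers in the usage lines are made up for illustration, and real scores come from the runcpu reports.

```shell
#!/bin/bash
# ratio = copies * reference_seconds / measured_seconds
# Pass copies only for SPECrate; it defaults to 1 (SPECspeed).
spec_ratio() {
    local ref=$1 run=$2 copies=${3:-1}
    awk -v ref="$ref" -v run="$run" -v n="$copies" \
        'BEGIN { printf "%.2f\n", n * ref / run }'
}

# Geometric mean of the ratios passed as arguments.
geomean() {
    printf '%s\n' "$@" |
        awk '{ s += log($1); n++ } END { printf "%.2f\n", exp(s / n) }'
}

# Illustrative values only:
#   spec_ratio 1000 250      # SPECspeed-style ratio
#   spec_ratio 1000 250 2    # SPECrate-style ratio with 2 copies
#   geomean 2 8              # suite score from two ratios
```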
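Should I use speed or rate?
- A single user running a variety of general desktop programs may be interested in SPECspeed2017_int_base.
- A group of scientists running custom modeling programs may be interested in SPECrate2017_fp_peak.
What is new in SPEC2017 compared with SPEC2006
[See the official site]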