A handy tool for parallelism on the shell command line: parallel

Overview

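GNU parallel is a shell tool for executing jobs in parallel on one or more computers. A job can be a single command or a small script that has to be run for each line of the input. Typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe; GNU parallel can then split the input and pipe the pieces into the commands in parallel.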

Basic syntax

If you already know xargs, this should be quick to pick up.

1. Generate five files with redirected output

seq 5 | parallel seq {} '>' example.{}
# For comparison, the equivalent for loop:
# for i in `seq 5`; do seq "$i" > example-for.$i; done
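For readers coming from xargs, the same five files can be produced as in the sketch below. The scratch directory and the example-xargs prefix are assumptions made for illustration; they are not part of the original example.

```shell
#!/bin/sh
# Work in a throwaway directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# -I{} substitutes each input line for {}. The sh -c wrapper is needed
# because the redirection has to happen inside the spawned command.
# (Only safe here because the input lines are plain numbers.)
seq 5 | xargs -I{} sh -c 'seq {} > example-xargs.{}'

# File N now holds the numbers 1..N, one per line.
wc -l example-xargs.*
```

Unlike xargs, parallel only needs the '>' quoted: it runs each composed command line through a shell by itself.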

2. Input to parallel

Everything after ::: is taken as input from the command line.

  • parallel echo ::: 1 2 3 4 5

Output

1
2
3
4
5

The input can also be file names (for example, parallel wc ::: example.*):

1 1 2 example.1
2 2 4 example.2
3 3 6 example.3
4 4 8 example.4
5 5 10 example.5

How to read the default wc output

wc example.3
3 3 6 example.3
# lines  words  bytes  file name

  • parallel echo ::: S M L ::: Green Red

With several ::: inputs, the output is every combination of them.

S Green
S Red
M Green
M Red
L Green
L Red
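
What the two ::: sources produce can be mirrored with a nested loop. This is only a sequential sketch of how the arguments are paired (the function name combos is made up here); parallel runs the six jobs concurrently and, unless --keep-order is given, may print them in a different order.

```shell
#!/bin/sh
# Plain-shell equivalent of: parallel echo ::: S M L ::: Green Red
# The first input source varies slowest, the last one fastest.
combos() {
    for size in S M L; do
        for color in Green Red; do
            echo "$size $color"
        done
    done
}

combos
```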

parallel can also read its input from standard input (for example, ls example.* | parallel echo File):

File example.1
File example.2
File example.3
File example.4
File example.5

3. Combining with command lines

# parallel echo counting lines';' wc -l ::: example.*
counting lines
1 example.1
counting lines
2 example.2
counting lines
3 example.3
counting lines
4 example.4
counting lines
5 example.5

Use {} as the replacement string. Looks a lot like xargs, doesn't it?

# parallel echo test {}';' wc -l {} ::: example.*
test example.1
1 example.1
test example.2
2 example.2
test example.4
4 example.4
test example.3
3 example.3
test example.5
5 example.5

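When there are several input sources, refer to them as {1}, {2}, and so on.

For example, to count the lines and the bytes of each example.* file separately: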

# parallel echo count {1} in {2}';' wc {1} {2} ::: -l -c ::: example.*
count -l in example.1
1 example.1
count -l in example.2
2 example.2
count -l in example.3
3 example.3
count -l in example.4
4 example.4
count -l in example.5
5 example.5
count -c in example.1
2 example.1
count -c in example.2
4 example.2
count -c in example.3
6 example.3
count -c in example.4
8 example.4
count -c in example.5
10 example.5

Dry-run test

# parallel --dry-run echo count {1} {2} ';' wc {1} {2} ::: -c -l ::: example.*
# Note that the output is no longer in input order
echo count -c example.1 ; wc -c example.1
echo count -c example.2 ; wc -c example.2
echo count -c example.3 ; wc -c example.3
echo count -c example.5 ; wc -c example.5
echo count -c example.4 ; wc -c example.4
echo count -l example.1 ; wc -l example.1
echo count -l example.2 ; wc -l example.2
echo count -l example.3 ; wc -l example.3
echo count -l example.4 ; wc -l example.4

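4. Output

5. Number of parallel jobs

Since the jobs run in parallel, how many should run at once?

The default is the number of CPU cores on your machine. To keep parallel from taking all the CPU resources, it is advisable to limit concurrency with --jobs; passing the value in as a script parameter is a common approach.

--jobs 0 runs as many jobs in parallel as possible.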
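
One way to make the job count a script parameter is to read it from an environment variable and fall back to the core count. The sketch below is an assumption about how that could look: pick_jobs is a hypothetical helper, and getconf _NPROCESSORS_ONLN is used as one common way to query the number of online cores.

```shell
#!/bin/sh
# Print the job count: $JOBS if it is set, otherwise the number of online cores.
pick_jobs() {
    cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    echo "${JOBS:-$cores}"
}

jobs_wanted=$(pick_jobs)
# The value would then be handed to parallel through --jobs.
echo "would run: parallel --jobs $jobs_wanted ..."
```

Calling the script as JOBS=3 ./script.sh (script name hypothetical) would then cap the run at three concurrent jobs.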

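Test

# With one job slot the jobs run back to back, so in theory this takes 5+4+3+1+2 = 15 s
time parallel --jobs 1 sleep {}';' echo {} done ::: 5 4 3 1 2
# With --jobs 0 the total time is set by the slowest sleep
time parallel --jobs 0 sleep {}';' echo {} done ::: 5 4 3 1 2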

With five job slots, each job gets a slot of its own:

Job slot 1: 55555
Job slot 2: 4444
Job slot 3: 333
Job slot 4: 1
Job slot 5: 22
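
The expected wall-clock times can be worked out without actually sleeping. The sketch below simulates a simplified model, assuming each job starts, in input order, on whichever slot becomes free first and that startup overhead is zero; makespan is a hypothetical helper name.

```shell
#!/bin/sh
# makespan SLOTS DURATION...
# Assigns each duration to the slot that is free earliest and prints
# the time at which the last job finishes.
makespan() {
    slots=$1
    shift
    printf '%s\n' "$@" | awk -v slots="$slots" '
        {
            best = 1
            for (i = 2; i <= slots; i++)
                if (free_at[i] + 0 < free_at[best] + 0) best = i
            free_at[best] += $1
            if (free_at[best] > total) total = free_at[best]
        }
        END { print total + 0 }'
}

makespan 1 5 4 3 1 2   # prints 15: one slot, jobs run back to back
makespan 5 5 4 3 1 2   # prints 5: five slots, bounded by the longest sleep
makespan 2 5 4 3 1 2   # prints 8: two slots
```

Real runs take slightly longer than these figures because starting each job has a cost.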

6. Processing text data

With --pipe, the data is passed to the jobs on standard input

#seq 1000000 | parallel --pipe wc
 165668  165668 1048571
 149796  149796 1048572
 149796  149796 1048572
 149796  149796 1048572
 149796  149796 1048572
 149796  149796 1048572
  85352   85352  597465

Each job receives a block of roughly 1 MB.

Each output line shows the number of lines, words, and bytes in one such block.
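
A quick sanity check on the output above: the per-block counts should add up to the whole input. The sketch below copies the block figures from the run shown and compares their sums with wc on the unsplit stream; the variable names are made up for illustration.

```shell
#!/bin/sh
# Per-block "lines words bytes" as printed by wc in the run above.
blocks='165668 165668 1048571
149796 149796 1048572
149796 149796 1048572
149796 149796 1048572
149796 149796 1048572
149796 149796 1048572
85352 85352 597465'

# Sum the line and byte columns across all blocks.
block_totals=$(printf '%s\n' "$blocks" | awk '{l += $1; b += $3} END {print l, b}')

# Lines and bytes of the unsplit input, for comparison.
whole_totals=$(seq 1000000 | wc -lc | awk '{print $1, $2}')

echo "blocks: $block_totals"
echo "whole:  $whole_totals"
```

Both lines report 1000000 lines and 6888896 bytes, so the blocks cover the input exactly once. No block reaches 1048576 bytes, which is consistent with the input being split at line boundaries.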

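In practice: concurrent docker run

Starting Docker containers in parallel to migrate Redis keys speeds the migration up considerably.

The following script gives a feel for what parallel can do: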

#!/bin/bash
# date 2023-02-09 17:57:20
# author ninesun
# desc parallel docker run
set -e
set -o pipefail
# absolute path of this script
SCRIPT="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/$(basename "${BASH_SOURCE[0]}")"
# number of jobs parallel runs at once
JOBS=${JOBS:-5}

ERRORS="$(pwd)/errors"
INFO="$(pwd)/info"

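dockerrun() {
	f=$1
	# remove any leftover container with the same name; ignore "no such container"
	docker rm -f "redis-img-${f}" || true
	# echo ${f}
	# pass the index file ${f} to the migration script and append the output to a
	# shared log; on failure, record the file name so that main can report it
	docker run --name "redis-img-${f}" 10.50.10.185/harbortest/redis-mig:1.2 \
		python3 redisMigrate.py 10.50.10.45 19000 10.50.10.170 7100 "${f}" \
		>> "$INFO/dockerrun.log" || echo "${f}" >> "$ERRORS"
}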
echo
echo

main(){
	# get the indexfile
	IFS=$'\n'
	mapfile -t files < <(find ./ -name "st*.txt.*" -o -name "line*.txt.*" |sed 's|./||'| sort)
	unset IFS
	# docker run all jobs
	echo "Running in parallel with ${JOBS} jobs."
	# start ${JOBS} concurrent invocations of /opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun xxx.txt
	parallel --tag --verbose --ungroup -j"${JOBS}" "$SCRIPT" dockerrun {1} ::: "${files[@]}"

	if [[ ! -f "$ERRORS" ]]; then
		echo "No errors, hooray!"
	else
		echo "[ERROR] Some images did not build correctly, see below." >&2
		echo "These images failed: $(cat "$ERRORS")" >&2
		exit 1
	fi
}

run(){
	args=$*
	f=$1

	if [[ "$f" == "" ]]; then
		main "$args"
	else
		$args
	fi
}

run "$@"
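
The script hands "$SCRIPT" dockerrun {1} to parallel rather than the bare function name because each job is started as a new command, which does not see functions defined in the calling script unless they are exported. The stripped-down sketch below shows only that dispatch idea; demo.sh, greet, and the temporary directory are hypothetical names, and a plain loop stands in for parallel.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# A tiny script that runs main when called without arguments and
# otherwise treats its first argument as the name of a function to call.
cat > "$workdir/demo.sh" <<'EOF'
#!/bin/sh
greet() { echo "processing $1"; }
main()  { echo "main: would start the workers here"; }
if [ "$#" -eq 0 ]; then
    main
else
    "$@"    # first argument is the function name, the rest are its arguments
fi
EOF

# Stand-in for: parallel "$SCRIPT" greet {1} ::: a.txt b.txt
for f in a.txt b.txt; do
    sh "$workdir/demo.sh" greet "$f"
done
```

The same file therefore serves as both the driver (no arguments) and the worker (function name plus arguments).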

Test with 10 concurrent jobs

JOBS=10 ./mig-v2.sh



Running in parallel with 10 jobs.
Academic tradition requires you to cite works you base your article on.
When using programs that use GNU Parallel to process data for publication
please cite:

  O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
  ;login: The USENIX Magazine, February 2011:42-47.

This helps funding further development; AND IT WON'T COST YOU A CENT.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.

To silence the citation notice: run 'parallel --bibtex'.

/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun lineurl.txt.000
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun lineurl.txt.001
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun lineurl.txt.002
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun lineurl.txt.003


lineurl.txt.000
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun lineurl.txt.004
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun lineurl.txt.005


lineurl.txt.001


lineurl.txt.002
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun lineurl.txt.006
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun lineurl.txt.007
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun lineurl.txt.008


lineurl.txt.004


lineurl.txt.003
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun lineurl.txt.009


lineurl.txt.008


lineurl.txt.005


lineurl.txt.007


lineurl.txt.006


lineurl.txt.009
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun lineurl.txt.010


lineurl.txt.010
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun startline.txt.000


startline.txt.000
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun startline.txt.001


startline.txt.001
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun startline.txt.002


startline.txt.002
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun startline.txt.003


startline.txt.003
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun startline.txt.004


startline.txt.004
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun startline.txt.005


startline.txt.005
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun startline.txt.006


startline.txt.006
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun startline.txt.007


startline.txt.007
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun startline.txt.008


startline.txt.008
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun startline.txt.009


startline.txt.009
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun startline.txt.010


startline.txt.010
No errors, hooray!

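Performance under load with 10 concurrent jobs

Environment: a locally started virtual machine with 2 GB of RAM and 2 CPU cores

The migration completed in 10 minutes.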

# bash -x mig-v2.sh
+ set -e
+ set -o pipefail
+++ dirname mig-v2.sh
++ cd .
++ pwd
++ basename mig-v2.sh
+ SCRIPT=/opt/redis-mig/redis_key_mig/mig-v2.sh
+ JOBS=1
++ pwd
+ ERRORS=/opt/redis-mig/redis_key_mig/errors
++ pwd
+ INFO=/opt/redis-mig/redis_key_mig/info
+ echo

+ echo

+ run
+ args=
+ f=
+ [[ '' == '' ]]
+ main ''
+ IFS='
'
+ mapfile -t files
++ find ./ -name 'st*.txt.*' -o -name 'line*.txt.*'
++ sed 's|./||'
++ sort
+ unset IFS
+ echo

+ echo 'Running in parallel with 1 jobs.'
Running in parallel with 1 jobs.
+ parallel --tag --verbose --ungroup -j1 /opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun '{1}' ::: lineurl.txt.000 lineurl.txt.001 lineurl.txt.002 lineurl.txt.003 lineurl.txt.004 lineurl.txt.005 lineurl.txt.006 lineurl.txt.007 lineurl.txt.008 lineurl.txt.009 lineurl.txt.010 startline.txt.000 startline.txt.001 startline.txt.002 startline.txt.003 startline.txt.004 startline.txt.005 startline.txt.006 startline.txt.007 startline.txt.008 startline.txt.009 startline.txt.010
Academic tradition requires you to cite works you base your article on.
When using programs that use GNU Parallel to process data for publication
please cite:

  O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
  ;login: The USENIX Magazine, February 2011:42-47.

This helps funding further development; AND IT WON'T COST YOU A CENT.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.

To silence the citation notice: run 'parallel --bibtex'.

/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun lineurl.txt.000


lineurl.txt.000
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun lineurl.txt.001


lineurl.txt.001
/opt/redis-mig/redis_key_mig/mig-v2.sh dockerrun lineurl.txt.002

References

GNU_Parallel_2018.pdf

https://www.gnu.org/software/parallel/

Original article: https://blog.csdn.net/MyySophia/article/details/128966914
