
The input files contain:

File 1

    SRR513804.1218581HWI-ST695_116193610:4:1307:17513:49120 SRR513804.16872HWI ST695_116193610:4:1101:7150:72196 SRR513804.2106179HWI-
    ST695_116193610:4:2206:10596:165949 SRR513804.1710546HWI-ST695_116193610:4:2107:13906:128004 SRR513804.544253

File 2

    >SRR513804.1218581HWI-ST695_116193610:4:1307:17513:49120
    TTTTGTTTTTTCTATATTTGAAAAAGAAATATGAAAACTTCATTTATATTTTCCACAAAGAATGATTCAGCATCCTTCAAAGAAATTCAATATGTATAAAACGGTAATTCTAAATTTTATACATATTGAATTTCTTTGAAGGATGCTGAATCATTCTTTGTGGAAAATATAAATGAAGTTTTCATATTTCTTTTTCAAAT
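To parse the first file, I do this:

    awk '{ s = NF; center = $1 } { printf "%s\t %d\n", center, s }' file1

To parse the second file, I do this:

    awk '/^>/ { if (count != "") printf "%s\t %d\n", seq_ID, count; count = 0; seq_ID = $0; next }
         NF   { long = length(); count = count + long }
         END  { if (count != "") printf "%s\t %d\n", seq_ID, count }' file2

My temporary solution is to write a temporary file and overwrite it in the second step. Is there a more "elegant" way to get this output?

Solution

I am not entirely clear on the requirement; if you can update the question, we can help improve the answer. From what I gather, you want to combine the output for both files into a single summary. I am assuming that the records in the two files appear in the same order. If they do not, we would have to add extra checks when printing the summary.

Contents of script.awk (reusing most of your existing code):

    # First file: remember the field count and the first field of every line.
    NR == FNR {
        s[NR] = NF
        center[NR] = $1
        next
    }
    # Second file: a header line starts a new sequence.
    /^>/ {
        seq_ID[++y] = $0
        ++i
        next
    }
    # Second file: add the length of each non-empty sequence line.
    NF {
        long[i] += length()
    }
    # length() on an array is an extension (gawk and some other awks).
    END {
        for (x = 1; x <= length(s); x++) {
            printf "%s\t %d\t %d\n", center[x], s[x], long[x]
        }
    }

Test:

    $ cat file1
    SRR513804.1218581HWI-ST695_116193610:4:1307:17513:49120 SRR513804.16872HWI ST695_116193610:4:1101:7150:72196 SRR513804.2106179HWI-
    ST695_116193610:4:2206:10596:165949 SRR513804.1710546HWI-ST695_116193610:4:2107:13906:128004 SRR513804.544253
    $ cat file2
    >SRR513804.1218581HWI-ST695_116193610:4:1307:17513:49120
    TTTTGTTTTTTCTATATTTGAAAAAGAAATATGAAAACTTCATTTATATTTTCCACAAAGAATGATTCAGCATCCTTCAAAGAAATTCAATATGTATAAAACGGTAATTCTAAATTTTATACATATTGAATTTCTTTGAAGGATGCTGAATCATTCTTTGTGGAAAATATAAATGAAGTTTTCATATTTCTTTTTCAAAT
    $ awk -f script.awk file1 file2
    SRR513804.1218581HWI-ST695_116193610:4:1307:17513:49120 4 200
    ST695_116193610:4:2206:10596:165949 3 0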
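The answer relies on both files listing their records in the same order. If that cannot be guaranteed, one possible variation is to match records by ID rather than by position. The following is only a sketch under stated assumptions: the file names `clusters.txt` and `seqs.fa` and the IDs `readA` and `readB` are made up for illustration, the first field of each line in the first file is assumed to equal a header in the second file without its leading `>`, and the second file is assumed to be non-empty (otherwise the `NR == FNR` test would also match the cluster list).

```shell
#!/bin/sh
# Hypothetical sample data, deliberately listed in a different order in each file.
tmp=$(mktemp -d)
printf '%s\n' 'readB x y' 'readA x' > "$tmp/clusters.txt"
printf '%s\n' '>readA' 'ACGT' 'AC' '>readB' 'ACGTACGT' > "$tmp/seqs.fa"

result=$(awk '
    # First file on the command line (the FASTA file): total length per ID.
    NR == FNR {
        if ($0 ~ /^>/)  id = substr($1, 2)      # drop the leading ">"
        else if (NF)    len[id] += length($0)   # add up sequence lines
        next
    }
    # Second file (the cluster list): look the first field up by ID.
    # An ID with no sequence prints 0, because the empty entry is formatted with %d.
    { printf "%s\t%d\t%d\n", $1, NF, len[$1] }
' "$tmp/seqs.fa" "$tmp/clusters.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With this sample data the script prints `readB`, 3, 8 on the first line and `readA`, 2, 6 on the second, separated by tabs. Note that the file order on the command line is reversed compared with script.awk: the sequences are read first so that the lengths are already known when the cluster lines are printed.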