Q: Add "#" in front of the 8 lines after a matching string

Stack Overflow user

Asked 2015-04-29 01:55:47

4 answers, 35 views, 0 followers, 3 votes

The question is a bit confusing to describe, so I'll just give an example.

Say I have the following:

$ grep -P "locus_tag\tM715_1000193188" Genome.tbl -B1 -A8
193188  193066  gene
            locus_tag   M715_1000193188
193188  193066  mRNA
            product hypothetical protein
            protein_id  gnl|CorradiLab|M715_1000193188
            transcript_id   gnl|CorradiLab|M715_mrna1000193188
193188  193066  CDS
        product hypothetical protein
        protein_id  gnl|CorradiLab|M715_1000193188
        transcript_id   gnl|CorradiLab|M715_mrna1000193188

I want to add "#" to the 8 lines following "locus_tag M715_1000193188", so that the modified file looks like this:

193188  193066  gene
            locus_tag   M715_1000193188
#193188 193066  mRNA
#           product hypothetical protein
#           protein_id  gnl|CorradiLab|M715_1000193188
#           transcript_id   gnl|CorradiLab|M715_mrna1000193188
#193188 193066  CDS
#       product hypothetical protein
#       protein_id  gnl|CorradiLab|M715_1000193188
#       transcript_id   gnl|CorradiLab|M715_mrna1000193188

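Basically, I have a file with about 3000 different locus tags, and for 300 of them I need to comment out the mRNA and CDS features, i.e. the 8 lines after the locus_tag line.

Is it possible to do this with sed? The file contains other kinds of information that must stay unchanged.

Thank you, Adrian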

awk

sed

text-parsing

4 Answers

Stack Overflow user

Answered 2015-04-29 02:01:42

If you can use awk, this should do it:

awk 'f&&f-- {$0="#"$0} /locus_tag/ {f=8} 1' file
193188  193066  gene
            locus_tag   M715_1000193188
#193188  193066  mRNA
#            product hypothetical protein
#            protein_id  gnl|CorradiLab|M715_1000193188
#            transcript_id   gnl|CorradiLab|M715_mrna1000193188
#193188  193066  CDS
#        product hypothetical protein
#        protein_id  gnl|CorradiLab|M715_1000193188
#        transcript_id   gnl|CorradiLab|M715_mrna1000193188
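
Note that the one-liner above arms its countdown at every locus_tag line, not only at the one you picked. Below is a minimal sketch of the same countdown idea restricted to a single tag. It assumes the tag value is the last whitespace-separated field on the locus_tag line. The function name comment_one_tag and the awk variable tag are made up for illustration and are not part of the original answer.

```shell
# Sketch: comment out the 8 lines after one specific locus_tag line.
# comment_one_tag and "tag" are hypothetical names used for illustration.
# Usage: comment_one_tag TAG FILE
comment_one_tag() {
    awk -v tag="$1" '
        # Countdown active: prefix the line with "#" and decrement f.
        f && f-- { $0 = "#" $0 }
        # Only a locus_tag line carrying the requested tag arms the countdown.
        NF > 1 && $(NF-1) == "locus_tag" && $NF == tag { f = 8 }
        # Print every line, modified or not.
        1
    ' "$2"
}
```

For example, comment_one_tag M715_1000193188 Genome.tbl would leave every other locus untouched. The NF > 1 guard avoids evaluating $(NF-1) on empty lines, where the field index would be negative.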

Votes: 3

Stack Overflow user

Answered 2015-04-29 02:08:44

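sed supports range addresses, which can do what you want here.

sed -e '/locus_tag\tM715_1000193188/,+8{/locus_tag/!s/^/#/}' file

The range covers the matching line plus the 8 lines after it, so the inner /locus_tag/! test keeps the locus_tag line itself uncommented. As noted in the comments, the addr,+N range form is specific to GNU sed.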
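
Since the +N range is a GNU extension, here is a rough sketch of one way to get a similar effect using only the n command, which prints the current line and loads the next one. It assumes that the tag is separated from locus_tag by a single tab and ends the line. The helper name comment_after_tag is hypothetical.

```shell
# Sketch: comment out the 8 lines after a matching locus_tag line without
# relying on the GNU-only "addr,+N" range. comment_after_tag is a made-up name.
# Usage: comment_after_tag TAG FILE
comment_after_tag() {
    tag=$1
    file=$2
    tab=$(printf '\t')   # literal tab, since \t in a sed regex is not portable
    # On the matching line: "n" prints it unchanged and reads the next line,
    # then s/^/#/ prefixes that line. Repeating the pair 8 times covers 8 lines.
    sed "/locus_tag${tab}${tag}\$/{
n;s/^/#/
n;s/^/#/
n;s/^/#/
n;s/^/#/
n;s/^/#/
n;s/^/#/
n;s/^/#/
n;s/^/#/
}" "$file"
}
```

If fewer than 8 lines follow the match, sed simply stops at end of input. The tag is interpolated into the regex as-is, so this sketch only suits tags without regex metacharacters, such as the M715_... values in the question.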

Votes: 1

Stack Overflow user

Answered 2015-04-29 02:21:58

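$ cat tst.awk
# Build a lookup set from the space-separated list passed in with -v tags=...
BEGIN { split(tags,tmp); for (i in tmp) tagsA[tmp[i]] }
# While the countdown c is non-zero, comment out the line and decrement c.
c&&c-- { $0 = "#" $0 }
# A locus_tag line whose value is in the set starts an 8-line countdown.
(NF>1) && ($(NF-1) == "locus_tag") && ($NF in tagsA) { c=8 }
{ print }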

$ awk -v tags="M715_1000193188 M715_1000193189 M715_1000193190" -f tst.awk file
193188  193066  gene
            locus_tag   M715_1000193188
#193188  193066  mRNA
#            product hypothetical protein
#            protein_id  gnl|CorradiLab|M715_1000193188
#            transcript_id   gnl|CorradiLab|M715_mrna1000193188
#193188  193066  CDS
#        product hypothetical protein
#        protein_id  gnl|CorradiLab|M715_1000193188
#        transcript_id   gnl|CorradiLab|M715_mrna1000193188

Just list all 300 locus tag values you care about, as in the 3 examples above.
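
With 300 tags, passing them all on the command line gets unwieldy. Here is a minimal sketch of the same approach that reads the tags from a file instead, assuming one tag per line. The function name comment_listed_tags and the array name wanted are illustrative and not from the answer above.

```shell
# Sketch: read the wanted locus tags from a file (one per line) rather than
# from -v tags=... ; comment_listed_tags and "wanted" are hypothetical names.
# Usage: comment_listed_tags TAGS_FILE TBL_FILE
comment_listed_tags() {
    awk '
        # First file: collect each non-empty tag into a lookup set, print nothing.
        FILENAME == ARGV[1] { if (NF) wanted[$1]; next }
        # Countdown active: comment out the line and decrement c.
        c && c-- { $0 = "#" $0 }
        # A locus_tag line whose value is in the set arms an 8-line countdown.
        NF > 1 && $(NF-1) == "locus_tag" && ($NF in wanted) { c = 8 }
        { print }
    ' "$1" "$2"
}
```

Comparing FILENAME against ARGV[1], rather than using the common NR == FNR test, means an empty tags file leaves the table unchanged instead of swallowing it.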

Votes: 0

Original page content provided by Stack Overflow; translation support provided by the Tencent Cloud XiaoWei IT-domain engine.

Original link:

https://stackoverflow.com/questions/29926593
