
How to remove a pattern from multiple files

Stack Overflow user
Asked 2016-10-31 23:21:03
3 answers, 193 views, 0 followers, score 2

Here is my file:

...
</script>

<!--START: Google Analytics --->
<script type="text/javascript"
src="../src/goog/ga_body.js"></script>
<!--END: Google Analytics --->
</body>
</html>
...

How can I delete everything from <!--START: Google Analytics ---> through <!--END: Google Analytics --->, markers included? So, effectively:

<!--START: Google Analytics --->
<script type="text/javascript"
src="../src/goog/ga_body.js"></script>
<!--END: Google Analytics --->

would all be gone, and what remains is:

</script>

    <nothing here 4 lines deleted>

    </body>
    </html>

I'm thinking of doing this in bash, so sed and awk are probably my best options, though Python might be better.

EDIT1

This is something I wrote earlier, though it is probably very poorly coded. I will adapt this find2PatternsAndDeleteTextInBetween.sh:

# Here I want to find 2 patterns and delete what's in between
# this example works


# these are the 2 patterns I want to find: Start and End
# have to use some escape characters here for this to show properly
# have to use \n for it to appear in this format
#<!-- Start of StatCounter Code for DoYourOwnSite -->
#  text would go here 
#<!-- End of StatCounter Code for DoYourOwnSite -->>

#b="<!-- Start of StatCounter Code for DoYourOwnSite -->"

#b2="<!-- End of StatCounter Code for DoYourOwnSite -->"

#p1="PATTERN-1"
#p2="PATTERN-2"
p1="<!-- Start of StatCounter Code for DoYourOwnSite -->"
p2="<!-- End of StatCounter Code for DoYourOwnSite -->"
fname="*.html"
num_of_files_pattern1=ls #grep $p1 fname


echo "fname(s) to apply the sed to:"
echo $fname
echo "num_of_files_pattern1 is:"
echo $num_of_files_pattern1

echo "Pattern1 is equal to:"
echo $p1

echo "Pattern2 is equal to:"
echo $p2

#this is current dir where the script is
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
echo "DIR is equal to:"
echo $DIR

#cd to the dir where I want to copy the files to:
cd "$DIR"

# this will find the pattern <\head> in all the .html files and place "This should appear before the closing head tag" this before it
# it will also make a backup with .bak extension 
#sed -i.bak '/<\\head>/i\This should appear before the closing head tag' *.html

echo "sed on the file"
# this does the head part
#sed '/PATTERN-1/,/PATTERN-2/d' *.txt # this works
#sed "/$p1/,/$p2/d" *.txt # this works
#sed "/$p1/,/$p2/d" $fname # this works 
sed -i.bak "/$p1/,/$p2/d" $fname # this works 

EDIT2

This is what I ended up with, although there is a more robust answer below:

# ------------------------------------------------------------------
# [author] find2PatternsAndDeleteTextInBetween.sh
#           Description
#           Here I want to find 2 patterns and delete what's in between 
#           this example works 
#
# EXAMPLE:
# this is the 2 patterns I want to find Start and End
# <!-- Start of StatCounter Code for DoYourOwnSite -->
#   text would go here 
# <!-- End of StatCounter Code for DoYourOwnSite -->>
#
# ------------------------------------------------------------------
p1="<!--START: Google Analytics --->"
p2="<!--END: Google Analytics --->"
fname=".html"
echo "fname(s) to apply the sed to:"
echo *"$fname"
echo -e "\n"
echo "Pattern1 is equal to:"
echo -e "$p1\n"
echo "Pattern2 is equal to:"
echo -e "$p2\n"
echo -e "PWD is: $PWD\n"
echo "sed on the file"
#sed '/PATTERN-1/,/PATTERN-2/d' *.txt # this works
#sed "/$p1/,/$p2/d" *.txt # this works
#sed "/$p1/,/$p2/d" $fname # this works 
sed -i.bak "/$p1/,/$p2/d" *"$fname" # this works 

3 Answers

Stack Overflow user

Accepted answer

Posted 2016-11-01 01:52:31

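Judging by the script in your question, it sounds like you already know how to use sed to remove the range of interest from a single file (sed -i.bak "/$p1/,/$p2/d" $fname), but are looking for a robust way to process multiple files from a script (assuming bash):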

#!/usr/bin/env bash

# cd to the dir. in which this script is located.
# CAVEAT: Assumes that the script wasn't invoked through a *symlink*
#         located in a different dir.
cd -- "$(dirname -- "$BASH_SOURCE")" || exit

fpattern='*.html'     # specify source-file globbing pattern
shopt -s nullglob     # make sure that globbing expands to nothing if nothing matches
fnames=( $fpattern )  # expand to matching files and store in array 
num_of_files_matching_pattern=${#fnames[@]} # count matching files
(( num_of_files_matching_pattern > 0 )) || exit # abort, if no files match

printf '%s\n%s\n' "Running from:" "$PWD"
printf '%s\n%s\n' "Pattern matching the files to process:" "$fpattern"
printf '%s\n%s\n' "# of matching files:" "$num_of_files_matching_pattern"

# Determine the range-endpoint-identifier-line regular expressions.
# CAVEAT: Make sure you escape any regular-expression metacharacters you want
#         to be treated as *literals*.
p1='^<!--START: Google Analytics --->$'
p2='^<!--END: Google Analytics --->$'

# Remove the range identified by its endpoints from all matching input files
# and save the original files with extension '.bak'
sed -i'.bak' "/$p1/,/$p2/d" "${fnames[@]}" || exit
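
The caveat in the script about escaping regular-expression metacharacters can be handled mechanically. The following is a minimal sketch, not part of the answer above: the helper name escape_bre and the sample file content are assumptions made for illustration. It backslash-escapes the basic-regex metacharacters and the / address delimiter, so the marker text can be pasted in as a literal.

```shell
#!/usr/bin/env bash

# escape_bre (hypothetical helper): backslash-escape the characters that are
# special in a POSIX basic regular expression, plus the '/' address delimiter.
escape_bre() {
  printf '%s\n' "$1" | sed 's|[][\\/.*^$]|\\&|g'
}

# Literal marker lines, escaped so that sed treats every character literally.
p1=$(escape_bre '<!--START: Google Analytics --->')
p2=$(escape_bre '<!--END: Google Analytics --->')

# Made-up sample input, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' \
  '</script>' \
  '<!--START: Google Analytics --->' \
  '<script src="ga_body.js"></script>' \
  '<!--END: Google Analytics --->' \
  '</body>' > "$tmp"

# Anchor both patterns to whole lines and delete the range between them.
result=$(sed "/^$p1\$/,/^$p2\$/d" "$tmp")
rm -f "$tmp"

printf '%s\n' "$result"
```

The markers in this question happen to contain no metacharacters, so escaping leaves them unchanged; the helper only matters for markers that contain characters such as ., * or /.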

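As an aside: I suggest not using the .sh suffix in script filenames:

  • The shebang line in the file is enough to tell the system which shell/interpreter to pass the script to.
  • Without a suffix, you can later change the implementation (to Python, for example) without breaking existing programs that rely on the script.
  • In the case at hand, assuming that using bash is indeed acceptable, .sh could be misleading, because it suggests a script that uses only sh features.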

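To determine the running script's true directory, even when the script is invoked through a symlink located in a different directory:

  • If you can assume a Linux platform (or at least GNU readlink), use: "$(dirname -- "$(readlink -e -- "$BASH_SOURCE")")"
  • Otherwise, a more elaborate solution with a helper function is needed; see this answer of mine.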
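
As a rough idea of what such a helper function might look like, here is a sketch that relies only on plain readlink without GNU-specific options. The name resolve_script_dir is hypothetical, and this is not the helper from the linked answer; it simply follows the symlink chain and prints the physical directory of the final target.

```shell
#!/usr/bin/env bash

# resolve_script_dir (hypothetical helper): follow a chain of symlinks using
# plain `readlink` and print the physical directory of the final target.
resolve_script_dir() {
  local target=$1 dir
  while [ -L "$target" ]; do
    # Physical directory that holds the current link.
    dir=$(cd -P -- "$(dirname -- "$target")" && pwd) || return
    target=$(readlink -- "$target") || return
    # A relative link target is relative to the directory holding the link.
    case $target in
      /*) ;;
      *) target=$dir/$target ;;
    esac
  done
  # Subshell, so the caller's working directory is left untouched.
  (cd -P -- "$(dirname -- "$target")" && pwd)
}

# Demonstration with a throwaway layout: links/run -> ../real/script
demo=$(mktemp -d)
mkdir "$demo/real" "$demo/links"
: > "$demo/real/script"
ln -s ../real/script "$demo/links/run"

resolved=$(resolve_script_dir "$demo/links/run")
expected=$(cd -P -- "$demo/real" && pwd)
rm -rf "$demo"

printf '%s\n' "$resolved"
```

Inside a script, the call would be along the lines of cd -- "$(resolve_script_dir "$BASH_SOURCE")" || exit.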
Score: 1

Stack Overflow user

Posted 2016-11-01 00:45:46

sed is the tool for this task:

$ sed -i'.bak' '/<!--START/,/<!--END/d' file

If other lines carry similar markers, make the patterns longer so that they match only the block you want to remove.
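
To illustrate why the pattern length matters, here is a small sketch with made-up sample content. The "Site Search" block is a hypothetical second marker pair, not something from the question. The short patterns delete both marked blocks, while the longer ones delete only the Google Analytics block.

```shell
#!/usr/bin/env bash

# Made-up page with two marked blocks; "Site Search" is hypothetical.
page=$(mktemp)
printf '%s\n' \
  '<!--START: Site Search --->' \
  '<form action="/search"></form>' \
  '<!--END: Site Search --->' \
  '<!--START: Google Analytics --->' \
  '<script src="ga_body.js"></script>' \
  '<!--END: Google Analytics --->' \
  '</body>' > "$page"

# Short patterns: every START/END block is deleted.
loose=$(sed '/<!--START/,/<!--END/d' "$page")

# Longer patterns: only the Google Analytics block is deleted.
strict=$(sed '/<!--START: Google Analytics/,/<!--END: Google Analytics/d' "$page")
rm -f "$page"

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```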

For multiple files, e.g. file1, ..., file4:

$ for f in file{1..4}; do sed -i'.bak' '/<!--START/,/<!--END/d' "$f"; done 
Score: 2

Stack Overflow user

Posted 2016-11-01 01:11:12

Something to consider:

$ awk '/<!--(START|END): Google Analytics --->/{f=!f;next} !f' file
...
</script>

</body>
</html>
...
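
The awk command above writes to standard output, so applying it to several files takes a little extra plumbing. The following is a sketch under stated assumptions: the function name strip_block, the .bak backup convention (borrowed from the sed answers) and the sample files are illustrative only. Note also that the f=!f toggle assumes START and END markers strictly alternate in each file.

```shell
#!/usr/bin/env bash

# strip_block (hypothetical helper): rewrite each file in place, keeping the
# original as FILE.bak, by filtering the backup through the awk toggle.
strip_block() {
  local file
  for file in "$@"; do
    cp -p -- "$file" "$file.bak" || return
    awk '/<!--(START|END): Google Analytics --->/{f=!f;next} !f' \
      "$file.bak" > "$file" || return
  done
}

# Demonstration on two throwaway files with made-up content.
site=$(mktemp -d)
for name in a.html b.html; do
  printf '%s\n' \
    '</script>' \
    '<!--START: Google Analytics --->' \
    '<script src="ga_body.js"></script>' \
    '<!--END: Google Analytics --->' \
    '</body>' > "$site/$name"
done

strip_block "$site"/*.html

cleaned_a=$(cat "$site/a.html")
cleaned_b=$(cat "$site/b.html")
backup_markers=$(grep -c 'Google Analytics' "$site/a.html.bak")
rm -rf "$site"

printf '%s\n' "$cleaned_a"
```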
Score: 2
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/40351989
