Here is my file:
...
</script>
<!--START: Google Analytics --->
<script type="text/javascript"
src="../src/goog/ga_body.js"></script>
<!--END: Google Analytics --->
</body>
</html>
...
How can I delete everything from <!--START: Google Analytics ---> through <!--END: Google Analytics --->, the two marker lines included? So, effectively:
<!--START: Google Analytics --->
<script type="text/javascript"
src="../src/goog/ga_body.js"></script>
<!--END: Google Analytics --->
would all be gone. What remains is:
</script>
<nothing here 4 lines deleted>
</body>
</html>
I'm thinking of doing this in bash, so sed and awk are probably my best options, although Python might be better.
EDIT1
This is something I wrote previously. It is probably very bad coding, but I am going to work from it: find2PatternsAndDeleteTextInBetween.sh
# Here I want to find 2 patterns and delete what's in between
# this example works
# these are the 2 patterns I want to find, Start and End
#have to use some escape characters here for this to show properly
# have to use \n for it to appear in this format
#<!-- Start of StatCounter Code for DoYourOwnSite -->
# text would go here
#<!-- End of StatCounter Code for DoYourOwnSite -->>
#b="<!-- Start of StatCounter Code for DoYourOwnSite -->"
#b2="<!-- End of StatCounter Code for DoYourOwnSite -->"
#p1="PATTERN-1"
#p2="PATTERN-2"
p1="<!-- Start of StatCounter Code for DoYourOwnSite -->"
p2="<!-- End of StatCounter Code for DoYourOwnSite -->"
fname="*.html"
num_of_files_pattern1=ls #grep $p1 fname
echo "fname(s) to apply the sed to:"
echo $fname
echo "num_of_files_pattern1 is:"
echo $num_of_files_pattern1
echo "Pattern1 is equal to:"
echo $p1
echo "Pattern2 is equal to:"
echo $p2
#this is current dir where the script is
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
echo "DIR is equal to:"
echo $DIR
#cd to the dir where I want to copy the files to:
cd "$DIR"
# this will find the pattern <\head> in all the .html files and place "This should appear before the closing head tag" this before it
# it will also make a backup with .bak extension
#sed -i.bak '/<\\head>/i\This should appear before the closing head tag' *.html
echo "sed on the file"
# this does the head part
#sed '/PATTERN-1/,/PATTERN-2/d' *.txt # this works
#sed "/$p1/,/$p2/d" *.txt # this works
#sed "/$p1/,/$p2/d" $fname # this works
sed -i.bak "/$p1/,/$p2/d" $fname # this works
EDIT2
This is what I ended up with, but there is a more robust answer below:
# ------------------------------------------------------------------
# [author] find2PatternsAndDeleteTextInBetween.sh
# Description
# Here I want to find 2 patterns and delete what's in between
# this example works
#
# EXAMPLE:
# this is the 2 patterns I want to find Start and End
# <!-- Start of StatCounter Code for DoYourOwnSite -->
# text would go here
# <!-- End of StatCounter Code for DoYourOwnSite -->>
#
# ------------------------------------------------------------------
p1="<!--START: Google Analytics --->"
p2="<!--END: Google Analytics --->"
fname=".html"
echo "fname(s) to apply the sed to:"
echo *"$fname"
echo -e "\n"
echo "Pattern1 is equal to:"
echo -e "$p1\n"
echo "Pattern2 is equal to:"
echo -e "$p2\n"
echo -e "PWD is: $PWD\n"
echo "sed on the file"
#sed '/PATTERN-1/,/PATTERN-2/d' *.txt # this works
#sed "/$p1/,/$p2/d" *.txt # this works
#sed "/$p1/,/$p2/d" $fname # this works
sed -i.bak "/$p1/,/$p2/d" *"$fname" # this works
Posted on 2016-11-01 01:52:31
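Judging by the script in your question, it sounds like you already know how to use sed to delete the range of interest from a single file (sed -i.bak "/$p1/,/$p2/d" $fname), but are looking for a robust way to handle this from a script (assuming bash):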
#!/usr/bin/env bash
# cd to the dir. in which this script is located.
# CAVEAT: Assumes that the script wasn't invoked through a *symlink*
# located in a different dir.
cd -- "$(dirname -- "$BASH_SOURCE")" || exit
fpattern='*.html' # specify source-file globbing pattern
shopt -s failglob # report an error, rather than keep the literal pattern, if nothing matches
fnames=( $fpattern ) # expand to matching files and store in array
num_of_files_matching_pattern=${#fnames[@]} # count matching files
(( num_of_files_matching_pattern > 0 )) || exit # abort, if no files match
printf '%s\n%s\n' "Running from:" "$PWD"
printf '%s\n%s\n' "Pattern matching the files to process:" "$fpattern"
printf '%s\n%s\n' "# of matching files:" "$num_of_files_matching_pattern"
# Determine the range-endpoint-identifier-line regular expressions.
# CAVEAT: Make sure you escape any regular-expression metacharacters you want
# to be treated as *literals*.
p1='^<!--START: Google Analytics --->$'
p2='^<!--END: Google Analytics --->$'
# Remove the range identified by its endpoints from all matching input files
# and save the original files with extension '.bak'
sed -i'.bak' "/$p1/,/$p2/d" "${fnames[@]}" || exit
As an aside: I recommend not using the suffix .sh in script filenames:
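The shebang line in the file is enough to tell the system which shell/interpreter to pass the script to. If you do not specify a suffix, you are free to change the implementation later (e.g., to Python) without breaking existing programs that rely on the script. In the case at hand, assuming that using bash is actually acceptable, .sh could be misleading, because it suggests a script that uses sh features only.
To determine the running script's true directory, even when the script is invoked through a symlink located in a different directory: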
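If you can assume a Linux platform (or at least GNU readlink), use:
"$(dirname -- "$(readlink -e -- "$BASH_SOURCE")")"
Otherwise, a more elaborate solution with a helper function is needed; see this answer of mine.
Posted on 2016-11-01 00:45:46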
sed is the tool for this task:
$ sed -i'.bak' '/<!--START/,/<!--END/d' file
If you have other lines with similar markers, add more of the pattern.
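One way to "add more of the pattern" is to use the complete marker text as the range address and escape it, so that it is matched literally. The following is only a minimal sketch under stated assumptions: escape_bre and delete_block are hypothetical helper names that do not come from the answers in this thread, and the escaping covers only the characters that are special in a sed basic regular expression plus the / delimiter.

```shell
# escape_bre: hypothetical helper. Backslash-escapes the characters that are
# special in a sed basic regular expression ([ \ . * ^ $) and the '/'
# delimiter, so that the argument can be used as a literal address.
escape_bre() {
  printf '%s\n' "$1" | sed 's|[[\\/.*^$]|\\&|g'
}

# delete_block: hypothetical helper. Reads stdin and deletes every line from
# the first line containing the literal text $1 through the next line
# containing the literal text $2, inclusive; other lines pass through.
delete_block() {
  sed "/$(escape_bre "$1")/,/$(escape_bre "$2")/d"
}
```

With the full Google Analytics markers as arguments, a block delimited by some other START/END comment is left alone, whereas the shorter /<!--START/,/<!--END/ range would delete it as well.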
For multiple files, e.g. file1, ..., file4:
$ for f in file{1..4}; do sed -i'.bak' '/<!--START/,/<!--END/d' "$f"; done
Posted on 2016-11-01 01:11:12
Something to consider:
$ awk '/<!--(START|END): Google Analytics --->/{f=!f;next} !f' file
...
</script>
</body>
</html>
...
https://stackoverflow.com/questions/40351989
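The awk command in the last answer prints the filtered text to standard output and leaves the file itself unchanged. The sketch below shows one possible way to apply the same filter to several files while keeping a .bak copy of each, similar to what sed -i.bak does. The function name strip_ga_in_place is hypothetical and not part of any answer; the approach (copy first, then rewrite from the copy) is an assumption about what is wanted.

```shell
# strip_ga_in_place: hypothetical helper. For each file given as an argument,
# save the original as FILE.bak, then rewrite FILE without the lines from the
# START marker through the END marker (inclusive).
strip_ga_in_place() {
  for f in "$@"; do
    cp -p -- "$f" "$f.bak" || return
    # The flag f flips on each marker line; marker lines are skipped via
    # 'next', and other lines are printed only while the flag is off.
    awk '/<!--(START|END): Google Analytics --->/{f=!f;next} !f' "$f.bak" > "$f" || return
  done
}
```

It could be called as strip_ga_in_place *.html from the directory that holds the pages.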