Regex to find a specific word

Stack Overflow user
Asked on 2022-08-17 16:06:15
2 answers, 55 views, 0 followers, score -3

I have a large file containing multiple entries like the following:

{"author":["frack113"],"description":"Detects a Sysmon configuration change, which could be the result of a legitimate reconfiguration or someone trying manipulate the configuration","ruleId":"8ac03a65-6c84-4116-acad-dc1558ff7a77","falsePositives":["Legitimate administrative action"],"from":"now-360s","immutable":false,"outputIndex":".siem-signals-default","meta":{"from":"1m"},"maxSignals":100,"riskScore":35,"riskScoreMapping":[],"severity":"medium","severityMapping":[],"threat":[{"tactic":{"id":"TA0005","reference":"https://attack.mitre.org/tactics/TA0005","name":"Defense Evasion"},"framework":"MITRE ATT&CK®","technique":[]}],"to":"now","references":["https://learn.microsoft.com/en-us/sysinternals/downloads/sysmon"],"version":1,"exceptionsList":[],"index":["winlogbeat-*"],"query":"(winlog.channel:\"Microsoft\\-Windows\\-Sysmon\\/Operational\" AND winlog.event_id:\"16\")","language":"lucene","filters":[],"type":"query"},"schedule":{"interval":"5m"}}

I am writing a Python program to detect the string that follows the word "query", for example:

"query":"(winlog.channel:\"Microsoft\\-Windows\\-Sysmon\\/Operational\" AND winlog.event_id:\"16\")"

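I am trying to detect (winlog.channel:\"Microsoft\\-Windows\\-Sysmon\\/Operational\" AND winlog.event_id:\"16\"). I want to detect many of these and then compare them with the "query" values in another file to find out whether there are any similarities.

I tried the following regexes, but neither detects the "query" at all.
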
(?<=^\"query\":\W)(\w.*)$ 

(?<='{\"query\"}':\s)'?([^'}},]+)

If anyone could give me some pointers I would really appreciate it, as I have been stuck on this for hours!

2 Answers

Stack Overflow user

Answered on 2022-08-17 16:26:39

Your question is also tagged python, so I assume a solution involving a Python script is fine.

Suppose you have a file data.txt whose entries look like the given example:

{"author":["frack113"],"description":"Detects a Sysmon configuration change, which could be the result of a legitimate reconfiguration or someone trying manipulate the configuration","ruleId":"8ac03a65-6c84-4116-acad-dc1558ff7a77","falsePositives":["Legitimate administrative action"],"from":"now-360s","immutable":false,"outputIndex":".siem-signals-default","meta":{"from":"1m"},"maxSignals":100,"riskScore":35,"riskScoreMapping":[],"severity":"medium","severityMapping":[],"threat":[{"tactic":{"id":"TA0005","reference":"https://attack.mitre.org/tactics/TA0005","name":"Defense Evasion"},"framework":"MITRE ATT&CK®","technique":[]}],"to":"now","references":["https://learn.microsoft.com/en-us/sysinternals/downloads/sysmon"],"version":1,"exceptionsList":[],"index":["winlogbeat-*"],"query":"(winlog.channel:\"Microsoft\\-Windows\\-Sysmon\\/Operational\" AND winlog.event_id:\"16\")","language":"lucene","filters":[],"type":"query"},"schedule":{"interval":"5m"}}

You can then run the following script to print the desired string.

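def main():
    with open('data.txt') as f:
        for line in f:
            # Skip lines that do not contain "query" at all,
            # otherwise line[1] below would raise an IndexError.
            if "query" not in line:
                continue

            # Keep the text that follows the first occurrence of "query".
            line = line.split("query")
            result = line[1]
            # Cut at the first ")" and drop the leading '":' separator.
            result = result.split(")")
            result = result[0][2:]

            print(result)

main()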

For the sample string you provided, this script prints:

"(winlog.channel:\"Microsoft\\-Windows\\-Sysmon\\/Operational\" AND winlog.event_id:\"16\"

Hope this helps!
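
Note that splitting on ")" keeps the opening quote, drops the closing parenthesis, and would cut a query short if it contained an earlier ")". As an alternative, here is a minimal sketch that assumes the key always appears as "query":"..." with JSON-style escaping. The names QUERY_RE and extract_queries and the shortened sample entry are illustrative and not part of the original script.

```python
import json
import re

# Matches "query":"<value>" and captures the quoted value, including
# escaped characters such as \" and \\ (i.e. a JSON string literal).
QUERY_RE = re.compile(r'"query"\s*:\s*("(?:[^"\\]|\\.)*")')


def extract_queries(line):
    """Return the decoded "query" values found in one line of text."""
    # json.loads turns each captured JSON string literal into a Python str,
    # resolving the backslash escapes along the way.
    return [json.loads(literal) for literal in QUERY_RE.findall(line)]


# Shortened, hypothetical stand-in for one entry of the file.
sample = (
    r'{"index":["winlogbeat-*"],'
    r'"query":"(winlog.channel:\"Microsoft\\-Windows\\-Sysmon\\/Operational\"'
    r' AND winlog.event_id:\"16\")",'
    r'"language":"lucene","type":"query"}'
)

for query in extract_queries(sample):
    print(query)
```

Because the pattern requires a colon after "query", the value in "type":"query" is not picked up as a key.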

Votes: 0

Stack Overflow user

Answered on 2022-08-18 07:09:53

Two approaches:

  1. Read the file in as JSON and walk through the dictionary.
  2. Read the file in as a str and apply a regex to it.

1. Read in as JSON:

import json

file = 'exportedSignal.ndjson'
with open(file, 'r', encoding='cp850') as f:
    jsonData = json.load(f)

queries = []
hits = jsonData['hits']['hits']
for hit in hits:
    # Not every alert has a "query" parameter, so check before reading it.
    params = hit['_source']['alert']['params']
    if 'query' in params:
        queries.append(params['query'])
print(queries)
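
If the file instead holds one JSON object per line (which the .ndjson extension suggests, though that is an assumption about your data), json.load on the whole file will fail. A rough sketch for that case follows. The helper names find_queries and queries_from_ndjson are hypothetical, and the utf-8 encoding is a guess that you may need to change. It walks each object recursively, so it does not depend on the exact nesting of the "query" key.

```python
import json


def find_queries(node):
    """Yield every string stored under a "query" key, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "query" and isinstance(value, str):
                yield value
            else:
                yield from find_queries(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_queries(item)


def queries_from_ndjson(path, encoding="utf-8"):
    """Collect "query" values from a file with one JSON object per line."""
    queries = []
    with open(path, encoding=encoding) as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                queries.extend(find_queries(json.loads(line)))
    return queries


# Small in-memory example with made-up values.
entry = {"params": {"query": "winlog.event_id:\"16\"", "type": "query"}}
print(list(find_queries(entry)))
```

Values that merely equal "query", such as "type":"query", are ignored because only keys are compared.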

2. Using regex:

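import re

file = 'exportedSignal.ndjson'
with open(file, 'r', encoding='cp850') as f:
    data = f.read()

# (?:[^"\\]|\\.)* also consumes escaped characters such as \", so the match
# does not stop at the first escaped quote inside the value.
queries = re.findall(r'"query":"((?:[^"\\]|\\.)*)"', data)
print(queries)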

Output:

Both produce a list of 2006 values from the "query" key.
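
Once you have the list of queries from each file, the comparison step from the question could look roughly like this. This is only a sketch: compare_queries and the 0.8 threshold are illustrative choices, and "similar" is interpreted here as a high difflib.SequenceMatcher ratio, which may or may not match what you mean by similarity.

```python
from difflib import SequenceMatcher


def compare_queries(queries_a, queries_b, threshold=0.8):
    """Return (exact, similar) matches between two lists of query strings."""
    set_a, set_b = set(queries_a), set(queries_b)

    # Queries that appear verbatim in both files.
    exact = set_a & set_b

    # Remaining queries are compared pairwise by similarity ratio (0.0 to 1.0).
    similar = []
    for a in sorted(set_a - exact):
        for b in sorted(set_b - exact):
            ratio = SequenceMatcher(None, a, b).ratio()
            if ratio >= threshold:
                similar.append((a, b, round(ratio, 2)))
    return exact, similar


# Made-up example values.
file_a = ['winlog.event_id:"16"', 'winlog.event_id:"1"']
file_b = ['winlog.event_id:"16"', 'winlog.event_id:"2"']

exact, similar = compare_queries(file_a, file_b)
print("exact:", exact)
print("similar:", similar)
```

The pairwise loop compares every remaining query in one file with every remaining query in the other, so with a couple of thousand queries per file it can take a while.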

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link: https://stackoverflow.com/questions/73391659