Sort lines by keyword and group the results

Stack Overflow user
Asked 2021-04-25 03:55:39
2 answers, 54 views, 0 followers, 2 votes

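I want to sort the links as shown in the desired output, but I don't know how to change my code.

So how can I change the code to get the desired output?

Thanks in advance for your help.

Code: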

def my_sort(line):
    social_folders = {'engine': 1,
                    'wormix_mm': 2,
                    'wormix_ok': 3}
    line_fields = line.strip().split("/")
    social = line_fields[3]
    print(line_fields[3])
    return social_folders[social]

numbers = 'First', 'Second', 'Third', 'Fourth'
with open('./testsort.txt') as testsortf, \
     open('./test_out999.txt', "w") as test_out:
    contents = testsortf.readlines()
    contents[-1] = f'{contents[-1]}\n'
    contents.sort(key=my_sort)
    for i, line in enumerate(contents):
        test_out.write(f'{numbers[i]}:\n{line}')
        if i+1 < len(contents):
            test_out.write('\n')

My input from the .txt file:

https://markus.rmart.ru/engine/preloader/somefold
https://markus.rmart.ru/wormix_ok/preloader/somefold3
https://markus.rmart.ru/engine/preloader/somefold3
https://markus.rmart.ru/engine/preloader/somefold1
https://markus.rmart.ru/wormix_ok/preloader/somefold4
https://markus.rmart.ru/wormix_mm/preloader/somefold2
https://markus.rmart.ru/wormix_mm/preloader/somefold1
https://markus.rmart.ru/engine/preloader/somefold2
https://markus.rmart.ru/engine/preloader/somefold5
https://markus.rmart.ru/wormix_mm/preloader/somefold5
https://markus.rmart.ru/wormix_ok/preloader/somefold1

So the input is not sorted in any way.

Desired output:

First:
https://markus.rmart.ru/engine/preloader/somefold
https://markus.rmart.ru/engine/preloader/somefold3
https://markus.rmart.ru/engine/preloader/somefold1
https://markus.rmart.ru/engine/preloader/somefold2
https://markus.rmart.ru/engine/preloader/somefold5

Second:
https://markus.rmart.ru/wormix_mm/preloader/somefold2
https://markus.rmart.ru/wormix_mm/preloader/somefold1
https://markus.rmart.ru/wormix_mm/preloader/somefold5

Third:
https://markus.rmart.ru/wormix_ok/preloader/somefold1
https://markus.rmart.ru/wormix_ok/preloader/somefold4
https://markus.rmart.ru/wormix_ok/preloader/somefold3

Current output:

First:
https://markus.rmart.ru/engine/preloader/somefold

Second:
https://markus.rmart.ru/engine/preloader/somefold1

Third:
https://markus.rmart.ru/engine/preloader/somefold3

Fourth:
https://markus.rmart.ru/engine/preloader/somefoldtest

2 Answers

Stack Overflow user

Accepted answer

Posted 2021-04-25 05:48:36

Try this code (I've added comments in the code explaining what I did).

def my_sort(conts):
    # Rank each line by the folder that follows the domain name
    social_folders = {'engine': 1, 'wormix_mm': 2, 'wormix_ok': 3}
    line_fields = conts.strip().split("/")
    social = line_fields[3]
    return social_folders[social]


# I didn't know what the difference between the First and Second sections was,
# so I put them together. You can handle that yourself.
numbers = 'First', 'Second', 'Third'  # , 'Fourth'
folds = ['engine', 'wormix_mm', 'wormix_ok']
with open('./testsort.txt') as testsortf, open('./test_out999.txt', "w") as test_out:
    contents = testsortf.readlines()
    # Add a trailing newline only if the last line lacks one
    if not contents[-1].endswith('\n'):
        contents[-1] += '\n'
    contents.sort(key=my_sort)
    # It needs 2 for loops
    for k, label in enumerate(numbers):
        # Write a blank line before every category except the first one
        if k != 0:
            test_out.write('\n')
        # Write the label of each category
        test_out.write(f'{label}:\n')
        for line in contents:
            # Write only the lines that belong to this category
            if line.strip().split("/")[3] == folds[k]:
                test_out.write(line)
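
As a possible variation on the code above, the lines can be bucketed in a single pass instead of rescanning the whole list once per category. This is only a sketch: the helper names `group_lines` and `render` are made up for illustration, the sample list stands in for the file contents, and it assumes every non-blank line has the folder name as its fourth `/`-separated field, as in the question's input.

```python
FOLDS = ['engine', 'wormix_mm', 'wormix_ok']
LABELS = ['First', 'Second', 'Third']


def group_lines(lines):
    """Bucket stripped lines by folder, keeping input order within each bucket."""
    groups = {fold: [] for fold in FOLDS}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        folder = line.split('/')[3]
        # A folder that is not listed in FOLDS raises KeyError here
        groups[folder].append(line)
    return groups


def render(groups):
    """Build the labelled text with one blank line between categories."""
    sections = []
    for label, fold in zip(LABELS, FOLDS):
        sections.append('\n'.join([f'{label}:'] + groups[fold]))
    return '\n\n'.join(sections) + '\n'


# Hypothetical sample standing in for testsortf.readlines()
sample = [
    'https://markus.rmart.ru/wormix_ok/preloader/somefold3\n',
    'https://markus.rmart.ru/engine/preloader/somefold\n',
    'https://markus.rmart.ru/wormix_mm/preloader/somefold2\n',
    'https://markus.rmart.ru/engine/preloader/somefold3',
]
print(render(group_lines(sample)), end='')
```

Because the lines are stripped before being stored, it no longer matters whether the last line of the file ends with a newline.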
Votes: 0

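Stack Overflow user

Posted 2021-04-25 06:17:27

itertools.groupby() from the standard library will cluster the list of links the way you want, but it needs some setup. Specifically, besides a sorted iterable of links, it needs to know which part of the link string to group by. For that, you need something like a regular expression to isolate the key part of the link.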
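
The reason the input has to be sorted is that groupby() only merges consecutive items that share a key. A minimal sketch of that behaviour, using a made-up list of folder names rather than full links:

```python
from itertools import groupby

# Hypothetical folder names, deliberately out of order
folders = ['engine', 'wormix_ok', 'engine', 'wormix_mm', 'engine']

# Without sorting, every change of key starts a new group
unsorted_keys = [key for key, _ in groupby(folders)]

# After sorting, equal keys sit next to each other and form one group each
sorted_groups = {key: len(list(items)) for key, items in groupby(sorted(folders))}

print(unsorted_keys)   # 'engine' shows up three separate times
print(sorted_groups)   # one entry per folder
```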

Example:

import re
from itertools import groupby

sorted_links = sorted([
    "https://markus.rmart.ru/engine/preloader/somefold",
    "https://markus.rmart.ru/wormix_ok/preloader/somefold3",
    "https://markus.rmart.ru/engine/preloader/somefold3",
    "https://markus.rmart.ru/engine/preloader/somefold1",
    "https://markus.rmart.ru/wormix_ok/preloader/somefold4",
    "https://markus.rmart.ru/wormix_mm/preloader/somefold2",
    "https://markus.rmart.ru/wormix_mm/preloader/somefold1",
    "https://markus.rmart.ru/engine/preloader/somefold2",
    "https://markus.rmart.ru/engine/preloader/somefold5",
    "https://markus.rmart.ru/wormix_mm/preloader/somefold5",
    "https://markus.rmart.ru/wormix_ok/preloader/somefold1",
])

# Finds the category part of the path that follows the domain name (e.g., "engine")
category = re.compile(r"https.*\.[a-z]{2,3}\/([^\/]*)")

for _, group in groupby(sorted_links, lambda url: category.search(url).group(1)):
    for url in group:
        print(url)
    print()

Output:

https://markus.rmart.ru/engine/preloader/somefold
https://markus.rmart.ru/engine/preloader/somefold1
https://markus.rmart.ru/engine/preloader/somefold2
https://markus.rmart.ru/engine/preloader/somefold3
https://markus.rmart.ru/engine/preloader/somefold5

https://markus.rmart.ru/wormix_mm/preloader/somefold1
https://markus.rmart.ru/wormix_mm/preloader/somefold2
https://markus.rmart.ru/wormix_mm/preloader/somefold5

https://markus.rmart.ru/wormix_ok/preloader/somefold1
https://markus.rmart.ru/wormix_ok/preloader/somefold3
https://markus.rmart.ru/wormix_ok/preloader/somefold4
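
The output above has no "First:" / "Second:" / "Third:" labels, and the plain sorted() call also reorders the links inside each category. If the labels from the question are wanted, one way to adapt the idea is sketched below. It swaps the regular expression for urllib.parse.urlsplit, and the names `ORDER`, `LABELS`, `category` and `labelled_groups` are assumptions introduced for this illustration, not part of the original answer.

```python
from itertools import groupby
from urllib.parse import urlsplit

# Assumed category order and labels, taken from the question's desired output
ORDER = {'engine': 0, 'wormix_mm': 1, 'wormix_ok': 2}
LABELS = ('First', 'Second', 'Third')


def category(url):
    """Return the first path segment of the URL, e.g. 'engine'."""
    return urlsplit(url).path.split('/')[1]


def labelled_groups(urls):
    """Yield (label, urls) pairs in the order given by ORDER."""
    # sorted() is stable, so links keep their input order within a category
    ordered = sorted(urls, key=lambda u: ORDER[category(u)])
    for cat, group in groupby(ordered, key=category):
        yield LABELS[ORDER[cat]], list(group)


links = [
    'https://markus.rmart.ru/wormix_ok/preloader/somefold3',
    'https://markus.rmart.ru/engine/preloader/somefold3',
    'https://markus.rmart.ru/wormix_mm/preloader/somefold2',
    'https://markus.rmart.ru/engine/preloader/somefold1',
]

for label, group in labelled_groups(links):
    print(f'{label}:')
    print('\n'.join(group))
    print()
```

Sorting on the category rank alone, rather than on the whole URL, is what keeps the links in their original relative order inside each group.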
Votes: 0
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/67246902
