
Scraping images with Python and changing their names

Stack Overflow user
Asked on 2022-05-24 17:53:55
Answers: 1, Views: 52, Followers: 0, Votes: 2

I have a project that uses Python to scrape images from Tumblr. I want to download the images found at the links I scraped.

Here is the entire code:

import requests
from bs4 import BeautifulSoup
import shutil
search_term = "landscape/recent"
posts_scrape = requests.get(f"https://www.tumblr.com/search/{search_term}")
soup = BeautifulSoup(posts_scrape.text, "html.parser")

articles = soup.find_all("article", class_="FtjPK")

data = {}
for article in articles:
    try:
        source = article.find("div", class_="vGkyT").text
        for imgvar in article.find_all("img", alt="Image"):
            data.setdefault(source, []).extend(
                [
                    i.replace("500w", "").strip()
                    for i in imgvar["srcset"].split(",")
                    if "500w" in i
                ]
            )
    except AttributeError:
        continue


for source, image_urls in data.items():
    for url in image_urls:
        if posts_scrape.status_code == 200:
            url.raw.decode_content = True
            with open(source,'wb') as f:
                shutil.copyfileobj(url.raw, f)
            print('Image sucessfully Downloaded: ',source)
        else:
            print('Image Couldn\'t be retrieved')

Following the answer in this post, I changed the code to use requests and shutil:

for source, image_urls in data.items():
    for url in image_urls:
        if posts_scrape.status_code == 200:
            url.raw.decode_content = True
            with open(source,'wb') as f:
                shutil.copyfileobj(url.raw, f)
            print('Image sucessfully Downloaded: ',source)
        else:
            print('Image Couldn\'t be retrieved')

Now I get an error:

Traceback (most recent call last):
  File "/home/user/folder/Information.py", line 28, in <module>
    url.raw.decode_content = True
AttributeError: 'str' object has no attribute 'raw'

1 Answer

Stack Overflow user

Accepted answer

Posted on 2022-05-25 08:05:23

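In your loop, `url` is just a string, so it has no `raw` attribute. You have to make a new request with each image URL. Then you can read the response in raw form and save the image.

Replace your download loop with the following: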

for source, image_urls in data.items():
    for url in image_urls:
        # make request with image url 
        img_scrape = requests.get(url, stream=True)

        if img_scrape.status_code == 200:
            with open(source,'wb') as f:
                img_scrape.raw.decode_content = True
                
                # save the image raw format
                shutil.copyfileobj(img_scrape.raw, f)
            print('Image sucessfully Downloaded: ',source)
        else:
            print('Image Couldn\'t be retrieved')

Output:

Image sucessfully Downloaded:  pics-bae
Image sucessfully Downloaded:  pics-bae
Image sucessfully Downloaded:  laravel
Image sucessfully Downloaded:  huariqueje
Image sucessfully Downloaded:  sweetd3lights
Image sucessfully Downloaded:  shesinthegrove
Image sucessfully Downloaded:  careful-disorder
Image sucessfully Downloaded:  beifongkendo
Image sucessfully Downloaded:  traveltoslovenia
Image sucessfully Downloaded:  traveltoslovenia
Image sucessfully Downloaded:  traveltoslovenia
Image sucessfully Downloaded:  bradsbackpack
Image sucessfully Downloaded:  pensamentsisomnis
Image sucessfully Downloaded:  frankfurtphoto
........

..........
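
One thing to watch in the loop above: every image from the same blog is written to the same file name (`source`), so a later image overwrites an earlier one (note that `traveltoslovenia` appears three times in the output). A minimal sketch of one way to give each image its own name follows. The names `build_filenames`, `download_images`, `default_ext`, and `out_dir` are hypothetical, and it assumes each `source` string is already safe to use as part of a file name.

```python
import shutil
from pathlib import Path
from urllib.parse import urlparse


def build_filenames(data, default_ext=".jpg"):
    """Return (url, filename) pairs, numbering images per source."""
    pairs = []
    for source, image_urls in data.items():
        for index, url in enumerate(image_urls, start=1):
            # Reuse the extension from the URL path; fall back to an
            # assumed default when the path has none.
            ext = Path(urlparse(url).path).suffix or default_ext
            pairs.append((url, f"{source}_{index}{ext}"))
    return pairs


def download_images(data, out_dir="images"):
    """Download every image in `data` to its own file under `out_dir`."""
    # Imported here so the naming helper works without requests installed.
    import requests

    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    for url, filename in build_filenames(data):
        # Make a separate request for each image URL.
        response = requests.get(url, stream=True, timeout=30)
        if response.status_code == 200:
            response.raw.decode_content = True
            with open(out_path / filename, "wb") as f:
                shutil.copyfileobj(response.raw, f)
            print("Image successfully downloaded:", filename)
        else:
            print("Image couldn't be retrieved:", url)
```

Calling `download_images(data)` with the `data` dictionary built by the scraping loop would then write files such as `traveltoslovenia_1.jpg` and `traveltoslovenia_2.jpg` instead of one file that keeps getting replaced.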
Votes: 2
Original page content provided by Stack Overflow; translation support provided by Tencent Cloud's Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/72367377
