I can't import pdftotext on a new M1 Mac. The steps I took were:
Ran pip3 install pdftotext from the terminal
Typed import pdftotext
Traceback (most recent call last):
  File "", line 1, in
    import pdftotext
ImportError: '_ZN7poppler24set_debug_error_functionEPFvRKNSt3__112basic_stringIcNS0_11char_traitsIcEENS0_9allocatorIcEEEEPvES9' 0x0002): symbol not found in flat namespace '_ZN7poppler24set_debug_error_functionEPFvRKNSt3__112basic_stringIcNS0_11char_traitsIcEENS0_9allocatorIcEEEEPvES9'
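I have already spent several hours searching for this error message.
Any suggestions?
PS: I have tried several other PDF-to-text packages, but they did not read the complete PDF. For some odd reason the PDFs I need to read are very complex, and many packages fail to read them in full. pdftotext does, so what I need is help getting pdftotext to work.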
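The symbol in the error is a mangled C++ name from the poppler library (poppler::set_debug_error_function), so the compiled pdftotext extension is loading but cannot resolve a function it expects from poppler. One possible cause, not confirmed by anything in the question, is a mismatch between the extension and the installed poppler, or between the interpreter's architecture (arm64 versus x86_64 under Rosetta) and the library's. Below is a minimal diagnostic sketch using only the standard library. The function names describe_environment and try_import are hypothetical helpers made up for illustration.

```python
import importlib
import platform
import shutil
import sys


def describe_environment():
    """Collect facts that often matter when a compiled extension fails to import."""
    return {
        "python_version": sys.version.split()[0],
        "python_executable": sys.executable,
        # 'arm64' for a native Apple Silicon interpreter, 'x86_64' under Rosetta
        "machine": platform.machine(),
        # Path of poppler's pdftotext command-line tool if it is on PATH, else None
        "pdftotext_cli": shutil.which("pdftotext"),
    }


def try_import(module_name):
    """Return (True, '') if the module imports, else (False, error text)."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        return False, str(exc)
    return True, ""


def report(module_name="pdftotext"):
    """Print the environment facts and the import result for one module."""
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
    ok, message = try_import(module_name)
    print(f"import {module_name}: {'ok' if ok else 'failed: ' + message}")
```

Calling report() in the same interpreter that raised the error shows which Python is actually running and under which architecture. If "machine" and the installed poppler disagree, or "pdftotext_cli" is None, that narrows down where to look, though it does not prove the cause.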
Posted on 2022-03-06 16:05:15
I don't think pdftotext is a good library. You are better off using PyPDF2; here is an example:
import PyPDF2

# Open the PDF in binary read mode ('rb').
pdffileobj = open('1.pdf', 'rb')
# Create a reader for the opened file.
pdfreader = PyPDF2.PdfFileReader(pdffileobj)
# Number of pages in this PDF.
x = pdfreader.numPages
# Select the last page. Page indexing starts at 0, so the last valid index is x - 1.
pageobj = pdfreader.getPage(x - 1)
# Extract the text of that page.
text = pageobj.extractText()
pdffileobj.close()
# Save the extracted text to a .txt file.
# Put r before the path so the backslashes are taken literally.
# To get the folder path, right-click the file, choose Properties,
# copy the location and append "\your_txtfilename".
file1 = open(r"C:\Users\SIDDHI\AppData\Local\Programs\Python\Python38\1.txt", "a")
file1.write(text)
file1.close()
https://stackoverflow.com/questions/71371871
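The answer's snippet targets a single page, while the question is about reading complete PDFs. Here is a minimal sketch that loops over every page instead, using the same PyPDF2 names as the answer (PdfFileReader, numPages, getPage, extractText). Newer PyPDF2 releases rename these, so treat the calls as tied to the version current when the answer was posted. The helper names extract_all_text and save_pdf_text are hypothetical.

```python
def extract_all_text(reader):
    """Join the text of every page of a PdfFileReader-like object.

    Assumes the reader exposes numPages and getPage(index), and that each
    page exposes extractText(), as in the answer's snippet.
    """
    page_texts = []
    for index in range(reader.numPages):  # valid indices are 0 .. numPages - 1
        page_texts.append(reader.getPage(index).extractText())
    return "\n".join(page_texts)


def save_pdf_text(pdf_path, txt_path):
    """Extract all pages from pdf_path and write the result to txt_path."""
    import PyPDF2  # third-party; imported here so extract_all_text has no dependency

    with open(pdf_path, "rb") as pdf_file:
        reader = PyPDF2.PdfFileReader(pdf_file)
        text = extract_all_text(reader)
    with open(txt_path, "w", encoding="utf-8") as txt_file:
        txt_file.write(text)
    return text
```

For example, save_pdf_text("1.pdf", "1.txt") would write all pages to 1.txt. Whether PyPDF2 handles the asker's complex PDFs any better than the other packages they tried is not something this sketch can guarantee.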