I am extracting selected pages from a PDF file and would like to name each DataFrame after the page it was extracted from:
file = "abc"
selected_pages = ['10','11'] #can be any combination eg ['6','14','20']
for i in selected_pages():
    df{str(i)} = read_pdf(path + file + ".pdf",encoding = 'ISO-8859-1', stream = True,area = [100,10,740,950],pages= (i), index = False)
    print (df{str(i)} )
In the end, as in the example above, the idea is to have DataFrames named df10 and df11. I have tried "df" + str(i), "df" & str(i) and df{str(i)}, but every one of them fails with the error message SyntaxError: invalid syntax. Any better approach is most welcome. Thanks.
Posted on 2019-10-13 16:34:06
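This is a case where a dictionary is the better choice: instead of creating variables named df10 and df11, store each DataFrame under its page number as the key.
Also note the error at the start of your loop. selected_pages is a list, so you cannot call it as selected_pages(); iterate over selected_pages directly.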
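To illustrate both points without needing a PDF file, here is a minimal sketch. The function load_page is a hypothetical stand-in for read_pdf and the name tables is made up for this example; only the dictionary pattern itself is the point.

```python
# Hypothetical stand-in for read_pdf so the sketch runs without a PDF file.
def load_page(page):
    return [f"row {n} of page {page}" for n in range(1, 4)]

selected_pages = ['10', '11']

# Calling a list raises TypeError, which is why selected_pages() cannot work.
try:
    selected_pages()
except TypeError as exc:
    call_error = str(exc)
    print("calling the list fails:", call_error)

# Store each result under its page number instead of inventing variable names.
tables = {}
for page in selected_pages:
    tables[page] = load_page(page)

# Each table is then retrieved by its key, e.g. tables['10'] in place of df10.
print(list(tables))
print(tables['10'][0])
```

Because the keys are ordinary strings, any combination of pages works without changing the rest of the code.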
# assumption: read_pdf comes from tabula-py (from tabula import read_pdf) and path is defined elsewhere
file = "abc"
selected_pages = ['10','11'] #can be any combination eg ['6','14','20']
df = {}
for i in selected_pages:
    df[i] = read_pdf(path + file + ".pdf", encoding='ISO-8859-1', stream=True, area=[100,10,740,950], pages=i, index=False)
Posted on 2019-10-14 20:19:48
i = int(i) - 1 # this will bring it to 10
dfB = df[str(i)]
# select the rows to drop: index positions 0:4
dfB.drop(dfB.index[0:4], axis=0, inplace=True)
dfB.columns = ['col1','col2','col3','col4','col5']
https://stackoverflow.com/questions/58361807
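The clean-up above handles one page at a time and depends on the value of i left over from the earlier loop. As a possible variation, the sketch below applies the same two steps (drop the first four rows, rename the columns) to every DataFrame in the dictionary. The helper fake_page and the name cleaned are hypothetical, and the sample data is invented so the code runs without a PDF; it also assumes every page yields five columns.

```python
import pandas as pd

# Hypothetical stand-in for the tables read from the PDF:
# four unwanted rows at the top, then two rows of data, five columns wide.
def fake_page(page):
    junk = [[f"header {page}", None, None, None, None] for _ in range(4)]
    data = [[page, 1, 2, 3, 4], [page, 5, 6, 7, 8]]
    return pd.DataFrame(junk + data)

df = {page: fake_page(page) for page in ['10', '11']}

new_columns = ['col1', 'col2', 'col3', 'col4', 'col5']

cleaned = {}
for page, table in df.items():
    # drop() without inplace returns a new DataFrame and leaves df[page] untouched
    trimmed = table.drop(table.index[0:4], axis=0)
    trimmed.columns = new_columns
    cleaned[page] = trimmed

print(cleaned['10'])
```

Keeping the cleaned tables in a second dictionary leaves the raw extraction available in case the number of rows to drop differs from page to page.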