I'm trying to extract a string. From this text:

text = """<li>(<a rel="nofollow" class="external text" href="http://www.icd9data.com/getICD9Code.ashx?
icd9=999.1">999.1</a>) <a href="/wiki/Air_embolism" title="Air embolism">Air embolism</a> as
a complication of medical care not elsewhere classified</li>"""

my target is "as a complication of medical care not elsewhere classified", but my code isn't working:

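import bs4

soup = bs4.BeautifulSoup(text, "html.parser")
for tag in soup.find_all('li'):
    # .string is None here because this <li> contains more than one child node
    print(tag.string)

Does anyone know a way to get the string I want? Thanks.
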
Posted on 2014-05-06 22:16:18

for tag in soup.find_all('li'):
    print(tag.get_text())

prints

(999.1) Air embolism as
a complication of medical care not elsewhere classified

The get_text method returns all the text in a tag, including text that belongs to child tags.

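Because get_text returns everything inside the tag, it gives more than the asker's target. One possible way to get only the trailing text with BeautifulSoup is to take the last <a> in each <li> and read its next_sibling, which is the text node that follows the link. This is a minimal sketch, not part of the original answer; the helper name trailing_text is hypothetical, and it assumes the wanted text always comes directly after the last link.

```python
from bs4 import BeautifulSoup, NavigableString

html = """<li>(<a rel="nofollow" class="external text" href="http://www.icd9data.com/getICD9Code.ashx?
icd9=999.1">999.1</a>) <a href="/wiki/Air_embolism" title="Air embolism">Air embolism</a> as
a complication of medical care not elsewhere classified</li>"""

soup = BeautifulSoup(html, "html.parser")


def trailing_text(li):
    """Hypothetical helper: text node right after the last <a> in `li`, or None."""
    links = li.find_all("a")
    if not links:
        return None
    sibling = links[-1].next_sibling
    if isinstance(sibling, NavigableString):
        # Collapse the line break and extra whitespace into single spaces.
        return " ".join(sibling.split())
    return None


results = [trailing_text(li) for li in soup.find_all("li")]
for result in results:
    print(result)
```

If a page puts other markup after the last link, the sibling would be a tag rather than a text node, and this sketch returns None for that item.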
Using lxml, you could use

import lxml.html as LH

text = """<li>(<a rel="nofollow" class="external text" href="http://www.icd9data.com/getICD9Code.ashx?
icd9=999.1">999.1</a>) <a href="/wiki/Air_embolism" title="Air embolism">Air embolism</a> as
a complication of medical care not elsewhere classified</li>"""
doc = LH.fromstring(text)
# Select the second <a> in each <li>; .tail is the text right after its closing tag
for tag in doc.xpath('//li/a[2]'):
    print(tag.tail)

to obtain

as
a complication of medical care not elsewhere classified

https://stackoverflow.com/questions/23505303
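
The tail attribute used in the lxml answer follows the ElementTree data model, where each element carries its own text plus the tail text that follows its closing tag. The sketch below illustrates the same idea with the standard library's xml.etree.ElementTree instead of lxml. It only works here because this particular fragment happens to be well-formed XML; real-world HTML often is not, so treat it as an illustration of text versus tail rather than a general HTML parser.

```python
import xml.etree.ElementTree as ET

fragment = """<li>(<a rel="nofollow" class="external text" href="http://www.icd9data.com/getICD9Code.ashx?
icd9=999.1">999.1</a>) <a href="/wiki/Air_embolism" title="Air embolism">Air embolism</a> as
a complication of medical care not elsewhere classified</li>"""

li = ET.fromstring(fragment)   # the root element is the <li> itself
links = li.findall("a")        # direct <a> children, in document order

second_link = links[1]
tail = second_link.tail        # text after </a>, up to the next tag
print(tail)

# itertext() walks every text and tail in order, similar to get_text()
all_text = "".join(li.itertext())
print(all_text)
```

Here li.text holds the opening parenthesis, each link's text holds its label, and each link's tail holds whatever follows it, which is why the second link's tail is the string the question asks for.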