I have an HTML file from this Statsoft site, in particular this part:
<p>
<a name="Z Distribution (Standard Normal)">
<font color="#000080" size="4">
Z Distribution (Standard Normal).
</font>
</a>
The Z distribution (or standard normal distribution) function is determined by the following formula:
</p>
I want the text "The Z distribution (or standard normal distribution) function is determined by the following formula:", so I wrote the following code:
include('simple_html_dom2.php');
$url = 'http://www.statsoft.com//textbook/statistics-glossary/z/?button=0#Z Distribution (Standard Normal)';
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$curl_scraped_page = curl_exec($ch);
$html = new simple_html_dom();
$html->load($curl_scraped_page);
foreach ($html->find('/p/a [size="4"]') as $e) {
    echo $e->innertext . '<br>';
}
It gives me: Z Distribution (Standard Normal).
I also tried
foreach ( $html->find('/p/a [size="4"]/font') as $e ) {
but that gave me a blank page.
What am I missing? Thanks.
Posted on 2013-06-25 11:03:33
Find the paragraph, then remove the link's text from the paragraph's text:
include('simple_html_dom2.php');
$url = 'http://www.statsoft.com//textbook/statistics-glossary/z/?button=0#Z Distribution (Standard Normal)';
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$curl_scraped_page = curl_exec($ch);
$html = new simple_html_dom();
$html->load($curl_scraped_page);
foreach ( $html->find('/p/a [size="4"]') as $font ) {
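    // The matched element is the font tag; its parent is the a tag.
    $link = $font->parent();
    // The a tag's parent is the enclosing p tag.
    $paragraph = $link->parent();
    // Strip the link's text from the paragraph's text to leave the sentence.
    $text = str_replace($link->plaintext, '', $paragraph->plaintext);
    echo $text;
}
Original answer:
Your question is related to this one: Getting the text between two spans with "Simple HTML DOM"
Your selector matches the font tag; its parent (the a tag) is a sibling of the text you want:
$text = $html->find('/p/a', 0)->next_sibling();
https://stackoverflow.com/questions/17288344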
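A minimal sketch of a different route, assuming PHP's bundled DOM extension (DOMDocument and DOMXPath) is acceptable in place of Simple HTML DOM: select the text nodes that sit directly under the paragraph, which skips the text inside the a and font tags. The $fragment string is an inline copy of the HTML from the question, used so the example runs without fetching the page; the variable names are illustrative only.

```php
<?php
// Inline copy of the fragment from the question (assumption: the live page
// contains the same markup), so no network access is needed.
$fragment = <<<'HTML'
<p>
<a name="Z Distribution (Standard Normal)">
<font color="#000080" size="4">
Z Distribution (Standard Normal).
</font>
</a>
The Z distribution (or standard normal distribution) function is determined by the following formula:
</p>
HTML;

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($fragment);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$texts = [];
// Text nodes that are direct children of a <p> containing a/font[size=4].
foreach ($xpath->query('//p[a/font[@size="4"]]/text()') as $node) {
    $value = trim($node->nodeValue);
    if ($value !== '') {          // skip whitespace-only nodes
        $texts[] = $value;
    }
}

echo implode("\n", $texts), "\n";
```

To run this against the real page, the string returned by curl_exec() could be passed to loadHTML() instead of $fragment.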