Extracting information from XML with PHP

Stack Overflow user
Asked 2015-06-23 23:05:26
Answers 2, Views 196, Followers 0, Votes 0

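I am trying to write a PHP script that extracts information from an XML file and puts it into a database. I have set up a L.A.M.P. stack on CentOS 6.6. The script below recognizes the total number of entries in the XML file, but it does not extract any information because each entry has child tags. Is there something I can add to my code so that all the child tags inside each entry, along with their text, are written to the database?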

#!/usr/bin/php
<?php
// sample XML data
$data = <<<XML
<entry type="CVE" name="CVE-2003-0002" seq="2003-0002"
published="2003-01-17" modified="2015-04-14" severity="Medium"
CVSS_version="2.0 incomplete approximation" CVSS_score="5.0"
CVSS_base_score="5.0" CVSS_impact_subscore="2.9"
CVSS_exploit_subscore="10.0" CVSS_vector="(AV:N/AC:L/Au:N/C:P/I:N/A:N)" />
<desc>
<descript source="cve">Multiple ethernet Network Interface 'Card' (NIC)   device drivers do not pad frames with null bytes, which allows remote attackers to obtain information from previous packets or kernel memory by using malformed packets, as demonstrated by Etherleak.
</descript>
</desc>
<loss_types>
<conf/>
</loss_types>
<vuln_types>
<design/>
</vuln_types>
<range>
<network/>
</range>
<refs>
<ref source="CERT-VN" url="http://www.kb.cert.org/vuls/id/412115" adv="1">VU#412115</ref>
<ref source="BUGTRAQ" url="http://www.securityfocus.com/archive/1/archive/1/535181/100/0/threaded">20150402 NEW : VMSA-2015-0003 VMware product updates address critical information disclosure issue in JRE</ref>
<ref source="REDHAT" url="http://www.redhat.com/support/errata/RHSA-2003-025.html">RHSA-2003:025</ref>
<ref source="CONFIRM" url="http://www.oracle.com/technetwork/topics/security/cpujan2015-1972971.html">http://www.oracle.com/technetwork/topics/security/cpujan2015-1972971.html</ref>
<ref source="MISC" url="http://www.atstake.com/research/advisories/2003/atstake_etherleak_report.pdf">http://www.atstake.com/research/advisories/2003/atstake_etherleak_report.pdf</ref>
<ref source="ATSTAKE" url="http://www.atstake.com/research/advisories/2003/a010603-1.txt" adv="1">A010603-1</ref><ref source="FULLDISC" url="http://seclists.org/fulldisclosure/2015/Apr/5">20150402 NEW : VMSA-2015-0003 VMware product updates address critical information disclosure issue in JRE</ref>
<ref source="MISC" url="http://packetstormsecurity.com/files/131271/VMware-Security-Advisory-2015-0003.html">http://packetstormsecurity.com/files/131271/VMware-Security-Advisory-2015-0003.html</ref><ref source="BUGTRAQ" url="http://marc.theaimsgroup.com/?l=bugtraq&m=104222046632243&w=2" adv="1">20030110 More information regarding Etherleak</ref>
<ref source="VULNWATCH" url="http://archives.neohapsis.com/archives/vulnwatch/2003-q1/0016.html">20030110 More information regarding Etherleak</ref>
<ref source="BUGTRAQ" url="http://www.securityfocus.com/archive/1/archive/1/307564/30/26270/threaded">20030117 Re: More information regarding Etherleak</ref>
<ref source="BUGTRAQ" url="http://www.securityfocus.com/archive/1/archive/1/305335/30/26420/threaded">20030106 Etherleak: Ethernet frame padding information leakage (A010603-1)</ref>
<ref source="REDHAT" url="http://www.redhat.com/support/errata/RHSA-2003-088.html">RHSA-2003:088</ref><ref source="OSVDB" url="http://www.osvdb.org/9962">9962</ref>
<ref source="OVAL" url="http://oval.mitre.org/repository/data/getDef?id=oval:org.mitre.oval:def:2665" sig="1">oval:org.mitre.oval:def:2665</ref>
</refs>
<vuln_soft>
<prod name="freebsd" vendor="freebsd">
<vers num="4.2"/>
<vers num="4.3"/>
<vers num="4.4"/>
<vers num="4.5"/>
<vers num="4.6"/>
<vers num="4.7"/>
</prod>
<prod name="linux_kernel" vendor="linux">
<vers num="2.4.1"/>
<vers num="2.4.10"/>
<vers num="2.4.11"/>
<vers num="2.4.12"/>
<vers num="2.4.13"/>
<vers num="2.4.14"/>
<vers num="2.4.15"/>
<vers num="2.4.16"/>
<vers num="2.4.17"/>
<vers num="2.4.18"/>
<vers num="2.4.19"/>
<vers num="2.4.2"/>
<vers num="2.4.20"/>
<vers num="2.4.3"/>
<vers num="2.4.4"/>
<vers num="2.4.5"/>
<vers num="2.4.6"/>
<vers num="2.4.7"/>
<vers num="2.4.8"/>
<vers num="2.4.9"/>
</prod>
<prod name="windows_2000" vendor="microsoft">
<vers num="" edition=":advanced_server"/> 
<vers num="" edition=":server"/>
<vers num="" edition=":professional"/>
<vers num="" edition=":datacenter_server"/>
<vers num="" edition="sp1:datacenter_server"/>
<vers num="" edition="sp1:advanced_server"/>
<vers num="" edition="sp1:professional"/>
<vers num="" edition="sp1:server"/>
<vers num="" edition="sp2:datacenter_server"/>
<vers num="" edition="sp2:advanced_server"/>
<vers num="" edition="sp2:professional"/>
<vers num="" edition="sp2:server"/>
</prod>
<prod name="windows_2000_terminal_services" vendor="microsoft">
<vers num="" edition="sp1"/>
<vers num="" edition="sp2"/>
</prod>
<prod name="netbsd" vendor="netbsd">
<vers num="1.5"/>
<vers num="1.5.1"/>
<vers num="1.5.2"/>
<vers num="1.5.3"/>
<vers num="1.6"/>
</prod>
</vuln_soft>
</entry>
XML;

// gather XML data

// database connection settings
$host = 'localhost';
$database = 'cve';
$user = 'admin';
$pass = 'admin';
$table = 'vulnerabilities';

try {
// connect to database
$dbh = new PDO('mysql:host=' . $host . ';dbname=' . $database, $user, $pass);

// prepare xml and iterator
$xml = new SimpleXMLIterator($data);
$itr = new RecursiveIteratorIterator($xml);
// loop through XML data
foreach ($itr as $key => $value) {

    // prepare an insert statement
    $statement = $dbh->prepare("INSERT INTO $table (identifier,seq,published,modified,severity,cvss_verison,cvss_score,cvss_base_score,cvss_impact_subscore,cvss_exploit_subscore,cvs_vector,information,loss_types,vuln_types,impact_area,refs,vuln_soft) VALUES (':name',':seq',':published',':modified',':severity',':CVSS_verison',':CVSS_score',':CVSS_base_score',':CVSS_impact_subscore',':CVSS_exploit_subscore',':CVSS_vector',':desc',':loss_types',':vuln_types',':range',':ref',':vuln_soft')");

    // bind your XML data to named parameters for the insert statement
    $statement->bindParam(':name', $value->attributes()->identifier);
    $statement->bindParam(':seq', $value->attributes()->seq);
    $statement->bindParam(':published', $value->attributes()->published);
    $statement->bindParam(':modified', $value->attributes()->modified);
    $statement->bindParam(':severity', $value->attributes()->severity);
    $statement->bindParam(':CVSS_version', $value->attributes()->cvss_verison);
    $statement->bindParam(':CVSS_score', $value->attributes()->cvss_score);
    $statement->bindParam(':CVSS_base_score', $value->attributes()->cvss_base_score);
    $statement->bindParam(':CVSS_impact_subscore', $value->attributes()->cvss_impact_subscore);
    $statement->bindParam(':CVSS_exploit_subscore', $value->attributes()->cvss_exploit_subscore);
    $statement->bindParam(':CVS_vector', $value->attributes()->cvs_vector);
    $statement->bindParam(':desc',$value->attributes()->information);
    $statement->bindParam(':loss_types',$value->attributes()->loss_types);
    $statement->bindParam(':vuln_types',$value->attributes()->vuln_types);
    $statement->bindParam(':range',$value->attributes()->impact_area);
    $statement->bindParam(':refs',$value->attributes()->refs);
    $statement->bindParam(':vuln_soft',$value->attributes()->vuln_soft);


    // insert XML data into database table
    $statement->execute();
}

$dbh = null;
} catch (PDOException $e) {
print "There was an error: " . $e->getMessage() . "\n";
die();
}

?>

I need to collect all the data from the entry tag and put it into the database. Example of XML with information in the tag's attributes:

<entry type="CVE" name="CVE-2003-0001" seq="2003-0001"
 published="2003-01-17" modified="2015-04-14" severity="Medium"
 CVSS_version="2.0 incomplete approximation" CVSS_score="5.0"
 CVSS_base_score="5.0" CVSS_impact_subscore="2.9"
 CVSS_exploit_subscore="10.0" CVSS_vector="(AV:N/AC:L/Au:N/C:P/I:N/A:N)">

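Then I need to collect all the data inside the entry tag by recording each child tag's data and text together with the entry tag. Example of XML with child tags: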

<refs>
<ref source="reference information">Reference information</ref></refs>
<ref source="reference information">Reference information</ref></refs>
<ref source="reference information">Reference information</ref></refs>
<ref source="reference information">Reference information</ref></refs>
</refs>
</entry>

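As it stands, the current script returns the following warnings and one fatal error at line 112 of /home/ant244/Documents/extract.php. The first warning reads "PHP Warning: SimpleXMLElement::__construct(): Entity: line 6: parser error : Extra content at the end of the document", and it is followed by: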

PHP Warning:  SimpleXMLElement::__construct(): <desc> in /home/ant244/Documents/extract.php on line 112

PHP Warning:  SimpleXMLElement::__construct(): ^ in /home/ant244/Documents/extract.php on line 112

PHP Fatal error:  Uncaught exception 'Exception' with message 'String could not be parsed as XML' in /home/ant244/Documents/extract.php:112

Stack trace:
#0 /home/ant244/Documents/extract.php(112):  SimpleXMLElement->__construct('<entry type="CV...')

#1 {main}
  thrown in /home/ant244/Documents/extract.php on line 112

2 Answers

Stack Overflow user

Answered 2015-06-24 02:41:46

Background

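If I understand your question correctly, one approach is to loop through the XML data and use the pieces of data you find as named parameters in a prepared insert statement. Prepared statements help keep database input clean (for more information, see the section titled "How can I make database queries safe from SQL injection?").

That approach might look like the sample code below. The code shows how to use prepared statements for the database work, and how to access the XML data inside a foreach loop using the $value->attributes()->name format (where name matches a single attribute of the XML entry).

Code Example 1 (prepared statements)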

<?php

// sample XML data
$data = <<<XML
<root>
<entry type="CVE" name="CVE-2003-0001" seq="2003-0001"
published="2003-01-17" modified="2015-04-14" severity="Medium"
CVSS_version="2.0 incomplete approximation" CVSS_score="5.0"
CVSS_base_score="5.0" CVSS_impact_subscore="2.9"
CVSS_exploit_subscore="10.0" CVSS_vector="(AV:N/AC:L/Au:N/C:P/I:N/A:N)" />
<entry type="CVE" name="CVE-2003-0002" seq="2003-0002"
published="2003-01-17" modified="2015-04-14" severity="Medium"
CVSS_version="2.0 incomplete approximation" CVSS_score="5.0"
CVSS_base_score="5.0" CVSS_impact_subscore="2.9"
CVSS_exploit_subscore="10.0" CVSS_vector="(AV:N/AC:L/Au:N/C:P/I:N/A:N)" />
</root>
XML;

// gather XML data
$xml = simplexml_load_string($data);

// database connection settings
$host = 'localhost';
$database = 'your_database';
$user = 'your_username';
$pass = 'your_password';
$table = 'your_database_table';

try {
    // connect to database and make PDO throw exceptions on query errors
    $dbh = new PDO('mysql:host=' . $host . ';dbname=' . $database, $user, $pass);
    $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    // prepare the insert statement once, outside the loop
    $statement = $dbh->prepare("INSERT INTO $table (name, seq) VALUES (:name, :seq)");

    // loop through XML data
    foreach ($xml->entry as $key => $value) {

        // bind your XML data to named parameters for the insert statement
        // (bindValue with a string cast, since bindParam expects a variable reference)
        $statement->bindValue(':name', (string) $value->attributes()->name);
        $statement->bindValue(':seq', (string) $value->attributes()->seq);

        // insert XML data into database table
        $statement->execute();
    }

    $dbh = null;
} catch (PDOException $e) {
    print "There was an error: " . $e->getMessage();
    die();
}

?>

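However, when working with nested tags, using an iterator may be a good idea. For your sample XML (simplified in the code example below), using an iterator might look like this:

Code Example 2 (iterator)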

<?php

// sample XML data
$data = <<<XML
<root>
<entry>
<refs>
<ref source="reference_information_1">Reference information 1</ref>
<ref source="reference_information_2">Reference information 2</ref>
</refs>
</entry>
</root>
XML;

// prepare XML data and iterator
$xml = new SimpleXMLIterator($data);
$itr = new RecursiveIteratorIterator($xml);

// iterate over each relevant tag
foreach ($itr as $key => $value) {
  echo $key . ": " . $value . "\n";
  echo "source attribute: " . $value->attributes()->source . "\n";
}

?>

This code produces the following output:

ref: Reference information 1
source attribute: reference_information_1
ref: Reference information 2
source attribute: reference_information_2
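
By default RecursiveIteratorIterator visits leaf nodes only, which is why only the ref tags appear above. A minimal sketch, assuming you also want the branch tags (entry, refs), is to pass the RecursiveIteratorIterator::SELF_FIRST mode and use getDepth() for indentation. The $lines array is only an illustrative name:

```php
<?php

// same simplified XML as in Code Example 2
$data = <<<XML
<root>
<entry>
<refs>
<ref source="reference_information_1">Reference information 1</ref>
<ref source="reference_information_2">Reference information 2</ref>
</refs>
</entry>
</root>
XML;

$xml = new SimpleXMLIterator($data);

// SELF_FIRST yields each branch node before descending into its children
$itr = new RecursiveIteratorIterator($xml, RecursiveIteratorIterator::SELF_FIRST);

$lines = [];
foreach ($itr as $key => $value) {
    // indent by nesting depth; branch nodes carry only whitespace as direct text
    $indent = str_repeat('  ', $itr->getDepth());
    $text = trim((string) $value);
    $lines[] = $indent . $key . ($text !== '' ? ': ' . $text : '');
}

echo implode("\n", $lines), "\n";
```

The depth value could also be used to decide which database column or table a given tag belongs to.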

Code Example 3 (prepared statements + iterator)

<?php

// sample XML data
$data = <<<XML
<root>
<entry type="CVE" name="CVE-2003-0001" seq="2003-0001"
published="2003-01-17" modified="2015-04-14" severity="Medium"
CVSS_version="2.0 incomplete approximation" CVSS_score="5.0"
CVSS_base_score="5.0" CVSS_impact_subscore="2.9"
CVSS_exploit_subscore="10.0" CVSS_vector="(AV:N/AC:L/Au:N/C:P/I:N/A:N)" />
<entry type="CVE" name="CVE-2003-0002" seq="2003-0002"
published="2003-01-17" modified="2015-04-14" severity="Medium"
CVSS_version="2.0 incomplete approximation" CVSS_score="5.0"
CVSS_base_score="5.0" CVSS_impact_subscore="2.9"
CVSS_exploit_subscore="10.0" CVSS_vector="(AV:N/AC:L/Au:N/C:P/I:N/A:N)" />
</root>
XML;

// database connection settings
$host = 'localhost';
$database = 'your_database';
$user = 'your_username';
$pass = 'your_password';
$table = 'your_database_table';

try {
    // connect to database and make PDO throw exceptions on query errors
    $dbh = new PDO('mysql:host=' . $host . ';dbname=' . $database, $user, $pass);
    $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    // prepare XML data and iterator
    $xml = new SimpleXMLIterator($data);
    $itr = new RecursiveIteratorIterator($xml);

    // prepare the insert statement once, outside the loop
    $statement = $dbh->prepare("INSERT INTO $table (name, seq) VALUES (:name, :seq)");

    // iterate over each relevant tag
    foreach ($itr as $key => $value) {

        // bind your XML data to named parameters for the insert statement
        // (bindValue with a string cast, since bindParam expects a variable reference)
        $statement->bindValue(':name', (string) $value->attributes()->name);
        $statement->bindValue(':seq', (string) $value->attributes()->seq);

        // insert XML data into database table
        $statement->execute();
    }

    $dbh = null;
} catch (PDOException $e) {
    print "There was an error: " . $e->getMessage();
    die();
}

?>
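
The fatal error in the question ("String could not be parsed as XML") is raised while the XML string is being parsed, before any database work starts. In the question's sample the entry tag is self-closed with "/>" and then followed by more tags, so the parser sees extra content after the first element. A minimal sketch, assuming libxml's internal error handling, of surfacing such problems as a readable message; parseXmlOrFail is a hypothetical helper name, not part of PHP:

```php
<?php

// Hypothetical helper: returns the parsed document or throws with libxml's messages.
function parseXmlOrFail(string $data): SimpleXMLElement
{
    // collect parser errors instead of emitting PHP warnings
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($data);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($xml === false) {
        throw new RuntimeException("Invalid XML:\n" . implode("\n", $messages));
    }
    return $xml;
}

// A self-closed <entry ... /> followed by more tags is not a well-formed document.
$broken = '<entry name="CVE-2003-0002" /><desc>text</desc>';

try {
    parseXmlOrFail($broken);
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n";
}
```

Closing the entry tag with "</entry>" after its children, or wrapping several entries in a single root tag as the examples above do, should let the document parse.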

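Conclusion

Prepared statements and XML iterators can both provide safe, convenient ways to work with XML in database-related applications. In your program, it may help to combine the ideas of the two code examples (by using the iterator from the second code example in the "// loop through XML data" part of the first code example), as shown in the third code example.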
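
To also capture the child tags the question asks about, one option is to build one row per entry that mixes attributes with child-tag text. This is a rough sketch under assumptions: the XML is a shortened version of the question's sample, the placeholder names (:name, :seq, :severity, :desc, :refs) are illustrative, and joining all ref tags into a single string is just one possible storage choice.

```php
<?php

// shortened sample, with <entry> properly closed after its children
$data = <<<XML
<root>
<entry name="CVE-2003-0002" seq="2003-0002" severity="Medium">
<desc>
<descript source="cve">Example description text.</descript>
</desc>
<refs>
<ref source="CERT-VN" url="http://www.kb.cert.org/vuls/id/412115">VU#412115</ref>
<ref source="OSVDB" url="http://www.osvdb.org/9962">9962</ref>
</refs>
</entry>
</root>
XML;

$xml = simplexml_load_string($data);

$rows = [];
foreach ($xml->entry as $entry) {
    // collect every <ref> as "source: text"
    $refs = [];
    foreach ($entry->refs->ref as $ref) {
        $refs[] = (string) $ref['source'] . ': ' . trim((string) $ref);
    }

    // attributes are read with array syntax, child tags with property syntax
    $rows[] = [
        ':name'     => (string) $entry['name'],
        ':seq'      => (string) $entry['seq'],
        ':severity' => (string) $entry['severity'],
        ':desc'     => trim((string) $entry->desc->descript),
        ':refs'     => implode('; ', $refs),
    ];
}

print_r($rows);
```

Each row could then be passed to $statement->execute($row) on a statement prepared with the same named placeholders.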

Votes 0

Stack Overflow user

Answered 2021-07-19 01:40:01

if( get_class( $itrXml ) == 'SimpleXMLIterator' ) { # when the thing is a SimpleXMLIterator
  print $itrXml->__toString( );                     # output its string value
}

This only works when $itrXml is a leaf node with zero child nodes.

If $itrXml is a branch node that contains child nodes, this approach fails with "Error: Call to a member function __toString() on array".
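
One way to guard against the branch-node case is to check for child elements before reading the text. A minimal sketch, where leafText is a hypothetical helper and returning null for branch nodes is an assumption about what the caller wants:

```php
<?php

// Hypothetical helper: text of a leaf node, or null when the node has child elements.
function leafText(SimpleXMLElement $node): ?string
{
    // children() may return null for attribute nodes, which have no child elements
    $children = $node->children();
    $isLeaf = $children === null || count($children) === 0;

    return $isLeaf ? trim((string) $node) : null;
}

$xml = new SimpleXMLIterator('<refs><ref source="a">Reference information</ref></refs>');

var_dump(leafText($xml));          // branch node: <refs> contains a <ref>
var_dump(leafText($xml->ref[0]));  // leaf node: the first <ref>
```

The type hint is SimpleXMLElement because SimpleXMLIterator extends it, so the helper accepts either class.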

Votes 0

Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/31006514
