php xmlpipe2
Source: Internet | Editor: 程序博客网 | Time: 2024/06/01 17:13
Sphinx xmlpipe2 in PHP: Part II

In the last article we created a PHP class that outputs XML as input for Sphinx's indexer. However, it was very inefficient because it had to hold every document in memory. Here is an updated class that extends XMLWriter, a built-in PHP class that is sparsely documented but works well for creating memory-efficient streams of XML data. Rather than keeping each document in memory, XMLWriter lets us flush each document's XML elements to standard output immediately.

The class is listed first, followed by a usage example. In the usage example, the first thing we do is populate the fields and attributes. Once that is done, we call beginOutput, which writes the head of the XML document: the opening sphinx:docset element and the sphinx:schema block. After each document is added with addDocument, its XML markup is printed immediately and the memory buffer is cleared. Finally we call endOutput, which closes the sphinx:docset element.

I have used this class in production to index millions of records that take up dozens of gigabytes. Keep in mind that if you are working with that much data, you will probably need to batch your queries so you are not loading all the records at once!
<?php
/*
 * SphinxXMLFeed - efficiently generate XML for Sphinx's xmlpipe2 data adapter
 * (c) 2009 Jetpack LLC http://jetpackweb.com
 */
class SphinxXMLFeed extends XMLWriter
{
    private $fields = array();
    private $attributes = array();

    public function __construct($options = array())
    {
        $defaults = array(
            'indent' => false,
        );
        $options = array_merge($defaults, $options);

        // Buffer the XML in memory; the buffer is flushed after every document
        $this->openMemory();

        if ($options['indent']) {
            $this->setIndent(true);
        }
    }

    // Names of the full-text fields declared in the schema
    public function setFields($fields)
    {
        $this->fields = $fields;
    }

    // Each attribute is an array of XML attributes for one sphinx:attr element
    public function setAttributes($attributes)
    {
        $this->attributes = $attributes;
    }

    public function addDocument($doc)
    {
        $this->startElement('sphinx:document');
        $this->writeAttribute('id', (string) $doc['id']);

        foreach ($doc as $key => $value) {
            // Skip the id key since that is an element attribute
            if ($key === 'id') {
                continue;
            }

            $this->startElement($key);
            $this->text((string) $value);
            $this->endElement();
        }

        // end sphinx:document
        $this->endElement();

        // Print this document and clear the memory buffer
        print $this->outputMemory();
    }

    public function beginOutput()
    {
        $this->startDocument('1.0', 'UTF-8');
        $this->startElement('sphinx:docset');
        $this->startElement('sphinx:schema');

        // add fields to the schema
        foreach ($this->fields as $field) {
            $this->startElement('sphinx:field');
            $this->writeAttribute('name', $field);
            $this->endElement();
        }

        // add attributes to the schema
        foreach ($this->attributes as $attribute) {
            $this->startElement('sphinx:attr');
            foreach ($attribute as $key => $value) {
                $this->writeAttribute($key, $value);
            }
            $this->endElement();
        }

        // end sphinx:schema
        $this->endElement();
        print $this->outputMemory();
    }

    public function endOutput()
    {
        // end sphinx:docset
        $this->endElement();
        print $this->outputMemory();
    }
}
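The flush-per-document behaviour that the class relies on can be seen in isolation. This is a minimal sketch using plain XMLWriter; the element names `doc` and `title` are made up for illustration and are not part of the xmlpipe2 schema.

```php
<?php
// Minimal sketch: outputMemory() returns the buffered XML and, by default,
// empties the buffer, so memory use does not grow with the document count.
$writer = new XMLWriter();
$writer->openMemory();

// Hypothetical element names, for illustration only
$writer->startElement('doc');
$writer->writeAttribute('id', '1');
$writer->writeElement('title', 'Hello');
$writer->endElement();

// First call: returns everything written so far and clears the buffer
$first = $writer->outputMemory();

// Second call: nothing has been written since the flush
$second = $writer->outputMemory();

var_dump(strlen($first) > 0); // the first flush carried the element
var_dump($second === '');     // the buffer was emptied by the first flush
```

Because each call to addDocument ends with this kind of flush, the writer should only ever hold roughly one document's markup at a time.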
// Usage example
$doc = new SphinxXMLFeed();

$doc->setFields(array(
    'title',
    'teaser',
    'content',
));

$doc->setAttributes(array(
    array('name' => 'blog_id', 'type' => 'int', 'bits' => '16', 'default' => '0'),
));

$doc->beginOutput();

foreach (range(1, 1000) as $id) {
    $doc->addDocument(array(
        'id' => $id,
        'blog_id' => rand(1, 10),
        'title' => "Article Part {$id}",
        'teaser' => "Article {$id} teaser",
        'content' => "Article {$id} content",
    ));
}

$doc->endOutput();
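The advice about batching queries could be sketched as follows. This is an illustration under stated assumptions, not the author's production code: `iterateInBatches`, `$fetchBatch`, and the in-memory `$fakeTable` are hypothetical names, and the array stands in for a real database query that accepts an offset and a limit.

```php
<?php
/**
 * Hypothetical helper: yields rows one at a time while fetching them
 * in fixed-size batches. $fetchBatch($offset, $limit) is assumed to
 * return an array of rows, e.g. from a query using LIMIT and OFFSET.
 */
function iterateInBatches(callable $fetchBatch, int $batchSize): Generator
{
    $offset = 0;
    do {
        $rows = $fetchBatch($offset, $batchSize);
        foreach ($rows as $row) {
            yield $row;
        }
        $offset += count($rows);
        // A short (or empty) batch means the source is exhausted
    } while (count($rows) === $batchSize);
}

// Stand-in data source: 25 fake rows instead of a database table
$fakeTable = array();
for ($id = 1; $id <= 25; $id++) {
    $fakeTable[] = array('id' => $id, 'title' => "Article {$id}");
}

$fetchBatch = function (int $offset, int $limit) use ($fakeTable): array {
    return array_slice($fakeTable, $offset, $limit);
};

$seen = 0;
foreach (iterateInBatches($fetchBatch, 10) as $row) {
    // In the real feed this is where $doc->addDocument($row) would go
    $seen++;
}
```

Only one batch of rows is held in memory at a time, which pairs naturally with the per-document flush in addDocument. For very large tables, paging by the last seen id instead of a growing offset is usually cheaper for the database.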