The following style files are required to enable indexing of XML files. Default style files are installed in the cfusion\verity\common\style directory (Windows) and the opt/coldfusion/verity/common/style directory (Linux and UNIX).
This section discusses style file configuration used to support XML document filtering.
To index XML documents, the style.uni file must include the following lines:
type: "text/xml"
  /format-filter = "flt_xml"
  /charset = guess
  /def-charset = 8859
By default, the XML filter indexes regions of the document delimited by XML tags as zones, with the zones given the same name as the XML tag. META tags are automatically indexed as fields unless they are in a suppressed region.
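As a rough illustration of the default behavior, consider the small document below. It is a hypothetical sample, not taken from the product files; the element names are placeholders chosen to match the tag names used elsewhere in this section. With no style.xml in place, each tagged region (report, section_1, region_1, draft_note, column_1) would be expected to be indexed as a zone named after its tag.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Hypothetical sample document; element names are illustrative only. -->
<report>
  <section_1>Quarterly summary text.</section_1>
  <region_1>
    <draft_note>Internal remarks.</draft_note>
  </region_1>
  <column_1>Finance</column_1>
</report>
```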
To modify the default behavior, create a style file named style.xml. In this file, you can specify field and zone indexing for regions of the document delimited by XML tags, and you can skip regions of the document delimited by XML tags. The following listing shows a sample style.xml file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<?note: this is a sample comment line?>
<style.xml version="2.6.0">
<?note:
? this following line dictates all xmltags be ignored
? <ignore xmltag="*" />
?>
<?note:
? "ignore" will skip indexing xmltag, yet index contents
? between the beginning and end of this pair of xmltags
?>
<?next 2 sample lines commented out:
<ignore xmltag="section_1" />
<ignore xmltag="section_2" />
?>
<?note:
? "preserve" indexes xmltag as zone with the presence of
? <ignore xmltag="*" />
?>
<?next 1 sample line commented out:
<preserve xmltag="section_3" />
?>
<?note:
? "suppress" will suppress every xmltag embedded within
?>
<?next 2 sample lines commented out:
<suppress xmltag="region_1" />
<suppress xmltag="region_3" />
?>
<?note:
? "field" will further index content between the beginning
? and end of this pair of xmltags as field values
?>
<?next 1 sample line commented out:
<field xmltag="column_1" />
?>
<?note:
? if attribute "fieldname" is present, above content will
? be indexed into VDK field under the value of fieldname
? instead of the field under the name of xmltag
?>
<?next 1 sample line commented out:
<field xmltag="column_2" fieldname="vdk_field_2" />
?>
<?note:
? if attribute "index" is set to "override", above content
? will be indexed into VDK field overriding values read in
? from bulk insert file, if any
?>
<?next 1 sample line commented out:
<field xmltag="column_3" index="override" />
?>
<?note:
? fieldname & index attributes could both exist
?>
</style.xml>
Commands in the style.xml file take the following general form:
<command attribute="value"/>
Use these commands in the style.xml file to manage how Verity handles individual XML elements. Refer to the style.xml file listing for examples of these commands.
The following command ignores all XML tags in the document, indexing only the content:
<ignore xmltag = "*"/>
The following command skips indexing the specified xmltag but indexes the content between the start and end tags of the specified xmltag:
<ignore xmltag = "section_1"/>
The following command indexes the specified xmltag as a zone even when an <ignore xmltag = "*"/> command is also present:
<preserve xmltag = "section_1"/>
The following command suppresses the entire element identified by xmltag, including every tag embedded within it. The tag, its attributes, and its content are not indexed:
<suppress xmltag = "section_1"/>
The following command indexes the content between the start and end tags of the specified xmltag as a field, which is given the same name as xmltag:
<field xmltag = "column_1"/>
The following command indexes the content between the start and end tags of the specified xmltag as a field, which is given the name specified in the fieldname attribute:
<field xmltag = "column_2" fieldname = "vdk_field_2"/>
The following command indexes the content between the start and end tags of the specified xmltag as a field, overriding any existing value of the field:
<field xmltag = "column_2" index = "override"/>
Note: Both the fieldname and index attributes can be specified in the same field command.
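The commands described above can be combined in a single style.xml file. The following is a minimal sketch, not a shipped default: the tag names and the vdk_field_2 field name are placeholders reused from the examples in this section, and the version attribute is copied from the sample listing. Comments use the processing-instruction note style shown in that listing.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<?note: illustrative sketch; tag and field names are placeholders ?>
<style.xml version="2.6.0">
  <?note: do not index any tag as a zone, but still index the content ?>
  <ignore xmltag="*" />
  <?note: keep section_1 as a zone despite the wildcard ignore ?>
  <preserve xmltag="section_1" />
  <?note: skip region_1 entirely: tag, attributes, and content ?>
  <suppress xmltag="region_1" />
  <?note: index column_1 content as a field named column_1 ?>
  <field xmltag="column_1" />
  <?note: index column_2 content into vdk_field_2, overriding any existing value ?>
  <field xmltag="column_2" fieldname="vdk_field_2" index="override" />
</style.xml>
```

In a configuration like this, a custom field such as vdk_field_2 would also have to be defined in the style.ufl or style.sfl file before it can be populated.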
If administrators have defined custom fields to be populated in the style.xml file, the fields must also be defined in the style.ufl file or style.sfl file, using standard syntax.
To create a collection that contains only XML documents, administrators can modify the style.dft file to invoke the XML filter directly. In this case, the XML documents do not need a .xml extension.
The style.dft file must include the following lines:
$control: 1
dft: { field: DOC filter="flt_xml" }