Using the didump utility, you can view key components of the word index for each partition. The word list is a list of all words indexed by the Verity engine. The zone list is a list of all zones found by the engine. The zone attribute list is a list of the zone attributes found by the engine.
didump can be found in the ColdFusion bin directory: cfusion\bin (Windows) or opt/coldfusion/verity/<platform>/bin (UNIX), where <platform> is _ssol26, _hpux11, or _ilnx21. For example, on Windows:
c:\cfusion\bin\didump /common = c:\cfusion\verity\common -pattern llama c:\new\parts\00000001.did
You can view the contents of the word list for a partition by using the didump utility with the -words flag. The command-line syntax must include the -words flag and a path name to a partition file, like this:
didump -words /z/collbldg/html/parts/00000003.did
The display provides an alphabetical listing of the words in the word index, as shown below.
didump - Verity, Inc. Version 2.5.0 (_nti31, Jul 7 1999)
Text           Size  Doc  Word
A                10    3     4
a                34    5    24
abbreviations     4    1     1
about             4    1     1
acronym           5    1     2
acronyms          4    1     1
actual            4    1     1
administrator     3    1     1
advance           3    1     1
all               8    2     3
also              9    2     4
Always            4    1     1
always            9    2     3
ampersand         4    1     1
The columns in the display indicate:
Size
The number of bytes used by the Verity engine to store information about the word.
Doc
The number of unique documents in which the word appears.
Word
The total number of occurrences of the word in the partition.
To view the occurrences of a specific word or pattern, enter a command using the -pattern option, as in the following example:
didump -pattern acronym 00000003.did
The didump utility displays the number of occurrences of the word "acronym". To display the individual occurrences of a word, add the verbose (-verbose) option.
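If the word list needs further processing, the listing can be captured and parsed. The following is a minimal Python sketch, not part of the Verity tools. It assumes that the output is laid out as in the listing shown earlier in this section (a banner line, a header line, then whitespace-separated rows of one word and three integers) and that indexed words contain no whitespace. The names WordEntry and parse_word_list are hypothetical.

```python
from typing import List, NamedTuple


class WordEntry(NamedTuple):
    text: str   # the indexed word (Text column)
    size: int   # bytes used to store information about the word (Size)
    docs: int   # unique documents in which the word appears (Doc)
    count: int  # total occurrences in the partition (Word)


def parse_word_list(output: str) -> List[WordEntry]:
    """Parse the data rows of a captured `didump -words` listing.

    Assumption: each data row is a word followed by three integers.
    Banner, header, and blank lines do not match that shape and are skipped.
    """
    entries = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 4 or not all(f.isdigit() for f in fields[1:]):
            continue
        entries.append(WordEntry(fields[0], *map(int, fields[1:])))
    return entries


# Sample rows copied from the listing in this section.
SAMPLE = """\
didump - Verity, Inc. Version 2.5.0 (_nti31, Jul 7 1999)
Text           Size  Doc  Word
A                10    3     4
a                34    5    24
acronym           5    1     2
also              9    2     4
"""

entries = parse_word_list(SAMPLE)
for e in entries:
    print(f"{e.text}: {e.count} occurrence(s) in {e.docs} document(s)")
```

Because the parser keys on the shape of a row rather than on the banner text, it should tolerate differences in the version line between platforms, but it has not been checked against real didump output beyond the sample shown here.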
The zone list contains a list of the zones identified by the zone filter. The zones listed can be searched using the Verity IN operator in a query. To view the contents of the zone list, use didump with the -zones flag plus the path name to a partition, like this:
didump -zones /z/collbldg/html/parts/00000003.did
The partition above is for a collection containing the Verity Collection Building Guide in HTML format. The Verity universal filter invoked the HTML filter by default and indexed the documents using these zones.
didump - Verity, Inc. Version 2.5.0 (_solaris, Jul 07 1999)
ZoneName  Fmt     Size  Doc  Regions
A         Wct    10239   85     5016
ADDRESS   Array     34    1        1
BODY      Array    197   85       85
CAPTION   Wct      298   31       85
CODE      Wct     3868   66     1829
H1        Array     80   83       83
H2        Wct      646   53      212
H3        Wct      517   49      171
H4        Wct      128    8       47
HEAD      Array     70   85       85
HTML      Array    165   85       85
TITLE     Array     70   85       85
The columns in the display indicate:
Fmt
The internal data format used to store the zone information.
Size
The number of bytes used by the Verity engine to store information about the zone.
Doc
The number of unique documents in which the zone appears.
Regions
The total number of instances of the zone in the partition.
For complete information about how zones are defined, refer to Chapter 11.
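The zone listing can be processed in a similar way, for example to find zones that occur more than once per document. The following Python sketch is an illustration only. It assumes the five-column layout shown above (ZoneName, Fmt, Size, Doc, Regions) with whitespace-separated fields, and the names ZoneEntry and parse_zone_list are hypothetical.

```python
from typing import Dict, NamedTuple


class ZoneEntry(NamedTuple):
    fmt: str      # internal data format, "Wct" or "Array" in the listing above
    size: int     # bytes used to store information about the zone (Size)
    docs: int     # unique documents in which the zone appears (Doc)
    regions: int  # total instances of the zone in the partition (Regions)


def parse_zone_list(output: str) -> Dict[str, ZoneEntry]:
    """Parse the data rows of a captured `didump -zones` listing.

    Assumption: each data row is a zone name, a format name, and three
    integers. Banner, header, and blank lines are skipped.
    """
    zones = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 5 or not all(f.isdigit() for f in fields[2:]):
            continue
        zones[fields[0]] = ZoneEntry(fields[1], *map(int, fields[2:]))
    return zones


# Sample rows copied from the listing in this section.
SAMPLE = """\
didump - Verity, Inc. Version 2.5.0 (_solaris, Jul 07 1999)
ZoneName  Fmt     Size  Doc  Regions
CODE      Wct     3868   66     1829
H1        Array     80   83       83
H2        Wct      646   53      212
TITLE     Array     70   85       85
"""

zones = parse_zone_list(SAMPLE)

# Zones with more regions than documents appear several times per document.
repeated = sorted(name for name, z in zones.items() if z.regions > z.docs)
print(repeated)
```

Any zone name returned by the parser is one that the listing reports for the partition, so it is a candidate for use with the IN operator described above.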
The zone attribute list contains a list of the HTML attributes for the zones identified by the HTML zone filter. The zone attributes listed can be searched using the Verity IN operator together with the WHEN operator in a query. To view the contents of the zone attributes list, use didump with the -attributes flag plus the path name to a partition, like this:
didump -attributes /z/collbldg/html/parts/00000003.did
The partition above is for a collection containing the Verity Collection Building Guide in HTML format.
didump - Verity, Inc. Version 2.5.0 (_solaris, Jul 9 1999)
Text                     Size  Doc  Word
href 01_cbg.htm            10    2     4
href 01_cbg.htm#282870      3    1     1
href 01_cbg.htm#282872      6    2     2
href 01_cbg1.htm            8    2     3
href 01_cbg1.htm#286513     7    2     2
href 01_cbg1.htm#286520     3    1     1
...
The columns in the display indicate: