You can use the MIME type criteria options -mimeinclude, -indmimeinclude, -mimeexclude and -indmimeexclude to include or exclude MIME types.
When you specify MIME type criteria, keep in mind the following restrictions.
The asterisk (*) wildcard character does not operate as a regular expression for the value of the MIME type criteria. Instead, you can use it only to replace the entire MIME type or the entire MIME sub-type.
For example, the following value is a valid substitute for text/html:
text/*
The following value is NOT a valid substitute for text/html:
text/h*
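The wildcard restriction can be sketched in Python as follows. This only illustrates the rule described above and is not the Verity Spider's own implementation; the function names is_valid_mime_pattern and mime_matches are hypothetical.

```python
def is_valid_mime_pattern(pattern: str) -> bool:
    """Return True if '*' only ever replaces a whole type or sub-type."""
    parts = pattern.split("/")
    if len(parts) != 2:
        return False
    # Each half must be non-empty and either exactly '*' or free of '*'.
    return all(part and (part == "*" or "*" not in part) for part in parts)


def mime_matches(pattern: str, mime_type: str) -> bool:
    """Compare a valid pattern against a concrete MIME type such as 'text/html'."""
    if not is_valid_mime_pattern(pattern):
        raise ValueError(f"invalid MIME pattern: {pattern!r}")
    p_type, p_sub = pattern.lower().split("/")
    m_type, _, m_sub = mime_type.lower().partition("/")
    return p_type in ("*", m_type) and p_sub in ("*", m_sub)


print(is_valid_mime_pattern("text/*"))      # True
print(is_valid_mime_pattern("text/h*"))     # False
print(mime_matches("text/*", "text/html"))  # True
```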
When you specify a series of parameter values for a single instance of one of the MIME type criteria, and you use quotes, you must enclose each separate parameter value in single quotes.
-mimeinclude 'text/plain' 'application/*'
If you instead enclose the entire sequence of parameter values in one pair of quotes, as in
-mimeinclude 'text/plain application/*'
the Verity Spider will consider the entire expression as a single value.
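The difference between the two quoting styles can be seen by tokenizing both forms with Python's shlex module, which follows POSIX shell quoting rules. This is a sketch under the assumption that the options are passed through a POSIX-style shell; the variable names are illustrative only.

```python
import shlex

# Each parameter value quoted on its own.
separately_quoted = "-mimeinclude 'text/plain' 'application/*'"
# The whole sequence wrapped in one pair of quotes.
quoted_as_one = "-mimeinclude 'text/plain application/*'"

print(shlex.split(separately_quoted))
# ['-mimeinclude', 'text/plain', 'application/*']
print(shlex.split(quoted_as_one))
# ['-mimeinclude', 'text/plain application/*']
```

In the second case the option receives one value that contains a space, which matches no real MIME type.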
You can also use multiple instances of the MIME type criteria, each with a single parameter value, where quotes are necessary only if you use the wildcard character (*).
-mimeinclude text/plain
-mimeinclude 'application/*'
Setting MIME Types
When you index a Web site, the Verity Spider evaluates your MIME Type criteria against the "Content-Type" HTTP headers sent by the Web server hosting that Web site. That Web server passes along MIME Type information based on its own internal tables.
When you encounter MIME Types being dropped, make sure the Web server you are indexing has the necessary MIME Type information. See the documentation for your Web server for information about specifying MIME Types.
You can examine the indexing job's log files for indications that files are being skipped due to MIME Types. For example, a typical ASCII file you might want indexed is a log file (filename.log). Unless the Web server understands that files with .LOG extensions are ASCII text, of MIME Type text/plain, you will see in the indexing job log file that .LOG files are skipped because of MIME Type even if you use:
-mimeinclude 'text/*'
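The effect of the server's Content-Type header on an include criterion can be sketched as below. The helper names are hypothetical, and the second header value is an assumption: it stands for a server that has no mapping for .LOG files and falls back to a generic binary type, which depends on the server's configuration.

```python
def mime_from_content_type(header_value: str) -> str:
    """Strip parameters such as '; charset=utf-8' from a Content-Type value."""
    return header_value.split(";", 1)[0].strip().lower()


def is_included(header_value: str, include_patterns: list[str]) -> bool:
    """Return True if the header's MIME type matches any include pattern."""
    m_type, _, m_sub = mime_from_content_type(header_value).partition("/")
    for pattern in include_patterns:
        p_type, _, p_sub = pattern.lower().partition("/")
        if p_type in ("*", m_type) and p_sub in ("*", m_sub):
            return True
    return False


# The server knows .log files are text: the file passes 'text/*'.
print(is_included("text/plain; charset=utf-8", ["text/*"]))  # True
# The server reports a generic binary type: the file is skipped.
print(is_included("application/octet-stream", ["text/*"]))   # False
```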
When you index a file system, the Verity Spider reads filenames and evaluates your MIME Type criteria against an internal, compiled list of known MIME Types and associated file extensions. You cannot edit this list. However, you can use the -mimemap option to create a custom MIME Type mapping.
When you encounter MIME Types being dropped, check whether the Verity Spider recognizes that particular MIME Type. See the table "Known MIME types for file system indexing" for more details.
You can examine the indexing job's log files for indications that files are being skipped due to MIME types. For example, a typical ASCII file you might want indexed is a log file (filename.log). Since the Verity Spider does not understand that files with .LOG extensions are ASCII text, of MIME Type text/plain, you will see in the indexing job log file that .LOG files are skipped because of MIME Type even if you use:
-mimeinclude 'text/*'
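Extension-based lookup for file system indexing can be pictured with the sketch below. The KNOWN_TYPES table is a hypothetical three-entry stand-in for the Spider's compiled list, and extra_mappings is an illustrative parameter representing custom mappings; neither reflects the Spider's real data or file formats.

```python
import os

# Hypothetical stand-in for the compiled extension-to-MIME list.
KNOWN_TYPES = {".txt": "text/plain", ".htm": "text/html", ".html": "text/html"}


def mime_for_path(path, extra_mappings=None):
    """Return the MIME type for a filename, or None if the extension is unknown."""
    ext = os.path.splitext(path)[1].lower()
    # Custom mappings, if supplied, are consulted alongside the built-in table.
    table = {**KNOWN_TYPES, **(extra_mappings or {})}
    return table.get(ext)


# Unknown extension: no MIME type, so 'text/*' has nothing to match.
print(mime_for_path("filename.log"))                           # None
# With a custom mapping the same file resolves to text/plain.
print(mime_for_path("filename.log", {".log": "text/plain"}))   # text/plain
```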
Whenever you find MIME Types being dropped, or you know you will be indexing files whose extensions are not known to the Verity Spider by default, use the -mimemap option to point to a file which contains your own custom mappings for file extensions and MIME Types.
You can also use the wildcard expression '*/*' for your MIME Type criteria, which matches every MIME type and sub-type.
-mimeinclude '*/*'
Remember, on either platform you need to include single quotes for values which include wildcard characters.
Furthermore, you should also use inclusion and exclusion criteria to finely control what is indexed. Use exclusion criteria (such as -indexclude, -mimeexclude, or -indmimeexclude) to exclude extensions you know you do not want to index. For example:
-exclude '*.exe' '*.com'
-include '*.txt' '*.1st' '*.log'
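The combined effect of these two criteria can be approximated with Python's fnmatch module. This is a sketch only: it assumes that an exclusion match always wins and that a file must then match at least one inclusion pattern, which may differ from how the Verity Spider actually combines the options. The function name should_index is hypothetical.

```python
from fnmatch import fnmatch

EXCLUDE = ["*.exe", "*.com"]
INCLUDE = ["*.txt", "*.1st", "*.log"]


def should_index(filename: str) -> bool:
    """Apply exclusion first, then require at least one inclusion match."""
    name = filename.lower()  # compare case-insensitively on every platform
    if any(fnmatch(name, pattern) for pattern in EXCLUDE):
        return False
    return any(fnmatch(name, pattern) for pattern in INCLUDE)


for candidate in ["readme.1st", "setup.exe", "server.log", "photo.jpg"]:
    print(candidate, should_index(candidate))
```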
The MIME Types which the Verity Spider recognizes when indexing file systems are listed in the following table.