You can search for any word or phrase on a Web site by typing it into a query form and clicking the button that executes the query (for example, the Execute Query button on the sample query form). A search produces a list of files that contain the word or phrase, no matter where it appears in the text.

These rules apply when formulating queries:

- Multiple consecutive words are treated as a phrase; they must appear in the same order within a matching document.
- Queries are case-insensitive, so you can type your query in uppercase or lowercase.
- You can search for any word except those in the exception list (for English, this includes a, an, and, as, and other common words), which are ignored during a search.
- Words in the exception list are treated as placeholders in phrase and proximity queries. For example, if you searched for Word for Windows, the results could give you Word for Windows and Word and Windows, because for is a noise word and appears in the exception list.
- Punctuation marks such as the period (.), colon (:), semicolon (;), and comma (,) are ignored during a search.
- To use specially treated characters such as &, |, ^, #, @, $, (, ), in a query, enclose your query in quotation marks (").
- To search for a word or phrase containing quotation marks, enclose the entire phrase in quotation marks and then double the quotation marks around the word or words you want to surround with quotes. For example, "World-Wide Web or ""Web""" searches for World-Wide Web or "Web".
- You can use Boolean operators (AND, OR, and NOT) and the proximity operator (NEAR) to specify additional search information.
- The wildcard character (*) can match words with a given prefix. The query esc* matches the terms ESC, escape, and so on.
- Free-text queries can be specified without regard to query syntax.
- Vector space queries can be specified.
- ActiveX (OLE) and file attribute property value queries can be issued.

Boolean and Proximity Operators
Boolean and proximity operators can create a more precise query.

| To Search For | Example | Results |
| --- | --- | --- |
| Both terms in the same page | access and basic (symbolic form: access & basic) | Pages with both the words access and basic |
| Either term in a page | cgi or isapi (symbolic form: cgi \| isapi) | Pages with the words cgi or isapi |
| The first term without the second term | access and not basic (symbolic form: access & ! basic) | Pages with the word access but not basic |
| Pages not matching a property value | not @size = 100 (symbolic form: ! @size = 100) | Pages that are not 100 bytes |
| Both terms in the same page, close together | excel near project (symbolic form: excel ~ project) | Pages with the word excel near the word project |
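
As a rough illustration of the rows above, the Boolean operators behave like set tests on the words of each page. The sketch below is plain Python, not Index Server code: the sample pages and the helpers `tokens`, `contains`, `closest_gap`, and `near_enough` are hypothetical names made up for the example. The 50-word figure is the NEAR cutoff this page gives in its hints; measuring the gap as a difference in word positions is an assumption.

```python
import re

# Hypothetical sample pages, invented for this illustration.
PAGES = {
    "a.htm": "Access Basic is the macro language of Microsoft Access.",
    "b.htm": "Use Access to build a database.",
    "c.htm": "A CGI script or an ISAPI extension can run a query.",
}

NEAR_LIMIT = 50  # words; the cutoff described for NEAR on this page


def tokens(text):
    """Split text into lowercase words; punctuation is dropped."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains(page, word):
    """True if the page contains the word (case-insensitive)."""
    return word.lower() in tokens(PAGES[page])


def closest_gap(page, first, second):
    """Smallest difference in word position between the two terms, or None."""
    words = tokens(PAGES[page])
    a = [i for i, w in enumerate(words) if w == first.lower()]
    b = [i for i, w in enumerate(words) if w == second.lower()]
    gaps = [abs(i - j) for i in a for j in b]
    return min(gaps) if gaps else None


def near_enough(page, first, second):
    """True if both terms occur and are at most NEAR_LIMIT words apart."""
    gap = closest_gap(page, first, second)
    return gap is not None and gap <= NEAR_LIMIT


# access and basic
both = [p for p in PAGES if contains(p, "access") and contains(p, "basic")]
# cgi or isapi
either = [p for p in PAGES if contains(p, "cgi") or contains(p, "isapi")]
# access and not basic
without = [p for p in PAGES if contains(p, "access") and not contains(p, "basic")]
```

The sketch only decides whether a page matches. It does not try to reproduce how Index Server turns proximity into a rank.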
Hints:

- You can add parentheses to nest expressions within a query. The expressions in parentheses are evaluated before the rest of the query.
- Use double quotes (") to indicate that a Boolean or NEAR operator keyword should be ignored in your query. For example, "Abbott and Costello" will match pages with the phrase, not pages that match the Boolean expression. In addition to being an operator, the word and is a noise word in English.
- The NEAR operator is similar to the AND operator in that NEAR returns a match if both words being searched for are in the same page. However, NEAR differs from AND because the rank assigned by NEAR depends on the proximity of the words. That is, the rank of a page with the searched-for words closer together is greater than or equal to the rank of a page where the words are farther apart. If the searched-for words are more than 50 words apart, they are not considered near enough, and the page is assigned a rank of zero.
- The NOT operator can be used only after an AND operator in content queries; it can be used only to exclude pages that match a previous content restriction. For property value queries, the NOT operator can be used apart from the AND operator.
- The AND operator has a higher precedence than OR. For example, the first three queries are equivalent, but the fourth is not:
  a AND b OR c
  c OR a AND b
  c OR (a AND b)
  (c OR a) AND b

Note: The symbols (&, |, !, ~) and the English keywords AND, OR, NOT, and NEAR work the same way in all languages supported by Index Server. Localized keywords are also available when the browser locale is set to one of the following six languages:

| Language | Keywords |
| --- | --- |
| German | UND, ODER, NICHT, NAH |
| French | ET, OU, SANS, PRES |
| Spanish | Y, O, NO, CERCA |
| Dutch | EN, OF, NIET, NABIJ |
| Swedish | OCH, ELLER, INTE, NÄRA |
| Italian | E, O, NO, VICINO |

Wildcards

Wildcard operators help you find pages containing words similar to a given word.

| To Search For | Example | Results |
| --- | --- | --- |
| Words with the same prefix | comput* | Pages with words that have the prefix comput, such as computer, computing, and so on |
| Words based on the same stem word | fly** | Pages with words based on the same stem as fly, such as flying, flown, flew, and so on |
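
The single-asterisk form amounts to a case-insensitive prefix test, which can be sketched in a few lines of Python. The function name `prefix_matches` is hypothetical and chosen only for this example. The double-asterisk (stem) form is deliberately not covered, since it depends on language-specific stemming that this page does not describe.

```python
def prefix_matches(pattern, words):
    """Return the words that match a single-asterisk pattern such as 'comput*'.

    Matching ignores case, in line with the case-insensitive query rule.
    """
    if not pattern.endswith("*") or pattern.endswith("**"):
        raise ValueError("expected a pattern ending in exactly one asterisk")
    prefix = pattern[:-1].lower()
    return [w for w in words if w.lower().startswith(prefix)]
```

For example, `prefix_matches("esc*", ["ESC", "escape", "easel"])` keeps `ESC` and `escape` and drops `easel`.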

Free-Text Queries

The query engine finds pages that best match the words and phrases in a free-text query. It does this by automatically finding pages that match the meaning, not the exact wording, of the query. Boolean, proximity, and wildcard operators are ignored within a free-text query. Free-text queries are prefixed with $contents.

| To Search For | Example | Results |
| --- | --- | --- |
| Files that match free-text | $contents how do I print in Microsoft Excel? | Pages that mention printing and Microsoft Excel. |

Vector Space Queries

The query engine supports vector space queries. Vector queries return pages that match a list of words and phrases. The rank of each page indicates how well the page matched the query.

| To Search For | Example | Results |
| --- | --- | --- |
| Pages that contain specific words | light, bulb | Files with words that best match the words being searched for |
| Pages that contain weighted prefixes, words, and phrases | invent*, light[50], bulb[10], "light bulb"[400] | Files that contain words prefixed by invent, the words light, bulb, and the phrase light bulb (the terms are weighted) |

- Components in vector queries are separated by commas.
- Components in vector queries can be weighted by using the [weight] syntax.
- Pages returned by vector queries do not necessarily match every term in the query.
- Vector queries work best when the results are sorted by rank.
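
To make the comma and [weight] syntax concrete, here is a small Python sketch that splits a vector query into its components. It is only an illustration of the syntax described above: `parse_vector_query` is a hypothetical helper, it assumes that phrases contain no commas, and it reports a missing weight as `None` because this page does not state a default. It does not attempt to reproduce how Index Server computes rank.

```python
import re

# A component is a term, optionally followed by a [weight] suffix.
_COMPONENT = re.compile(r'^(?P<term>.+?)\s*(?:\[(?P<weight>\d+)\])?$')


def parse_vector_query(query):
    """Split a vector query into (term, weight) pairs.

    Assumes no commas occur inside quoted phrases. Surrounding double
    quotes are stripped from phrases; a missing weight becomes None.
    """
    components = []
    for part in query.split(","):
        part = part.strip()
        if not part:
            continue
        match = _COMPONENT.match(part)
        term = match.group("term").strip('"')
        weight = match.group("weight")
        components.append((term, int(weight) if weight else None))
    return components


QUERY = 'invent*, light[50], bulb[10], "light bulb"[400]'
COMPONENTS = parse_vector_query(QUERY)
```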

Property Value Queries

Property value queries can be used to find files that have property values that match given criteria. The properties over which you can query include basic file information, such as file name and file size, and ActiveX properties, including the document summary (abstract) that is stored in files created by ActiveX-aware applications.

There are two types of property queries:

- Relational property queries consist of an at character (@), a property name, a relational operator, and a property value. For example, to find all of the files larger than one million bytes, issue the query @size > 1000000.
- Regular expression property queries consist of a number sign (#), a property name, and a regular expression for the property value. For example, to find all of the video (.avi) files, issue the query #filename *.avi. Regular expressions will never match the special properties contents (#contents) and all (#all). There may also be additional format-specific properties that cannot be matched (for example, #HtmlHRef for HTML pages).

Property Names

Property names are preceded by either the at (@) or number sign (#) character. Use @ for relational queries, and # for regular expression queries. If no property name is specified, @contents is assumed. Properties available for all files include:

| Property Name | Description |
| --- | --- |
| All | Matches any property |
| Contents | Words and phrases in the file and textual properties |
| Filename | Name of the file |
| Size | File size |
| Write | Last time the file was modified |

ActiveX property values can also be used in queries. Web sites with files created by most ActiveX-aware applications can be queried for these properties. For a complete list of property names, see the List of Property Names later on this page.

Relational Operators

Relational operators are used in relational property queries.

| To Search For | Example | Results |
| --- | --- | --- |
| Property values in relation to a fixed value | @size < 100, @size <= 100, @size = 100, @size != 100, @size >= 100, @size > 100 | Files whose size matches the query |
| Property values with all of a set of bits on | @attrib ^a 0x820 | Compressed files with the archive bit on |
| Property values with some of a set of bits on | @attrib ^s 0x20 | Files with the archive bit on |

Property Values

| To Search For | Example | Results |
| --- | --- | --- |
| A specific value | @DocAuthor = Bill Barnes | Files authored by Bill Barnes |
| Values beginning with a prefix | #DocAuthor George* | Files whose author property begins with George |
| Files with any of a set of extensions | #filename *.\|(exe\|,dll\|,sys\|) | Files with .exe, .dll, or .sys extensions |
| Files modified after a certain date | @write > 96/2/14 10:00:00 | Files modified after February 14, 1996 at 10:00 GMT |
| Files modified after a relative date | @write > -1d2h | Files modified in the last 26 hours |
| Vectors matching a vector | @vectorprop = { 10, 15, 20 } | ActiveX documents with a vectorprop value of { 10, 15, 20 } |
| Vectors where each value matches a criterion | @vectorprop >^a 15 | ActiveX documents with a vectorprop value in which all values in the vector are greater than 15 |
| Vectors where at least one value matches a criterion | @vectorprop =^s 15 | ActiveX documents with a vectorprop value in which at least one value is 15 |
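
The "all of" (^a) and "some of" (^s) modifiers in the two tables above can be read as the following tests. This is a minimal Python sketch of the semantics as described here, not Index Server code, and the function names are hypothetical. The two attribute constants are the standard Win32 values for the archive and compressed bits, which together give the 0x820 mask used in the example.

```python
FILE_ATTRIBUTE_ARCHIVE = 0x20      # Win32 archive bit
FILE_ATTRIBUTE_COMPRESSED = 0x800  # Win32 compressed bit


def all_bits_on(value, mask):
    """@prop ^a mask: every bit in the mask is set in the value."""
    return (value & mask) == mask


def some_bits_on(value, mask):
    """@prop ^s mask: at least one bit in the mask is set in the value."""
    return (value & mask) != 0


def all_values(vector, predicate):
    """@prop <op>^a x: the comparison holds for every element of the vector."""
    return all(predicate(v) for v in vector)


def some_values(vector, predicate):
    """@prop <op>^s x: the comparison holds for at least one element."""
    return any(predicate(v) for v in vector)


# @attrib ^a 0x820 against a file that is archived and compressed
compressed_and_archived = all_bits_on(
    FILE_ATTRIBUTE_ARCHIVE | FILE_ATTRIBUTE_COMPRESSED, 0x820
)
# @vectorprop >^a 15 and @vectorprop =^s 15 against { 10, 15, 20 }
every_value_above_15 = all_values([10, 15, 20], lambda v: v > 15)
some_value_is_15 = some_values([10, 15, 20], lambda v: v == 15)
```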

- Be sure to use the pound (#) character before the property name when using a regular expression in a property value, and an at (@) character otherwise. The equal (=) relational operator is assumed for regular-expression queries.
- File name (#filename) is the only property that supports regular expressions with wildcards to the left of text. This is the only case where wildcards to the left are efficient.
- Date and time values are of the form yyyy/mm/dd hh:mm:ss. The first two characters of the year and the entire time can be omitted. Dates and times are in Greenwich Mean Time (GMT).
- Dates and times relative to the current time can be expressed with a minus (-) character followed by zero or more integer unit and time unit pairs. Time units are expressed as: (y) for years, (m) for months, (w) for weeks, (d) for days, (h) for hours, (n) for minutes, and (s) for seconds.
- Currency values are of the form x.y, where x is the whole value amount and y is the fractional amount. There is no assumption about units.
- Boolean values are (t) or (true) for TRUE and (f) or (false) for FALSE.
- Vectors (VT_VECTOR) are expressed as an opening brace ({), followed by a comma-separated list of values, then a closing brace (}).
- Single-value expressions that are compared against vectors are expressed as a relational operator, then a (^a) for all of or a (^s) for some of.
- Numeric values can be in decimal or hexadecimal (preceded by 0x).
- The contents property does not support relational operators. If a relational operator is specified, no results will be found. For example, @contents Microsoft will find documents containing Microsoft, but @contents=Microsoft will find none.

Regular Expressions

Regular expressions in property queries are defined as follows:

- Any character except asterisk (*), period (.), question mark (?), and vertical bar (|) defaults to matching just itself.
- Regular expressions can be enclosed in matching quotes ("), and must be enclosed in quotes if they contain a space ( ) or closing parenthesis ()).
- The characters *, ., and ? behave as they do in Windows; they match any number of characters, match a period (.) or end of string, and match any one character, respectively.
- The character | is an escape character. After |, the following characters have special meaning:
- Between square brackets ([]) the following characters have special meaning:
- Between curly braces ({}) the following syntax applies:
  - |{m|} matches exactly m occurrences of the preceding expression (0 < m < 256).
  - |{m,|} matches at least m occurrences of the preceding expression (1 < m < 256).
  - |{m,n|} matches between m and n occurrences of the preceding expression, inclusive (0 < m < 256, 0 < n < 256).
- To match *, ., and ?, enclose them in brackets (for example, |[*]sample will match *sample).
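
One way to get a feel for these rules is to translate a small subset of them into Python's `re` syntax. The sketch below is an approximation under stated assumptions, not a description of how Index Server works: `to_python_regex` is a hypothetical helper, it covers only *, ., ?, and literal characters, and its handling of |(, |, and |) as group open, alternative separator, and group close is inferred from the #filename *.|(exe|,dll|,sys|) example on this page. Square brackets, curly braces, and quoting are not handled. Case-insensitive matching is also an assumption.

```python
import re


def to_python_regex(pattern):
    """Translate a subset of the property-query regex syntax to Python."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "|" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt == "(":
                out.append("(?:")   # open a group (inferred meaning)
            elif nxt == ")":
                out.append(")")     # close a group (inferred meaning)
            elif nxt == ",":
                out.append("|")     # separate alternatives (inferred meaning)
            else:
                raise ValueError(f"unsupported escape |{nxt}")
            i += 2
            continue
        if ch == "*":
            out.append(".*")              # any number of characters
        elif ch == "?":
            out.append(".")               # any one character
        elif ch == ".":
            out.append(r"(?:\.|$)")       # a period or end of string
        else:
            out.append(re.escape(ch))     # everything else matches itself
        i += 1
    return re.compile("".join(out), re.IGNORECASE)


VIDEO = to_python_regex("*.avi")
BINARIES = to_python_regex("*.|(exe|,dll|,sys|)")
```

Used with `fullmatch`, `VIDEO` accepts a name such as `movie.avi`, and `BINARIES` accepts names ending in .exe, .dll, or .sys.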

Query Examples

| Example | Results |
| --- | --- |
| @size > 1000000 | Pages larger than one million bytes |
| @write > 95/12/23 | Pages modified after the date |
| Apple tree | Pages with the phrase apple tree |
| "apple tree" | Same as above |
| @contents apple tree | Same as above |
| Microsoft and @size > 1000000 | Pages with the word Microsoft that are larger than one million bytes |
| "microsoft and @size > 1000000" | Pages with the phrase specified (not the same as above) |
| #filename *.avi | Video files (the # prefix is used because the query contains a regular expression) |
| @attrib ^s 32 | Pages with the archive attribute bit on |
| @docauthor = John Smith | Pages with the given author |
| $contents why is the sky blue? | Pages that match the query |
| @size < 100 & #filename *.gif | Graphics Interchange Format (GIF) files less than 100 bytes in size |
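
When a query string is assembled by a program, the quoted-phrase examples above suggest a small helper for turning arbitrary text into a phrase query. The Python sketch below follows the quoting rule given in the query rules on this page (wrap the phrase in quotation marks and double any quotation marks inside it). The name `quote_phrase` is hypothetical.

```python
def quote_phrase(phrase):
    """Wrap text in double quotes, doubling any quotes it already contains."""
    return '"' + phrase.replace('"', '""') + '"'


# Operators and special characters inside the quotes are treated as text.
AS_PHRASE = quote_phrase("microsoft and @size > 1000000")
WITH_INNER_QUOTES = quote_phrase('World-Wide Web or "Web"')
```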

List of Property Names

These properties are always available for queries. Additional properties may also be available depending on the configuration of the Web server.

| Friendly Name | Datatype | Property |
| --- | --- | --- |
| Access | DBTYPE_DATE | Last time file was accessed. |
| All | (not applicable) | Searches every property for a string. Can be queried but not retrieved. |
| AllocSize | DBTYPE_I8 | Size of disk allocation for file. |
| Attrib | DBTYPE_UI4 | File attributes. Documented in Win32 SDK. |
| ClassId | DBTYPE_GUID | Class ID of object, for example, WordPerfect, Word, and so on. |
| Change | DBTYPE_DATE | Last time file was changed (includes changes to attributes). |
| Characterization | DBTYPE_WSTR \| DBTYPE_BYREF | Characterization, or abstract, of document. Computed by Index Server. |
| Contents | (not applicable) | Main contents of file. Can be queried but not retrieved. |
| Create | DBTYPE_DATE | Time file was created. |
| DocAppName | DBTYPE_STR \| DBTYPE_BYREF | Name of application that created the file. |
| DocAuthor | DBTYPE_STR \| DBTYPE_BYREF | Author of document. |
| DocCategory | DBTYPE_STR | Type of document such as a memo, schedule, or whitepaper. |
| DocCharCount | DBTYPE_I4 | Number of characters in document. |
| DocComments | DBTYPE_STR \| DBTYPE_BYREF | Comments about document. |
| DocCompany | DBTYPE_STR | Name of the company for which the document was written. |
| DocCreatedTm | DBTYPE_DATE | Time document was created. |
| DocEditTime | DBTYPE_DATE | Total time spent editing document. |
| DocKeywords | DBTYPE_STR \| DBTYPE_BYREF | Document keywords. |
| DocLastAuthor | DBTYPE_STR \| DBTYPE_BYREF | Most recent user who edited document. |
| DocLastPrinted | DBTYPE_DATE | Time document was last printed. |
| DocLastSavedTm | DBTYPE_DATE | Time document was last saved. |
| DocManager | DBTYPE_STR | Name of the manager of the document's author. |
| DocPageCount | DBTYPE_I4 | Number of pages in document. |
| DocRevNumber | DBTYPE_STR \| DBTYPE_BYREF | Current version number of document. |
| DocSubject | DBTYPE_STR \| DBTYPE_BYREF | Subject of document. |
| DocTemplate | DBTYPE_STR \| DBTYPE_BYREF | Name of template for document. |
| DocTitle | DBTYPE_STR \| DBTYPE_BYREF | Title of document. |
| DocWordCount | DBTYPE_I4 | Number of words in document. |
| FileIndex | DBTYPE_I8 | Unique ID of file. |
| FileName | DBTYPE_WSTR \| DBTYPE_BYREF | Name of file. |
| HitCount | DBTYPE_I4 | Number of hits (words matching query) in file. |
| HtmlHRef | DBTYPE_WSTR \| DBTYPE_BYREF | Text of HTML HREF. Can be queried but not retrieved. |
| HtmlHeading1 | DBTYPE_WSTR \| DBTYPE_BYREF | Text of HTML document in style H1. Can be queried but not retrieved. |
| HtmlHeading2 | DBTYPE_WSTR \| DBTYPE_BYREF | Text of HTML document in style H2. Can be queried but not retrieved. |
| HtmlHeading3 | DBTYPE_WSTR \| DBTYPE_BYREF | Text of HTML document in style H3. Can be queried but not retrieved. |
| HtmlHeading4 | DBTYPE_WSTR \| DBTYPE_BYREF | Text of HTML document in style H4. Can be queried but not retrieved. |
| HtmlHeading5 | DBTYPE_WSTR \| DBTYPE_BYREF | Text of HTML document in style H5. Can be queried but not retrieved. |
| HtmlHeading6 | DBTYPE_WSTR \| DBTYPE_BYREF | Text of HTML document in style H6. Can be queried but not retrieved. |
| Path | DBTYPE_WSTR \| DBTYPE_BYREF | Full physical path to file, including file name. |
| Rank | DBTYPE_I4 | Rank of row. Ranges from 0 to 1000. Larger numbers indicate better matches. |
| RankVector | DBTYPE_I4 \| DBTYPE_VECTOR | Ranks of individual components of a vector query. |
| SecurityChange | DBTYPE_DATE | Last time security was changed on file. |
| ShortFileName | DBTYPE_WSTR \| DBTYPE_BYREF | Short (8.3) file name. |
| Size | DBTYPE_I8 | Size of file, in bytes. |
| USN | DBTYPE_I8 | Update Sequence Number. NTFS drives only. |
| VPath | DBTYPE_WSTR \| DBTYPE_BYREF | Full virtual path to file, including file name. If more than one possible path, then the best match for the specific query is chosen. |
| WorkId | DBTYPE_I4 | Internal ID for file. Used within Index Server. |
| Write | DBTYPE_DATE | Last time file was written. |