Short note: Document Surrogates

The most common way to show results for a query is to list information about documents in order of their computed relevance to the query.

Alternatively, in pure Boolean retrieval, where no relevance score is computed, documents are listed according to a metadata attribute, such as date.

Typically, each entry in the document list consists of the document's title and a subset of important metadata, such as the date, source, and length of the article.

In systems with statistical ranking, a numerical score or percentage is also often shown alongside the title, where the score indicates a computed degree of match or probability of relevance.

This kind of information is sometimes referred to as a document surrogate.
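As a minimal sketch of the result list described above, the following Python models a surrogate as a small record and orders it either by score or by date. The class name `Surrogate`, its field names, and the sample records are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Surrogate:
    """What the user sees in a result list instead of the full document."""
    title: str
    published: date
    source: str
    length_words: int
    score: float  # computed degree of match, assumed to lie in 0.0..1.0


def rank_by_score(results):
    # Statistical ranking: highest computed relevance first.
    return sorted(results, key=lambda s: s.score, reverse=True)


def order_by_date(results):
    # Boolean retrieval: no score is used, so order by a metadata attribute.
    return sorted(results, key=lambda s: s.published, reverse=True)


def format_line(s, show_score=True):
    # The score is shown as a percentage alongside the title.
    prefix = f"{s.score:.0%} " if show_score else ""
    return f"{prefix}{s.title} ({s.source}, {s.published.isoformat()}, {s.length_words} words)"


# Hypothetical sample data.
results = [
    Surrogate("Indexing basics", date(2020, 5, 1), "Journal A", 1200, 0.42),
    Surrogate("Ranking models", date(2019, 3, 9), "Journal B", 3400, 0.87),
    Surrogate("Query languages", date(2021, 1, 15), "Journal C", 800, 0.65),
]

for s in rank_by_score(results):
    print(format_line(s))
```

Both orderings present the same surrogates; only the sort key changes, which is why a system can offer either view without touching the documents themselves.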

There are different types of document surrogates.

1. The document identifier: An identifier is always attached to the document so that the surrogate can be linked to it. It carries little meaning for the user. Identifier formats differ according to the classification scheme in use. These schemes give identifiers a predefined structure and so help to impose some structure on the document collection. E.g., a department library uses some form of identifier for each book. An identifier of the form College Name/Department Name/Book No. could give the record VIT/INFT/A0001 for the first book, and later records are produced by changing only the book number. A query for the 92nd entry is easy, but a query for a book by a particular author or with a particular title is not. Thus the identifier alone is not sufficient to locate a book.

2. Document data elements: Elements of the data associated with the document, such as keywords, abstracts, the introduction, the summary, and the bibliography, can be useful both to the person who needs the information and to the designer of the information system.
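The identifier scheme in item 1 can be sketched as follows. This is only an illustration: the fixed "A" prefix, the four-digit zero padding, the helper names, and the catalog contents are all assumptions inferred from the single example VIT/INFT/A0001.

```python
def make_identifier(college, department, book_no):
    # Assumed format: College/Department/A + 4-digit book number.
    return f"{college}/{department}/A{book_no:04d}"


def parse_identifier(identifier):
    college, department, code = identifier.split("/")
    return college, department, int(code[1:])  # drop the assumed "A" prefix


# Hypothetical catalog keyed by identifier.
catalog = {
    make_identifier("VIT", "INFT", n): {"title": f"Book {n}", "author": f"Author {n % 10}"}
    for n in range(1, 101)
}

# Finding the 92nd entry is a direct lookup on the identifier.
entry_92 = catalog[make_identifier("VIT", "INFT", 92)]


def find_by_author(author):
    # The identifier encodes nothing about authors, so every record
    # must be scanned and its other data elements inspected.
    return [ident for ident, rec in catalog.items() if rec["author"] == author]


print(make_identifier("VIT", "INFT", 1))
print(entry_92["title"])
print(len(find_by_author("Author 2")))
```

The contrast between the direct lookup and the full scan is the point made above: the identifier locates a record by position only, so author or title queries need the additional data elements.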

Determining the usefulness of a document surrogate is a major issue. It depends on how the surrogate is developed. The following elements are used in document surrogates:

  1. Keywords: One word or a set of words, chosen by the author or selected automatically, that represents the content of the document.

  2. Key phrases: Same as keywords, but consisting of multi-word phrases rather than single words.

  3. Abstracts: A brief description of the paper.

  4. Extracts: Artificially constructed surrogates created by a person other than the author.

  5. Reviews: A review is similar to an abstract but is written by a person who is not the author.
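To illustrate what choosing keywords "automatically" might look like, here is a deliberately naive sketch that picks the most frequent non-stopword terms. It is not the method of any particular system; the function name `auto_keywords`, the tiny stopword list, and the sample text are assumptions, and real systems typically add stemming and term weighting.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, far from complete).
STOPWORDS = {"a", "an", "the", "to", "of", "and", "is", "in", "for", "by"}


def auto_keywords(text, k=3):
    # Lowercase the text and split it into alphabetic tokens.
    words = re.findall(r"[a-z]+", text.lower())
    # Count every token that is not a stopword and is longer than two letters.
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    # Return the k most frequent terms as candidate keywords.
    return [word for word, _ in counts.most_common(k)]


sample = "Document surrogates summarize a document. A surrogate links to the document."
print(auto_keywords(sample, k=1))
```

Because no stemming is applied, "surrogate" and "surrogates" are counted as different terms here, which is one reason author-chosen keywords are often more reliable than a frequency count.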
