Differences between revisions 1 and 3 (spanning 2 versions)
Revision 1 as of 2008-06-26 12:26:10
Size: 2109
Editor: kfc
Comment: Created by the PackagePages action.
Revision 3 as of 2008-10-17 10:11:17
Size: 3911
Editor: bam
Comment: Search API Work in Progress
Line 3: Line 3:
An API for searching DOMS needs to be provided.
DOMS Search uses simple search methods of the Summa Search interface.
Line 5: Line 5:
Two possibilities exist
WSDL: attachment:DomsGUISearch.wsdl
Line 7: Line 7:
 * Using GSearch - http://defxws2006.cvt.dk/fedoragsearch/
   * Pros: Easily set up, simple interface
   * Cons: Searches on a one-fedora-object basis
 * Using Summa - https://gforge.statsbiblioteket.dk/projects/summa
   * Pros: Searches on entire metadata records
   * Cons: Difficult to set up, more complicated interface
Content of this page:
 * [#operations Operations]
 * [#resultXML Result XML Definition]
 * [#example Result XML Example]
Line 14: Line 12:
Probably, Summa is the best bet.
[[Anchor(operations)]]
== Operations ==
Line 16: Line 15:
Summa webservice search interface WSDL extract:
=== simpleSearch ===
This method executes the given query and returns a search result ranked by relevance.

Input parameters:
 * {{{String query}}} The query string.
 * {{{int numberOfRecords}}} The maximum number of records to return in the search result.
 * {{{int startIndex}}} The number of the first record to return.

Returns:
 * {{{String simpleSearchReturn}}} The search result, sorted by relevance, as a structured XML document. See [#resultXML definition] below.

Throws:
 * {{{java.rmi.RemoteException}}}

=== simpleSearchSorted ===
This method executes the given query and returns a search result sorted by the given sort key.

Input parameters:
 * {{{String query}}} The query string.
 * {{{int numberOfRecords}}} The maximum number of records to return in the search result.
 * {{{int startIndex}}} The number of the first record to return.
 * {{{String sortKey}}} The key to sort by.
 * {{{boolean reverse}}} Whether to sort in reverse order.

Returns:
 * {{{String simpleSearchSortedReturn}}} The search result, sorted by the given sort key (reversed if so indicated), as a structured XML document. See [#resultXML definition] below.

Throws:
 * {{{java.rmi.RemoteException}}}

[[Anchor(resultXML)]]
== Result XML ==

TODO

The result string is actually XML, in the following form:
Line 19: Line 53:
   <element name="simpleSearch">
    <complexType>
     <sequence>
      <element name="query" type="xsd:string"/>
      <element name="numberOfRecords" type="xsd:int"/>
      <element name="startIndex" type="xsd:int"/>
     </sequence>
    </complexType>
   </element>
   <element name="simpleSearchResponse">
    <complexType>
     <sequence>
      <element name="simpleSearchReturn" type="xsd:string"/>
     </sequence>
    </complexType>
   </element>
   <element name="simpleSearchSorted">
    <complexType>
     <sequence>
      <element name="query" type="xsd:string"/>
      <element name="numberOfRecords" type="xsd:int"/>
      <element name="startIndex" type="xsd:int"/>
      <element name="sortKey" type="xsd:string"/>
      <element name="reverse" type="xsd:boolean"/>
     </sequence>
    </complexType>
   </element>
   <element name="simpleSearchSortedResponse">
    <complexType>
     <sequence>
      <element name="simpleSearchSortedReturn" type="xsd:string"/>
     </sequence>
    </complexType>
   </element>
<responsecollection>
<response name="DocumentResponse">
<documentresult query="Hans" startIndex="0" maxRecords="20"
sortKey="summa-score" reverseSort="false" fields="recordID, shortformat"
searchTime="105" hitCount="1">
  <record score="0.37572" id="0" source="NA">
    <field name="recordID">fagref:hj@example.com</field>
    <field name="shortformat">&amp;lt;shortrecord&amp;gt;
&amp;lt;rdf:RDF
xmlns:rdf=&amp;quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&amp;quot;&amp;gt;
&amp;lt;rdf:Description&amp;gt;
&amp;lt;dc:title&amp;gt;Fagekspert i Datalogi&amp;lt;/dc:title&amp;gt;
&amp;lt;dc:creator&amp;gt;Hans Jensen&amp;lt;/dc:creator&amp;gt;
&amp;lt;dc:type
xml:lang=&amp;quot;da&amp;quot;&amp;gt;person&amp;lt;/dc:type&amp;gt;
&amp;lt;dc:type
xml:lang=&amp;quot;en&amp;quot;&amp;gt;person&amp;lt;/dc:type&amp;gt;
&amp;lt;dc:identifier&amp;gt;hj@example.com&amp;lt;/dc:identifier&amp;gt;
&amp;lt;/rdf:Description&amp;gt;
&amp;lt;/rdf:RDF&amp;gt;
&amp;lt;/shortrecord&amp;gt;</field>
  </record>
</documentresult>
</response>
</responsecollection>
Line 55: Line 80:
Currently, we don't have a schema for the result set.
Line 56: Line 82:
Result will be of the form
The result can be read as follows:
Line 58: Line 84:
documentresult element
 * Attributes query, startIndex, maxRecords, sortKey, reverseSort: Echo the method input (maxRecords corresponds to numberOfRecords, reverseSort to reverse)
 * Attribute fields: Always "recordID, shortformat"
 * Attribute searchTime: Time it took to search
 * Attribute hitCount: Number of results

record element
 * Attribute score: Relevance score, a value from 0 to 1
 * Attributes id, source: Not used

field element
 * Attribute name: Always either recordID or shortformat
 * Content: The PID for recordID, or escaped XML for shortformat

The format of the "shortformat" field is (decoded version of contents above):
Line 60: Line 101:
       <?xml version="1.0" encoding="UTF-8"?>
       <searchresult filter="..." query="..."
                     startIndex="..." maxRecords="..."
                     sortKey="..." reverseSort="..."
                     fields="..." searchTime="..." hitCount="...">
         <record score="..." sortValue="...">
           <field name="recordID">...</field>
           <field name="shortformat">...</field>
         </record>
         ...
       </searchresult>
<shortrecord>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description>
<dc:title>Fagekspert i Datalogi</dc:title>
<dc:creator>Hans Jensen</dc:creator>
<dc:type
xml:lang="da">person</dc:type>
<dc:type
xml:lang="en">person</dc:type>
<dc:identifier>hj@example.com</dc:identifier>
</rdf:Description>
</rdf:RDF>
</shortrecord>
Line 72: Line 116:

The important elements are the "dc" (Dublin Core) fields; they contain the actual result data.

[[Anchor(example)]]
== Result XML Example ==

Search API

DOMS Search uses simple search methods of the Summa Search interface.

WSDL: attachment:DomsGUISearch.wsdl

Content of this page:

  • [#operations Operations]
  • [#resultXML Result XML Definition]
  • [#example Result XML Example]

Anchor(operations)

Operations

simpleSearch

This method executes the given query and returns a search result ranked by relevance.

Input parameters:

  • String query The query string.

  • int numberOfRecords The maximum number of records to return in the search result.

  • int startIndex The number of the first record to return.

Returns:

  • String simpleSearchReturn The search result, sorted by relevance, as a structured XML document. See [#resultXML definition] below.

Throws:

  • java.rmi.RemoteException
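
As an illustration of how the three input parameters combine for paging, here is a minimal Java sketch. The DomsSearch interface, the PagingSketch class and the helper names are hypothetical (a real client would use stubs generated from the WSDL), and the sketch assumes that startIndex counts records from 0, as the startIndex="0" in the example result suggests.

```java
import java.rmi.RemoteException;

class PagingSketch {
    /** Hypothetical Java view of the operation; not the stub generated from the WSDL. */
    interface DomsSearch {
        String simpleSearch(String query, int numberOfRecords, int startIndex)
                throws RemoteException;
    }

    /** Start index of a zero-based page, ASSUMING startIndex counts records from 0. */
    static int startIndexForPage(int page, int pageSize) {
        return page * pageSize;
    }

    /** Fetches one page of results as the raw XML string returned by the service. */
    static String fetchPage(DomsSearch service, String query, int page, int pageSize)
            throws RemoteException {
        return service.simpleSearch(query, pageSize, startIndexForPage(page, pageSize));
    }

    public static void main(String[] args) throws RemoteException {
        // Stand-in for the remote service: it only echoes the arguments it receives.
        DomsSearch fake = (query, numberOfRecords, startIndex) ->
                "query=" + query + " numberOfRecords=" + numberOfRecords
                        + " startIndex=" + startIndex;
        // Third page (page index 2) with 20 records per page.
        System.out.println(fetchPage(fake, "Hans", 2, 20));
        // prints: query=Hans numberOfRecords=20 startIndex=40
    }
}
```

Keeping the page arithmetic in its own method makes it easy to adjust if startIndex turns out to be 1-based.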

simpleSearchSorted

This method executes the given query and returns a search result sorted by the given sort key.

Input parameters:

  • String query The query string.

  • int numberOfRecords The maximum number of records to return in the search result.

  • int startIndex The number of the first record to return.

  • String sortKey The key to sort by.

  • boolean reverse Whether to sort in reverse order.

Returns:

  • String simpleSearchSortedReturn The search result, sorted by the given sort key (reversed if so indicated), as a structured XML document. See [#resultXML definition] below.

Throws:

  • java.rmi.RemoteException
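
The two operations differ only in the sortKey and reverse parameters, so a caller might wrap them in one helper. The following is a sketch under assumptions: the DomsSearch interface, FakeSearch and the search helper are hypothetical names, and treating a missing sort key as "rank by relevance" is a design choice of this sketch, not documented behaviour of the service. The sort key "summa-score" is taken from the example result on this page.

```java
import java.rmi.RemoteException;

class SortedSearchSketch {
    /** Hypothetical Java view of the two operations; not the generated stub. */
    interface DomsSearch {
        String simpleSearch(String query, int numberOfRecords, int startIndex)
                throws RemoteException;

        String simpleSearchSorted(String query, int numberOfRecords, int startIndex,
                String sortKey, boolean reverse) throws RemoteException;
    }

    /** Stand-in for the remote service: reports which operation was called. */
    static class FakeSearch implements DomsSearch {
        public String simpleSearch(String query, int numberOfRecords, int startIndex) {
            return "simpleSearch(" + query + ")";
        }

        public String simpleSearchSorted(String query, int numberOfRecords, int startIndex,
                String sortKey, boolean reverse) {
            return "simpleSearchSorted(" + query + ", " + sortKey + ", " + reverse + ")";
        }
    }

    /** Uses simpleSearch when no sort key is given, simpleSearchSorted otherwise. */
    static String search(DomsSearch service, String query, int numberOfRecords,
            int startIndex, String sortKey, boolean reverse) throws RemoteException {
        if (sortKey == null || sortKey.isEmpty()) {
            return service.simpleSearch(query, numberOfRecords, startIndex);
        }
        return service.simpleSearchSorted(query, numberOfRecords, startIndex, sortKey, reverse);
    }

    public static void main(String[] args) throws RemoteException {
        DomsSearch service = new FakeSearch();
        System.out.println(search(service, "Hans", 20, 0, null, false));
        System.out.println(search(service, "Hans", 20, 0, "summa-score", true));
    }
}
```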

Anchor(resultXML)

Result XML

TODO

The result string is actually XML, in the following form:

<responsecollection>
<response name="DocumentResponse">
<documentresult query="Hans" startIndex="0" maxRecords="20"
sortKey="summa-score" reverseSort="false" fields="recordID, shortformat"
searchTime="105" hitCount="1">
  <record score="0.37572" id="0" source="NA">
    <field name="recordID">fagref:hj@example.com</field>
    <field name="shortformat">&amp;lt;shortrecord&amp;gt;
&amp;lt;rdf:RDF
xmlns:rdf=&amp;quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&amp;quot;&amp;gt;
&amp;lt;rdf:Description&amp;gt;
&amp;lt;dc:title&amp;gt;Fagekspert i Datalogi&amp;lt;/dc:title&amp;gt;
&amp;lt;dc:creator&amp;gt;Hans Jensen&amp;lt;/dc:creator&amp;gt;
&amp;lt;dc:type
xml:lang=&amp;quot;da&amp;quot;&amp;gt;person&amp;lt;/dc:type&amp;gt;
&amp;lt;dc:type
xml:lang=&amp;quot;en&amp;quot;&amp;gt;person&amp;lt;/dc:type&amp;gt;
&amp;lt;dc:identifier&amp;gt;hj@example.com&amp;lt;/dc:identifier&amp;gt;
&amp;lt;/rdf:Description&amp;gt;
&amp;lt;/rdf:RDF&amp;gt;
&amp;lt;/shortrecord&amp;gt;</field>
  </record>
</documentresult>
</response>
</responsecollection>

Currently, we don't have a schema for the result set.

The result can be read as follows:

documentresult element

  • Attributes query, startIndex, maxRecords, sortKey, reverseSort: Echo the method input (maxRecords corresponds to numberOfRecords, reverseSort to reverse)
  • Attribute fields: Always "recordID, shortformat"
  • Attribute searchTime: Time it took to search
  • Attribute hitCount: Number of results

record element

  • Attribute score: Relevance score, a value from 0 to 1
  • Attributes id, source: Not used

field element

  • Attribute name: Always either recordID or shortformat
  • Content: The PID for recordID, or escaped XML for shortformat
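
Since there is no schema yet, a client has to read the result by element and attribute name. The sketch below shows one possible way to do that with the JDK's built-in DOM parser; the class and method names are hypothetical, and the sample string is a shortened copy of the example result with the shortformat field left out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ResultXmlSketch {
    /** Shortened copy of the example result; the shortformat field is omitted. */
    static final String SAMPLE =
            "<responsecollection><response name=\"DocumentResponse\">"
            + "<documentresult query=\"Hans\" startIndex=\"0\" maxRecords=\"20\""
            + " sortKey=\"summa-score\" reverseSort=\"false\""
            + " fields=\"recordID, shortformat\" searchTime=\"105\" hitCount=\"1\">"
            + "<record score=\"0.37572\" id=\"0\" source=\"NA\">"
            + "<field name=\"recordID\">fagref:hj@example.com</field>"
            + "</record></documentresult></response></responsecollection>";

    /** Parses the result string into a DOM document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Result is not well-formed XML", e);
        }
    }

    /** Reads the hitCount attribute of the first documentresult element. */
    static int hitCount(Document doc) {
        Element result = (Element) doc.getElementsByTagName("documentresult").item(0);
        return Integer.parseInt(result.getAttribute("hitCount"));
    }

    /** Collects the content of every field element named recordID, in document order. */
    static List<String> recordIds(Document doc) {
        List<String> ids = new ArrayList<>();
        NodeList fields = doc.getElementsByTagName("field");
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            if ("recordID".equals(field.getAttribute("name"))) {
                ids.add(field.getTextContent().trim());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println(hitCount(doc) + " hit(s): " + recordIds(doc));
    }
}
```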

The format of the "shortformat" field is (decoded version of contents above):

<shortrecord>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description>
<dc:title>Fagekspert i Datalogi</dc:title>
<dc:creator>Hans Jensen</dc:creator>
<dc:type
xml:lang="da">person</dc:type>
<dc:type
xml:lang="en">person</dc:type>
<dc:identifier>hj@example.com</dc:identifier>
</rdf:Description>
</rdf:RDF>
</shortrecord>

The important elements are the "dc" (Dublin Core) fields; they contain the actual result data.
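
Reading the "dc" fields out of a shortformat value could look like the following sketch. It assumes the field content has already been decoded to plain XML as shown above, and it deliberately uses a parser that is not namespace-aware, because the sample does not declare the dc prefix. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ShortRecordSketch {
    /** The decoded shortformat sample from this page, written on fewer lines. */
    static final String SAMPLE =
            "<shortrecord>"
            + "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">"
            + "<rdf:Description>"
            + "<dc:title>Fagekspert i Datalogi</dc:title>"
            + "<dc:creator>Hans Jensen</dc:creator>"
            + "<dc:type xml:lang=\"da\">person</dc:type>"
            + "<dc:type xml:lang=\"en\">person</dc:type>"
            + "<dc:identifier>hj@example.com</dc:identifier>"
            + "</rdf:Description></rdf:RDF></shortrecord>";

    /** Parses decoded shortformat XML; the factory default is not namespace-aware. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("shortformat is not well-formed XML", e);
        }
    }

    /** Returns the text of every element with the given prefixed name, e.g. "dc:title". */
    static List<String> values(Document doc, String prefixedName) {
        List<String> result = new ArrayList<>();
        NodeList nodes = doc.getElementsByTagName(prefixedName);
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent().trim());
        }
        return result;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("title:   " + values(doc, "dc:title"));
        System.out.println("creator: " + values(doc, "dc:creator"));
        System.out.println("types:   " + values(doc, "dc:type"));
    }
}
```

Matching on the prefixed name is fragile if the service ever changes prefixes; once the records declare the dc namespace, a namespace-aware lookup would be the more robust choice.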

Anchor(example)

Result XML Example

Search API (last edited 2010-03-17 13:09:38 by localhost)