Search API
DOMS Search uses the simple search methods of the Summa Search interface.
WSDL: attachment:DomsGUISearch.wsdl
Content of this page:
- [#operations Operations]
- [#resultXML Result XML Definition]
- [#example Result XML Example]
Operations
simpleSearch
This method executes the given query and returns a search result ranked by relevance.
Input parameters:
- String query: The query string.
- int numberOfRecords: The maximum number of records returned in the search result.
- int startIndex: The index of the first record to return.
Returns:
- String simpleSearchReturn: The search result, sorted by relevance, as a structured XML document. See [#resultXML definition] below.
Throws:
- java.rmi.RemoteException
simpleSearchSorted
This method executes the given query and returns a search result ranked by the given sort key.
Input parameters:
- String query: The query string.
- int numberOfRecords: The maximum number of records returned in the search result.
- int startIndex: The index of the first record to return.
- String sortKey: The key to sort by.
- boolean reverse: Whether to sort in reverse order.
Returns:
- String simpleSearchSortedReturn: The search result, sorted by the given sort key (reversed if so indicated), as a structured XML document. See [#resultXML definition] below.
Throws:
- java.rmi.RemoteException
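The two operations can be pictured as a plain Java interface. The sketch below is illustrative only: the names DomsSearch and SearchPager are hypothetical, and real client stubs would be generated from the WSDL. The paging helper assumes that startIndex is zero-based, which the startIndex="0" in the example result suggests but this page does not state.

```java
import java.rmi.RemoteException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical Java view of the two operations; real stubs would come from the WSDL. */
interface DomsSearch {
    String simpleSearch(String query, int numberOfRecords, int startIndex)
            throws RemoteException;

    String simpleSearchSorted(String query, int numberOfRecords, int startIndex,
                              String sortKey, boolean reverse) throws RemoteException;
}

/** Illustrative helper that pages through relevance-ranked results. */
final class SearchPager {

    /** Fetches the given number of consecutive pages, each holding up to pageSize records. */
    static List<String> fetchPages(DomsSearch search, String query, int pageSize, int pages)
            throws RemoteException {
        List<String> results = new ArrayList<>();
        for (int page = 0; page < pages; page++) {
            // Assumption: startIndex counts from zero.
            int startIndex = page * pageSize;
            results.add(search.simpleSearch(query, pageSize, startIndex));
        }
        return results;
    }
}
```

Each returned string is one XML result document, so a caller would still need to parse every page separately.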
Result XML
TODO
The result string is actually XML, in the following form:
<responsecollection>
  <response name="DocumentResponse">
    <documentresult query="Hans" startIndex="0" maxRecords="20" sortKey="summa-score" reverseSort="false" fields="recordID, shortformat" searchTime="105" hitCount="1">
      <record score="0.37572" id="0" source="NA">
        <field name="recordID">fagref:hj@example.com</field>
        <field name="shortformat">&lt;shortrecord&gt;
          &lt;rdf:RDF xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;&gt;
            &lt;rdf:Description&gt;
              &lt;dc:title&gt;Fagekspert i Datalogi&lt;/dc:title&gt;
              &lt;dc:creator&gt;Hans Jensen&lt;/dc:creator&gt;
              &lt;dc:type xml:lang=&quot;da&quot;&gt;person&lt;/dc:type&gt;
              &lt;dc:type xml:lang=&quot;en&quot;&gt;person&lt;/dc:type&gt;
              &lt;dc:identifier&gt;hj@example.com&lt;/dc:identifier&gt;
            &lt;/rdf:Description&gt;
          &lt;/rdf:RDF&gt;
        &lt;/shortrecord&gt;</field>
      </record>
    </documentresult>
  </response>
</responsecollection>
Currently, we don't have a schema for the result set.
The result can be read as follows:
documentresult element
- Attributes query, startIndex, maxRecords, sortKey, reverseSort: Same as the input to the method (maxRecords corresponds to numberOfRecords, reverseSort to reverse)
- Attribute fields: Always "recordID, shortformat"
- Attribute searchTime: Time it took to search
- Attribute hitCount: Number of results
record element
- Attribute score: Relevance ranking, a value from 0 to 1
- Attributes id, source: Not used
field element
- Attribute name: Always either recordID or shortformat
- Contents: The PID for recordID, or escaped XML for shortformat
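The reading rules above can be sketched with the JDK's built-in DOM parser. This is a minimal sketch rather than a DOMS or Summa API: the class SearchResultReader and its method names are made up for illustration, and it assumes the result contains a single documentresult element, as in the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Illustrative helper (not part of DOMS or Summa) for reading a search result string. */
final class SearchResultReader {

    /** Parses the XML string returned by simpleSearch or simpleSearchSorted. */
    static Document parse(String resultXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(resultXml)));
    }

    /** Reads the hitCount attribute of the first documentresult element. */
    static int hitCount(Document doc) {
        Element result = (Element) doc.getElementsByTagName("documentresult").item(0);
        return Integer.parseInt(result.getAttribute("hitCount"));
    }

    /** Returns one map per record: the score plus each field, keyed by field name. */
    static List<Map<String, String>> records(Document doc) {
        List<Map<String, String>> records = new ArrayList<>();
        NodeList recordNodes = doc.getElementsByTagName("record");
        for (int i = 0; i < recordNodes.getLength(); i++) {
            Element record = (Element) recordNodes.item(i);
            Map<String, String> values = new LinkedHashMap<>();
            values.put("score", record.getAttribute("score"));
            NodeList fields = record.getElementsByTagName("field");
            for (int j = 0; j < fields.getLength(); j++) {
                Element field = (Element) fields.item(j);
                // getTextContent() resolves entities, so shortformat is returned as XML text.
                values.put(field.getAttribute("name"), field.getTextContent());
            }
            records.add(values);
        }
        return records;
    }
}
```

For the example result, records(doc).get(0).get("recordID") would yield the PID, and the "shortformat" entry would hold the unescaped short record as a string.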
The format of the "shortformat" field is (decoded version of the contents above):
<shortrecord>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description>
      <dc:title>Fagekspert i Datalogi</dc:title>
      <dc:creator>Hans Jensen</dc:creator>
      <dc:type xml:lang="da">person</dc:type>
      <dc:type xml:lang="en">person</dc:type>
      <dc:identifier>hj@example.com</dc:identifier>
    </rdf:Description>
  </rdf:RDF>
</shortrecord>
The important elements are the "dc" fields, which contain the actual result data.
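As a rough sketch of how a client might collect the "dc" fields from a decoded shortformat string, the following uses the JDK's DOM parser. The class name ShortRecordReader is hypothetical. Namespace awareness is deliberately left off (the parser default), because the sample record does not declare the dc prefix; elements are therefore matched on their literal "dc:" tag-name prefix, which is an assumption about how the records are written.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Illustrative helper (not part of DOMS or Summa) for reading a decoded shortformat value. */
final class ShortRecordReader {

    /** Collects the text of every dc:* element, grouped by tag name in document order. */
    static Map<String, List<String>> dcFields(String shortformatXml) throws Exception {
        // Namespace awareness is off by default, so the undeclared dc prefix is accepted.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(shortformatXml)));

        Map<String, List<String>> fields = new LinkedHashMap<>();
        NodeList all = doc.getElementsByTagName("*"); // "*" matches every element
        for (int i = 0; i < all.getLength(); i++) {
            Element element = (Element) all.item(i);
            String name = element.getTagName();
            if (name.startsWith("dc:")) {
                // A field such as dc:type may repeat (one per language), so keep a list.
                fields.computeIfAbsent(name, key -> new ArrayList<>())
                        .add(element.getTextContent().trim());
            }
        }
        return fields;
    }
}
```

Keeping a list per tag name preserves repeated fields such as the two dc:type entries; the xml:lang attribute is ignored in this sketch.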