Differences between revisions 3 and 4
Revision 3 as of 2008-10-17 10:11:17
Size: 3911
Editor: bam
Comment: Search API Work in Progress
Revision 4 as of 2008-10-17 11:55:26
Size: 8931
Editor: bam
Comment: search API documentation
Line 9: Line 9:
 * [#resultXML Result XML Definition]  * [#resultXML Result XML Description]
Line 24: Line 24:
 * {{{String simpleSearchReturn}}} The search result sorted by relevance as structured XML document. See [#resultXML definition] below.  * {{{String simpleSearchReturn}}} The search result sorted by relevance as structured XML document. See [#resultXML description] below.
Line 40: Line 40:
 * {{{String simpleSearchReturn}}} The search result sorted by the given key sort, reversed if so indicated, as structured XML document. See [#resultXML definition] below.  * {{{String simpleSearchReturn}}} The search result sorted by the given key, reversed if so indicated, as structured XML document. See [#resultXML description] below.
Line 46: Line 46:
== Result XML ==

TODO

The result string is actually XML, in the following form:

{{{
== Result XML Description ==

The result string defined by Summa is XML, in the following form:

{{{
<?xml version="1.0" encoding="UTF-8" ?>
<responsecollection>
  <response> response-xml-1 </response>
  <response> response-xml-2 </response>
  ...
</responsecollection>
}}}

Possible responses are document response, facet result and others. In DOMS we only use document response, which looks like this:
{{{
<documentresult filter="..." query="..." startIndex="..." maxRecords="..." sortKey="..."
                reverseSort="..." fields="..." searchTime="..." hitCount="...">
  <record score="..." sortValue="...">
    <field name="recordID">...</field>
    <field name="shortformat">...</field>
  </record>
  ...
</documentresult>
}}}

Currently, we do not have a schema for the result. The result can be read as follows:

documentresult element
 * Attribute {{{filter}}} is not used in simple search results.
 * Attributes {{{query}}}, {{{startIndex}}}, {{{maxRecords}}}, {{{sortKey}}}, {{{reverseSort}}}: Same as input to method.
 * Attribute {{{fields}}}: Always "recordID, shortformat" in DOMS.
 * Attribute {{{searchTime}}}: Time it took to search.
 * Attribute {{{hitCount}}}: Number of results.

record element
 * Attribute {{{score}}}: relevancy ranking, value from 0 to 1.
 * Attribute {{{sortValue}}} is the value that the sort was performed on.

field element
 * Attribute {{{name}}}: In DOMS always either recordID or shortformat.
 * Contents are the PID for recordID, or XML for shortformat.

The XML for shortformat is of the following form:
{{{
<shortrecord>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
    <rdf:Description>
      <dc:title>...</dc:title>
      <dc:creator>...</dc:creator>
      <dc:date>...</dc:date>
      <dc:type xml:lang="da">netdokument</dc:type>
      <dc:type xml:lang="en">net document</dc:type>
      <dc:identifier>...</dc:identifier>
      ...
    </rdf:Description>
  </rdf:RDF>
</shortrecord>
}}}

The important elements are the "dc" fields. They will contain the actual results.

[[Anchor(example)]]
== Result XML Example ==
This example is the same as the one given by the [http://wiki.statsbiblioteket.dk/summa/Community/Tutorials/MinimalDeployment Summa Minimal Deployment Tutorial], except without the facet result response.
##New Example:
{{{
<?xml version="1.0" encoding="UTF-8" ?>
Line 55: Line 115:
<documentresult query="Hans" startIndex="0" maxRecords="20"
sortKey="summa-score" reverseSort="false" fields="recordID, shortformat"
searchTime="105" hitCount="1">
  <record score="0.37572" id="0" source="NA">
    <field name="recordID">fagref:hj@example.com</field>
    <field name="shortformat">&amp;lt;shortrecord&amp;gt;
&amp;lt;rdf:RDF
xmlns:rdf=&amp;quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;&gt;
&amp;lt;rdf:Description&amp;gt;
&amp;lt;dc:title&amp;gt;Fagekspert i Datalogi&amp;lt;/dc:title&amp;gt;
&amp;lt;dc:creator&amp;gt;Hans Jensen&amp;lt;/dc:creator&amp;gt;
&amp;lt;dc:type
xml:lang=&amp;quot;da&amp;quot;&amp;gt;person&amp;lt;/dc:type&amp;gt;
&amp;lt;dc:type
xml:lang=&amp;quot;en&amp;quot;&amp;gt;person&amp;lt;/dc:type&amp;gt;
&amp;lt;dc:identifier&amp;gt;hj@example.com&amp;lt;/dc:identifier&amp;gt;
&amp;lt;/rdf:Description&amp;gt;
&amp;lt;/rdf:RDF&amp;gt;
&amp;lt;/shortrecord&amp;gt;</field>
<documentresult query="narrative" startIndex="0" maxRecords="20" sortKey="summa-score" reverseSort="false" fields="main_titel, lsubject, lsu_oai, author_normalised, recordID, shortformat" searchTime="8" hitCount="2">
  <record score="0.20924361" id="122" source="NA">
    <field name="main_titel">Pensare per immagini: una strada per la coscienza</field>
    <field name="lsubject">NoSubject</field>
    <field name="lsu_oai">NoOAI</field>
    <field name="author_normalised">Ferdinando Testa</field>
    <field name="recordID">oai:oai:doaj-articles:badd9ac32fc2e096cf76fec4f0d19250</field>
    <field name="shortformat"><shortrecord>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description>
<dc:title xmlns:dc="http://purl.org/dc/elements/1.1/">Pensare per immagini: una strada per la coscienza</dc:title>
<dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Ferdinando Testa</dc:creator>
<dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2005</dc:date>
<dc:type xml:lang="da" xmlns:dc="http://purl.org/dc/elements/1.1/">netdokument</dc:type>
<dc:type xml:lang="en" xmlns:dc="http://purl.org/dc/elements/1.1/">net document</dc:type>
<dc:identifier xmlns:dc="http://purl.org/dc/elements/1.1/">http://www.analisiqualitativa.com/magma/0304/articolo_01.htm</dc:identifier>
<dc:identifier xmlns:dc="http://purl.org/dc/elements/1.1/">http://www.doaj.org/doaj?func=openurl&amp;genre=article&amp;issn=17219809&amp;date=2005&amp;volume=03&amp;issue=04&amp;spage=</dc:identifier>
<dc:format xmlns:dc="http://purl.org/dc/elements/1.1/">todo</dc:format>
</rdf:Description>
</rdf:RDF>
</shortrecord></field>
  </record>
  <record score="0.20924361" id="149" source="NA">
    <field name="main_titel">La narrazione: dimensione ontologica della formazione</field>
    <field name="lsubject">NoSubject</field>
    <field name="lsu_oai">NoOAI</field>
    <field name="author_normalised">Francesca Pulvirenti</field>
    <field name="recordID">oai:oai:doaj-articles:dd2dffe34df1293e045aee58f06a5c3f</field>
    <field name="shortformat"><shortrecord>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description>
<dc:title xmlns:dc="http://purl.org/dc/elements/1.1/">La narrazione: dimensione ontologica della formazione</dc:title>
<dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Francesca Pulvirenti</dc:creator>
<dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2005</dc:date>
<dc:type xml:lang="da" xmlns:dc="http://purl.org/dc/elements/1.1/">netdokument</dc:type>
<dc:type xml:lang="en" xmlns:dc="http://purl.org/dc/elements/1.1/">net document</dc:type>
<dc:identifier xmlns:dc="http://purl.org/dc/elements/1.1/">http://www.analisiqualitativa.com/magma/0303/editoriale.htm</dc:identifier>
<dc:identifier xmlns:dc="http://purl.org/dc/elements/1.1/">http://www.doaj.org/doaj?func=openurl&amp;genre=article&amp;issn=17219809&amp;date=2005&amp;volume=03&amp;issue=03&amp;spage=</dc:identifier>
<dc:format xmlns:dc="http://purl.org/dc/elements/1.1/">todo</dc:format>
</rdf:Description>
</rdf:RDF>
</shortrecord></field>
Line 80: Line 163:
Currently, we don't have a schema for the result set.

The result can be read as follows:

documentresult element
 * Attributes query, startIndex, maxRecords, sortKey, reverseSort: Same as input to method
 * Attribute fields: Always "recordID, shortformat"
 * Attribute searchTime: Time it took to search
 * Attribute hitCount: Number of results

record element
 * Attribute score: value from 0 to 1 with relevancy ranking
 * Attributes id, name: Not used

field element
 * Attribute name: Always either recordID or shortformat
 * Contents are the PID for recordID, or XML for shortformat.

The format of the "shortformat" field is (decoded version of contents above):

{{{
<shortrecord>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description>
<dc:title>Fagekspert i Datalogi</dc:title>
<dc:creator>Hans Jensen</dc:creator>
<dc:type
xml:lang="da">person</dc:type>
<dc:type
xml:lang="en">person</dc:type>
<dc:identifier>hj@example.com</dc:identifier>
</rdf:Description>
</rdf:RDF>
</shortrecord>
}}}

Important elements are the "dc" fields. They will contain the actual results.

[[Anchor(example)]]
== Result XML Example ==

##Old Example:
##{{{
##<responsecollection>
##<response name="DocumentResponse">
##<documentresult query="Hans" startIndex="0" maxRecords="20"
##sortKey="summa-score" reverseSort="false" fields="recordID, shortformat"
##searchTime="105" hitCount="1">
## <record score="0.37572" id="0" source="NA">
## <field name="recordID">fagref:hj@example.com</field>
## <field name="shortformat">&amp;lt;shortrecord&amp;gt;
##&amp;lt;rdf:RDF
##xmlns:rdf=&amp;quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;&gt;
##&amp;lt;rdf:Description&amp;gt;
##&amp;lt;dc:title&amp;gt;Fagekspert i Datalogi&amp;lt;/dc:title&amp;gt;
##&amp;lt;dc:creator&amp;gt;Hans Jensen&amp;lt;/dc:creator&amp;gt;
##&amp;lt;dc:type
##xml:lang=&amp;quot;da&amp;quot;&amp;gt;person&amp;lt;/dc:type&amp;gt;
##&amp;lt;dc:type
##xml:lang=&amp;quot;en&amp;quot;&amp;gt;person&amp;lt;/dc:type&amp;gt;
##&amp;lt;dc:identifier&amp;gt;hj@example.com&amp;lt;/dc:identifier&amp;gt;
##&amp;lt;/rdf:Description&amp;gt;
##&amp;lt;/rdf:RDF&amp;gt;
##&amp;lt;/shortrecord&amp;gt;</field>
## </record>
##</documentresult>
##</response>
##</responsecollection>
##}}}
##
##The format of the "shortformat" field is (decoded version of contents above):
##
##{{{
##<shortrecord>
##<rdf:RDF
##xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
##<rdf:Description>
##<dc:title>Fagekspert i Datalogi</dc:title>
##<dc:creator>Hans Jensen</dc:creator>
##<dc:type
##xml:lang="da">person</dc:type>
##<dc:type
##xml:lang="en">person</dc:type>
##<dc:identifier>hj@example.com</dc:identifier>
##</rdf:Description>
##</rdf:RDF>
##</shortrecord>
##}}}

Search API

DOMS Search uses simple search methods of the Summa Search interface.

WSDL: attachment:DomsGUISearch.wsdl

Content of this page:

  • [#operations Operations]
  • [#resultXML Result XML Description]
  • [#example Result XML Example]


Operations

simpleSearch

This method executes the given query and returns a search result ranked by relevance.

Input parameters:

  • String query The query string.

  • int numberOfRecords The maximum number of records to return in the search result.

  • int startIndex The index of the first record to return.

Returns:

  • String simpleSearchReturn The search result, sorted by relevance, as a structured XML document. See [#resultXML description] below.

Throws:

  • java.rmi.RemoteException

simpleSearchSorted

This method executes the given query and returns a search result sorted by the given sort key.

Input parameters:

  • String query The query string.

  • int numberOfRecords The maximum number of records to return in the search result.

  • int startIndex The index of the first record to return.

  • String sortKey The key to sort by.

  • boolean reverse Whether to sort in reverse order.

Returns:

  • String simpleSearchReturn The search result, sorted by the given key (reversed if so indicated), as a structured XML document. See [#resultXML description] below.

Throws:

  • java.rmi.RemoteException
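
Both operations page through a result with startIndex and numberOfRecords. The following is a minimal sketch in Python of walking all pages, not part of the DOMS API itself. It assumes a caller-supplied function simple_search (a hypothetical stand-in for whatever client is generated from the WSDL) that takes the same three inputs and returns the result string, and it assumes startIndex counts from 0, as in the example result on this page.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterator

# Hypothetical stand-in for the simpleSearch operation:
# (query, numberOfRecords, startIndex) -> result XML string.
SimpleSearch = Callable[[str, int, int], str]


def iter_pages(simple_search: SimpleSearch, query: str,
               page_size: int = 20) -> Iterator[str]:
    """Yield one result XML string per page until hitCount is exhausted."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    start_index = 0  # assumption: the first record has index 0
    while True:
        xml_text = simple_search(query, page_size, start_index)
        yield xml_text
        root = ET.fromstring(xml_text)
        # iter() also matches the root itself, so a bare <documentresult> works.
        doc = next(root.iter("documentresult"), None)
        hit_count = int(doc.get("hitCount", "0")) if doc is not None else 0
        start_index += page_size
        if start_index >= hit_count:
            break
```

With a page size of 20 and a hitCount of 45, this sketch would request startIndex 0, 20 and 40 and then stop.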


Result XML Description

The result string defined by Summa is XML, in the following form:

<?xml version="1.0" encoding="UTF-8" ?>  
<responsecollection>  
  <response>  response-xml-1 </response>  
  <response>  response-xml-2 </response>  
  ... 
</responsecollection>
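
As a rough illustration of reading this envelope, the sketch below uses Python's standard xml.etree.ElementTree to collect the documentresult elements from a response collection and skip any other response types. The function name document_results is an assumption made for this example, and the sketch assumes the collection is closed with a proper end tag.

```python
import xml.etree.ElementTree as ET
from typing import List


def document_results(result_xml: str) -> List[ET.Element]:
    """Return every <documentresult> found directly inside a <response>."""
    root = ET.fromstring(result_xml)
    found = []
    for response in root.iter("response"):
        # Responses of other kinds have no <documentresult> child
        # and therefore contribute nothing.
        found.extend(response.findall("documentresult"))
    return found
```

An empty list means the collection contained no document response at all, which a caller may want to treat differently from a document response with hitCount="0".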

Possible responses include document responses, facet results and others. DOMS uses only the document response, which looks like this:

<documentresult filter="..." query="..." startIndex="..." maxRecords="..." sortKey="..." 
                reverseSort="..." fields="..." searchTime="..." hitCount="...">  
  <record score="..." sortValue="...">  
    <field name="recordID">...</field>  
    <field name="shortformat">...</field>  
  </record>  
  ... 
</documentresult>

Currently, we do not have a schema for the result. The result can be read as follows:

documentresult element

  • Attribute filter is not used in simple search results.

  • Attributes query, startIndex, maxRecords, sortKey, reverseSort: Same as input to method.

  • Attribute fields: Always "recordID, shortformat" in DOMS.

  • Attribute searchTime: Time it took to search.

  • Attribute hitCount: Number of results.

record element

  • Attribute score: Relevance score, a value from 0 to 1.

  • Attribute sortValue is the value that the sort was performed on.

field element

  • Attribute name: In DOMS always either recordID or shortformat.

  • Contents are the PID for recordID, or XML for shortformat.
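
The attribute and element descriptions above can be turned into a small reader. This is a sketch only: the Record class and the function parse_document_result are names invented for this example, and since no schema exists, missing attributes simply fall back to defaults.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Record:
    score: float                      # relevance score, 0 to 1
    fields: Dict[str, str] = field(default_factory=dict)


def parse_document_result(doc: ET.Element) -> Tuple[int, List[Record]]:
    """Return (hitCount, records) for one <documentresult> element."""
    hit_count = int(doc.get("hitCount", "0"))
    records = []
    for rec in doc.findall("record"):
        # Map each field name (recordID, shortformat, ...) to its text content.
        # The shortformat value is itself XML and needs further parsing.
        values = {f.get("name"): (f.text or "") for f in rec.findall("field")}
        records.append(Record(score=float(rec.get("score", "0")), fields=values))
    return hit_count, records
```

Note that hitCount is the total number of results, so it can be larger than the number of record elements returned in a single page.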

The XML for shortformat is of the following form:

<shortrecord>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
    <rdf:Description>
      <dc:title>...</dc:title>
      <dc:creator>...</dc:creator>
      <dc:date>...</dc:date>
      <dc:type xml:lang="da">netdokument</dc:type>
      <dc:type xml:lang="en">net document</dc:type>
      <dc:identifier>...</dc:identifier>
      ...
    </rdf:Description>
  </rdf:RDF>
</shortrecord>

The important elements are the "dc" (Dublin Core) elements; they contain the actual result data.
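
A minimal sketch of pulling the "dc" values out of a shortrecord, again with Python's xml.etree.ElementTree. The function name dc_fields is an assumption for this example, and the sketch assumes the dc prefix is bound to the Dublin Core namespace as in the form shown above. Values are collected in lists because an element such as dc:type or dc:identifier can occur more than once.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List

DC_NS = "http://purl.org/dc/elements/1.1/"


def dc_fields(shortrecord_xml: str) -> Dict[str, List[str]]:
    """Map each dc element name (title, creator, ...) to its list of values."""
    root = ET.fromstring(shortrecord_xml)
    prefix = "{" + DC_NS + "}"        # ElementTree writes tags as {namespace}local
    values = defaultdict(list)
    for elem in root.iter():
        if isinstance(elem.tag, str) and elem.tag.startswith(prefix):
            local_name = elem.tag[len(prefix):]
            values[local_name].append((elem.text or "").strip())
    return dict(values)
```

Matching on the namespace URI rather than on the literal "dc:" prefix keeps the sketch working whether the namespace is declared once on rdf:RDF or repeated on every element.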


Result XML Example

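This example is the same as the one given by the [http://wiki.statsbiblioteket.dk/summa/Community/Tutorials/MinimalDeployment Summa Minimal Deployment Tutorial], except without the facet result response. Because it comes from the Summa tutorial, its fields attribute lists more fields than the two used in DOMS.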

<?xml version="1.0" encoding="UTF-8" ?>
<responsecollection>
<response name="DocumentResponse">
<documentresult query="narrative" startIndex="0" maxRecords="20" sortKey="summa-score" reverseSort="false" fields="main_titel, lsubject, lsu_oai, author_normalised, recordID, shortformat" searchTime="8" hitCount="2">
  <record score="0.20924361" id="122" source="NA">
    <field name="main_titel">Pensare per immagini: una strada per la coscienza</field>
    <field name="lsubject">NoSubject</field>
    <field name="lsu_oai">NoOAI</field>
    <field name="author_normalised">Ferdinando Testa</field>
    <field name="recordID">oai:oai:doaj-articles:badd9ac32fc2e096cf76fec4f0d19250</field>
    <field name="shortformat"><shortrecord>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description>
<dc:title xmlns:dc="http://purl.org/dc/elements/1.1/">Pensare per immagini: una strada per la coscienza</dc:title>
<dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Ferdinando Testa</dc:creator>
<dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2005</dc:date>
<dc:type xml:lang="da" xmlns:dc="http://purl.org/dc/elements/1.1/">netdokument</dc:type>
<dc:type xml:lang="en" xmlns:dc="http://purl.org/dc/elements/1.1/">net document</dc:type>
<dc:identifier xmlns:dc="http://purl.org/dc/elements/1.1/">http://www.analisiqualitativa.com/magma/0304/articolo_01.htm</dc:identifier>
<dc:identifier xmlns:dc="http://purl.org/dc/elements/1.1/">http://www.doaj.org/doaj?func=openurl&amp;genre=article&amp;issn=17219809&amp;date=2005&amp;volume=03&amp;issue=04&amp;spage=</dc:identifier>
<dc:format xmlns:dc="http://purl.org/dc/elements/1.1/">todo</dc:format>
</rdf:Description>
</rdf:RDF>
</shortrecord></field>
  </record>
  <record score="0.20924361" id="149" source="NA">
    <field name="main_titel">La narrazione: dimensione ontologica della formazione</field>
    <field name="lsubject">NoSubject</field>
    <field name="lsu_oai">NoOAI</field>
    <field name="author_normalised">Francesca Pulvirenti</field>
    <field name="recordID">oai:oai:doaj-articles:dd2dffe34df1293e045aee58f06a5c3f</field>
    <field name="shortformat"><shortrecord>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description>
<dc:title xmlns:dc="http://purl.org/dc/elements/1.1/">La narrazione: dimensione ontologica della formazione</dc:title>
<dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Francesca Pulvirenti</dc:creator>
<dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2005</dc:date>
<dc:type xml:lang="da" xmlns:dc="http://purl.org/dc/elements/1.1/">netdokument</dc:type>
<dc:type xml:lang="en" xmlns:dc="http://purl.org/dc/elements/1.1/">net document</dc:type>
<dc:identifier xmlns:dc="http://purl.org/dc/elements/1.1/">http://www.analisiqualitativa.com/magma/0303/editoriale.htm</dc:identifier>
<dc:identifier xmlns:dc="http://purl.org/dc/elements/1.1/">http://www.doaj.org/doaj?func=openurl&amp;genre=article&amp;issn=17219809&amp;date=2005&amp;volume=03&amp;issue=03&amp;spage=</dc:identifier>
<dc:format xmlns:dc="http://purl.org/dc/elements/1.1/">todo</dc:format>
</rdf:Description>
</rdf:RDF>
</shortrecord></field>
  </record>
</documentresult>
</response>
</responsecollection>

Search API (last edited 2010-03-17 13:09:38 by localhost)