Revision 1 as of 2008-06-26 12:26:10
Search API
Fedora can be searched through a simple web service method: pass in a query string and get the results back as a string of XML.
WSDL:
<?xml version="1.0" encoding="UTF-8"?>
<wsdl:definitions targetNamespace="http://statsbiblioteket.dk/summa/search"
    xmlns:apachesoap="http://xml.apache.org/xml-soap"
    xmlns:impl="http://statsbiblioteket.dk/summa/search"
    xmlns:intf="http://statsbiblioteket.dk/summa/search"
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:wsdlsoap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!--WSDL created by Apache Axis version: 1.4 Built on Apr 22, 2006 (06:55:48 PDT)-->
  <wsdl:types>
    <schema elementFormDefault="qualified" targetNamespace="http://statsbiblioteket.dk/summa/search" xmlns="http://www.w3.org/2001/XMLSchema">
      <element name="simpleSearch">
        <complexType>
          <sequence>
            <element name="query" type="xsd:string"/>
            <element name="numberOfRecords" type="xsd:int"/>
            <element name="startIndex" type="xsd:int"/>
          </sequence>
        </complexType>
      </element>
      <element name="simpleSearchResponse">
        <complexType>
          <sequence>
            <element name="simpleSearchReturn" type="xsd:string"/>
          </sequence>
        </complexType>
      </element>
      <element name="simpleSearchSorted">
        <complexType>
          <sequence>
            <element name="query" type="xsd:string"/>
            <element name="numberOfRecords" type="xsd:int"/>
            <element name="startIndex" type="xsd:int"/>
            <element name="sortKey" type="xsd:string"/>
            <element name="reverse" type="xsd:boolean"/>
          </sequence>
        </complexType>
      </element>
      <element name="simpleSearchSortedResponse">
        <complexType>
          <sequence>
            <element name="simpleSearchSortedReturn" type="xsd:string"/>
          </sequence>
        </complexType>
      </element>
    </schema>
  </wsdl:types>
  <wsdl:message name="simpleSearchSortedRequest">
    <wsdl:part element="impl:simpleSearchSorted" name="parameters"/>
  </wsdl:message>
  <wsdl:message name="simpleSearchRequest">
    <wsdl:part element="impl:simpleSearch" name="parameters"/>
  </wsdl:message>
  <wsdl:message name="simpleSearchSortedResponse">
    <wsdl:part element="impl:simpleSearchSortedResponse" name="parameters"/>
  </wsdl:message>
  <wsdl:message name="simpleSearchResponse">
    <wsdl:part element="impl:simpleSearchResponse" name="parameters"/>
  </wsdl:message>
  <wsdl:portType name="SearchWS">
    <wsdl:operation name="simpleSearch">
      <wsdl:documentation xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
        This method executes the given query and returns a search result ranked by relevancy.
        query: The query string.
        numberOfRecords: The maximum number of records returned in the search result.
        startIndex: The number of the first record to return.
      </wsdl:documentation>
      <wsdl:input message="impl:simpleSearchRequest" name="simpleSearchRequest"/>
      <wsdl:output message="impl:simpleSearchResponse" name="simpleSearchResponse"/>
    </wsdl:operation>
    <wsdl:operation name="simpleSearchSorted">
      <wsdl:documentation xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
        This method executes the given query and returns a search result ranked by the given sort key.
        query: The query string.
        numberOfRecords: The maximum number of records returned in the search result.
        startIndex: The number of the first record to return.
        sortKey: The key to sort by.
        reverse: A boolean indication whether or not to sort in reverse.
      </wsdl:documentation>
      <wsdl:input message="impl:simpleSearchSortedRequest" name="simpleSearchSortedRequest"/>
      <wsdl:output message="impl:simpleSearchSortedResponse" name="simpleSearchSortedResponse"/>
    </wsdl:operation>
  </wsdl:portType>
  <wsdl:binding name="SearchWSSoapBinding" type="impl:SearchWS">
    <wsdlsoap:binding style="document" transport="http://schemas.xmlsoap.org/soap/http"/>
    <wsdl:operation name="simpleSearch">
      <wsdlsoap:operation soapAction=""/>
      <wsdl:input name="simpleSearchRequest">
        <wsdlsoap:body use="literal"/>
      </wsdl:input>
      <wsdl:output name="simpleSearchResponse">
        <wsdlsoap:body use="literal"/>
      </wsdl:output>
    </wsdl:operation>
    <wsdl:operation name="simpleSearchSorted">
      <wsdlsoap:operation soapAction=""/>
      <wsdl:input name="simpleSearchSortedRequest">
        <wsdlsoap:body use="literal"/>
      </wsdl:input>
      <wsdl:output name="simpleSearchSortedResponse">
        <wsdlsoap:body use="literal"/>
      </wsdl:output>
    </wsdl:operation>
  </wsdl:binding>
  <wsdl:service name="SearchWSService">
    <wsdl:documentation xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
      Search web service for the Summa system
    </wsdl:documentation>
    <wsdl:port binding="impl:SearchWSSoapBinding" name="SearchWS">
      <wsdlsoap:address location="http://localhost:8080/summa-web-search/services/SearchWS"/>
    </wsdl:port>
  </wsdl:service>
</wsdl:definitions>
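As an illustration of how a client might call the two operations, here is a minimal Python sketch that builds a SOAP 1.1 document/literal request by hand. The namespace, element names, element order and endpoint URL are taken from the WSDL; the function names build_search_envelope and post_envelope are hypothetical, and a real client would more likely use stubs generated from the WSDL.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SUMMA_NS = "http://statsbiblioteket.dk/summa/search"   # targetNamespace from the WSDL
# Endpoint as listed in the WSDL; a deployed service will have a different address.
ENDPOINT = "http://localhost:8080/summa-web-search/services/SearchWS"


def build_search_envelope(query, number_of_records, start_index,
                          sort_key=None, reverse=False):
    """Build a simpleSearch request, or simpleSearchSorted if sort_key is given."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = "simpleSearch" if sort_key is None else "simpleSearchSorted"
    request = ET.SubElement(body, f"{{{SUMMA_NS}}}{operation}")

    # Child elements follow the order of the xsd:sequence in the WSDL.
    params = [
        ("query", query),
        ("numberOfRecords", str(number_of_records)),
        ("startIndex", str(start_index)),
    ]
    if sort_key is not None:
        params.append(("sortKey", sort_key))
        params.append(("reverse", "true" if reverse else "false"))
    for name, value in params:
        ET.SubElement(request, f"{{{SUMMA_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="utf-8")


def post_envelope(envelope, endpoint=ENDPOINT):
    """Send the request over HTTP and return the raw SOAP response bytes."""
    http_request = urllib.request.Request(
        endpoint,
        data=envelope,
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": '""',  # the binding declares soapAction=""
        },
    )
    with urllib.request.urlopen(http_request) as response:
        return response.read()
```

Calling post_envelope(build_search_envelope("Hans", 20, 0)) would request the first 20 hits for "Hans", assuming the service is running at the endpoint above. The XML result string is the content of the simpleSearchReturn element in the SOAP response.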
The result string is actually XML, in the following form:
<responsecollection>
  <response name="DocumentResponse">
    <documentresult query="Hans" startIndex="0" maxRecords="20" sortKey="summa-score" reverseSort="false" fields="recordID, shortformat" searchTime="105" hitCount="1">
      <record score="0.37572" id="0" source="NA">
        <field name="recordID">fagref:hj@example.com</field>
        <field name="shortformat">&lt;shortrecord&gt;
          &lt;rdf:RDF xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;&gt;
            &lt;rdf:Description&gt;
              &lt;dc:title&gt;Fagekspert i Datalogi&lt;/dc:title&gt;
              &lt;dc:creator&gt;Hans Jensen&lt;/dc:creator&gt;
              &lt;dc:type xml:lang=&quot;da&quot;&gt;person&lt;/dc:type&gt;
              &lt;dc:type xml:lang=&quot;en&quot;&gt;person&lt;/dc:type&gt;
              &lt;dc:identifier&gt;hj@example.com&lt;/dc:identifier&gt;
            &lt;/rdf:Description&gt;
          &lt;/rdf:RDF&gt;
        &lt;/shortrecord&gt;</field>
      </record>
    </documentresult>
  </response>
</responsecollection>
Currently, we don't have a schema for the result set.
The result can be read as follows:
documentresult element
- Attributes query, startIndex, maxRecords, sortKey, reverseSort: Same as input to method
- Attribute fields: Always "recordID, shortformat"
- Attribute searchTime: Time it took to search
- Attribute hitCount: Number of results
record element
- Attribute score: Relevance ranking, a value from 0 to 1
- Attributes id, source: Not used
field element
- Attribute name: Always either recordID or shortformat
- Contents are the PID for recordID, or XML for shortformat.
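The reading rules above can be sketched in Python as follows. This is only an illustration: the function name parse_search_result and the SAMPLE string are hypothetical, SAMPLE is a shortened copy of the example response with an empty shortrecord, and the sketch assumes real responses use the same element and attribute names.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example response; the shortformat content is left empty here.
SAMPLE = """<responsecollection>
  <response name="DocumentResponse">
    <documentresult query="Hans" startIndex="0" maxRecords="20"
        sortKey="summa-score" reverseSort="false"
        fields="recordID, shortformat" searchTime="105" hitCount="1">
      <record score="0.37572" id="0" source="NA">
        <field name="recordID">fagref:hj@example.com</field>
        <field name="shortformat">&lt;shortrecord&gt;&lt;/shortrecord&gt;</field>
      </record>
    </documentresult>
  </response>
</responsecollection>"""


def parse_search_result(result_xml):
    """Return the hit count and a list of records from a result string."""
    root = ET.fromstring(result_xml)
    if root.tag == "documentresult":
        doc = root
    else:
        doc = root.find(".//documentresult")
    if doc is None:
        raise ValueError("no documentresult element in the result")

    records = []
    for record in doc.findall("record"):
        # Each record carries one field per name: recordID and shortformat.
        fields = {f.get("name"): (f.text or "") for f in record.findall("field")}
        records.append({
            "score": float(record.get("score")),
            "recordID": fields.get("recordID"),
            # The parser has already unescaped &lt; and &gt;, so this is plain XML text.
            "shortformat": fields.get("shortformat"),
        })
    return {"hitCount": int(doc.get("hitCount")), "records": records}


result = parse_search_result(SAMPLE)
for rec in result["records"]:
    print(rec["recordID"], rec["score"])
```

Since hitCount is the total number of results while a single response holds at most numberOfRecords of them, a client can page through a larger result set by repeating the call with an increasing startIndex.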
The format of the "shortformat" field is (decoded version of contents above):
<shortrecord>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description>
      <dc:title>Fagekspert i Datalogi</dc:title>
      <dc:creator>Hans Jensen</dc:creator>
      <dc:type xml:lang="da">person</dc:type>
      <dc:type xml:lang="en">person</dc:type>
      <dc:identifier>hj@example.com</dc:identifier>
    </rdf:Description>
  </rdf:RDF>
</shortrecord>
The important elements are the Dublin Core ("dc") fields, which contain the actual result data.
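A minimal Python sketch of pulling the "dc" fields out of a decoded shortformat string is given below. Note that the example record uses the dc: prefix without declaring it, which a namespace-aware parser rejects. The sketch therefore binds the prefix on the root element when no declaration is present, assuming it stands for the standard Dublin Core element namespace (http://purl.org/dc/elements/1.1/). That namespace and the function name extract_dc_fields are assumptions, not something the service documents.

```python
import xml.etree.ElementTree as ET

# Assumption: dc: means the standard Dublin Core element set namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"

SHORTFORMAT = """<shortrecord>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description>
      <dc:title>Fagekspert i Datalogi</dc:title>
      <dc:creator>Hans Jensen</dc:creator>
      <dc:type xml:lang="da">person</dc:type>
      <dc:type xml:lang="en">person</dc:type>
      <dc:identifier>hj@example.com</dc:identifier>
    </rdf:Description>
  </rdf:RDF>
</shortrecord>"""


def extract_dc_fields(shortformat):
    """Return a dict mapping each dc element name to the list of its values."""
    if "xmlns:dc=" not in shortformat:
        # Workaround for the undeclared prefix: declare it on the root element.
        shortformat = shortformat.replace(
            "<shortrecord", f'<shortrecord xmlns:dc="{DC_NS}"', 1)
    root = ET.fromstring(shortformat)

    fields = {}
    prefix = f"{{{DC_NS}}}"
    for element in root.iter():
        if element.tag.startswith(prefix):
            name = element.tag[len(prefix):]
            # A field such as dc:type can repeat, so values are kept in a list.
            fields.setdefault(name, []).append(element.text or "")
    return fields


print(extract_dc_fields(SHORTFORMAT))
```

If real records declare dc: with a different namespace URI, DC_NS has to be changed to match, otherwise no fields will be found.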