Revision 1 as of 2008-06-26 12:26:10
Size: 2109
Comment: Created by the PackagePages action.
Search API
Fedora can be searched with a simple web service method: pass a query string and get the results back encoded in XML.
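As a rough illustration of the call, the sketch below builds a SOAP 1.1 request for the simpleSearch operation. The element names and the namespace are taken from the WSDL on this page; the function name build_simple_search_request and its default values are illustrative assumptions only, and actually sending the request is left out.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SUMMA_NS = "http://statsbiblioteket.dk/summa/search"  # targetNamespace of the WSDL


def build_simple_search_request(query, number_of_records=20, start_index=0):
    """Return a SOAP 1.1 envelope (bytes) for the simpleSearch operation.

    Hypothetical helper: the name and the defaults are not part of the service.
    """
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{SUMMA_NS}}}simpleSearch")
    # Child order follows the xsd:sequence declared in the WSDL.
    ET.SubElement(op, f"{{{SUMMA_NS}}}query").text = query
    ET.SubElement(op, f"{{{SUMMA_NS}}}numberOfRecords").text = str(number_of_records)
    ET.SubElement(op, f"{{{SUMMA_NS}}}startIndex").text = str(start_index)
    return ET.tostring(envelope, encoding="utf-8")
```

The resulting bytes would be POSTed to the service address given in the WSDL with Content-Type text/xml and an empty SOAPAction header, since the binding declares soapAction="".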
WSDL:
<?xml version="1.0" encoding="UTF-8"?>
<wsdl:definitions targetNamespace="http://statsbiblioteket.dk/summa/search" xmlns:apachesoap="http://xml.apache.org/xml-soap" xmlns:impl="http://statsbiblioteket.dk/summa/search" xmlns:intf="http://statsbiblioteket.dk/summa/search" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/" xmlns:wsdlsoap="http://schemas.xmlsoap.org/wsdl/soap/" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<!--WSDL created by Apache Axis version: 1.4
Built on Apr 22, 2006 (06:55:48 PDT)-->
<wsdl:types>
<schema elementFormDefault="qualified" targetNamespace="http://statsbiblioteket.dk/summa/search" xmlns="http://www.w3.org/2001/XMLSchema">
<element name="simpleSearch">
<complexType>
<sequence>
<element name="query" type="xsd:string"/>
<element name="numberOfRecords" type="xsd:int"/>
<element name="startIndex" type="xsd:int"/>
</sequence>
</complexType>
</element>
<element name="simpleSearchResponse">
<complexType>
<sequence>
<element name="simpleSearchReturn" type="xsd:string"/>
</sequence>
</complexType>
</element>
<element name="simpleSearchSorted">
<complexType>
<sequence>
<element name="query" type="xsd:string"/>
<element name="numberOfRecords" type="xsd:int"/>
<element name="startIndex" type="xsd:int"/>
<element name="sortKey" type="xsd:string"/>
<element name="reverse" type="xsd:boolean"/>
</sequence>
</complexType>
</element>
<element name="simpleSearchSortedResponse">
<complexType>
<sequence>
<element name="simpleSearchSortedReturn" type="xsd:string"/>
</sequence>
</complexType>
</element>
</schema>
</wsdl:types>
<wsdl:message name="simpleSearchSortedRequest">
<wsdl:part element="impl:simpleSearchSorted" name="parameters"/>
</wsdl:message>
<wsdl:message name="simpleSearchRequest">
<wsdl:part element="impl:simpleSearch" name="parameters"/>
</wsdl:message>
<wsdl:message name="simpleSearchSortedResponse">
<wsdl:part element="impl:simpleSearchSortedResponse" name="parameters"/>
</wsdl:message>
<wsdl:message name="simpleSearchResponse">
<wsdl:part element="impl:simpleSearchResponse" name="parameters"/>
</wsdl:message>
<wsdl:portType name="SearchWS">
<wsdl:operation name="simpleSearch">
<wsdl:documentation xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
This method executes the given query and returns a search result ranked by relevancy.
query: The query string.
numberOfRecords: The maximum number of records returned in the search result.
startIndex: The number of the first record to return.
</wsdl:documentation>
<wsdl:input message="impl:simpleSearchRequest" name="simpleSearchRequest"/>
<wsdl:output message="impl:simpleSearchResponse" name="simpleSearchResponse"/>
</wsdl:operation>
<wsdl:operation name="simpleSearchSorted">
<wsdl:documentation xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
This method executes the given query and returns a search result ranked by the given sort key.
query: The query string.
numberOfRecords: The maximum number of records returned in the search result.
startIndex: The number of the first record to return.
sortKey: The key to sort by.
reverse: A boolean indication whether or not to sort in reverse.
</wsdl:documentation>
<wsdl:input message="impl:simpleSearchSortedRequest" name="simpleSearchSortedRequest"/>
<wsdl:output message="impl:simpleSearchSortedResponse" name="simpleSearchSortedResponse"/>
</wsdl:operation>
</wsdl:portType>
<wsdl:binding name="SearchWSSoapBinding" type="impl:SearchWS">
<wsdlsoap:binding style="document" transport="http://schemas.xmlsoap.org/soap/http"/>
<wsdl:operation name="simpleSearch">
<wsdlsoap:operation soapAction=""/>
<wsdl:input name="simpleSearchRequest">
<wsdlsoap:body use="literal"/>
</wsdl:input>
<wsdl:output name="simpleSearchResponse">
<wsdlsoap:body use="literal"/>
</wsdl:output>
</wsdl:operation>
<wsdl:operation name="simpleSearchSorted">
<wsdlsoap:operation soapAction=""/>
<wsdl:input name="simpleSearchSortedRequest">
<wsdlsoap:body use="literal"/>
</wsdl:input>
<wsdl:output name="simpleSearchSortedResponse">
<wsdlsoap:body use="literal"/>
</wsdl:output>
</wsdl:operation>
</wsdl:binding>
<wsdl:service name="SearchWSService">
<wsdl:documentation xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
Search web service for the Summa system </wsdl:documentation>
<wsdl:port binding="impl:SearchWSSoapBinding" name="SearchWS">
<wsdlsoap:address location="http://localhost:8080/summa-web-search/services/SearchWS"/>
</wsdl:port>
</wsdl:service>
</wsdl:definitions>
The returned string (simpleSearchReturn or simpleSearchSortedReturn) is itself XML, in the following form:
<responsecollection>
<response name="DocumentResponse">
<documentresult query="Hans" startIndex="0" maxRecords="20"
sortKey="summa-score" reverseSort="false" fields="recordID, shortformat"
searchTime="105" hitCount="1">
<record score="0.37572" id="0" source="NA">
<field name="recordID">fagref:hj@example.com</field>
<field name="shortformat">&lt;shortrecord&gt;
&lt;rdf:RDF
xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;&gt;
&lt;rdf:Description&gt;
&lt;dc:title&gt;Fagekspert i Datalogi&lt;/dc:title&gt;
&lt;dc:creator&gt;Hans Jensen&lt;/dc:creator&gt;
&lt;dc:type
xml:lang=&quot;da&quot;&gt;person&lt;/dc:type&gt;
&lt;dc:type
xml:lang=&quot;en&quot;&gt;person&lt;/dc:type&gt;
&lt;dc:identifier&gt;hj@example.com&lt;/dc:identifier&gt;
&lt;/rdf:Description&gt;
&lt;/rdf:RDF&gt;
&lt;/shortrecord&gt;</field>
</record>
</documentresult>
</response>
</responsecollection>
Currently, we don't have a schema for the result set.
The result can be read as follows:
documentresult element
- Attributes query, startIndex, maxRecords, sortKey, reverseSort: Same as the input to the method (maxRecords corresponds to numberOfRecords, reverseSort to reverse)
- Attribute fields: Always "recordID, shortformat"
- Attribute searchTime: Time it took to search
- Attribute hitCount: Number of results
record element
- Attribute score: value from 0 to 1 with relevancy ranking
- Attributes id, source: Not used
field element
- Attribute name: Always either recordID or shortformat
- Contents: the PID for recordID, or an XML-escaped record for shortformat.
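The reading rules above can be sketched in Python as follows. This is a minimal sketch that assumes the response has the shape of the example on this page; the function name parse_search_result and the dictionary keys are made up for illustration, and since no schema exists for the result set, real responses may need more defensive handling.

```python
import xml.etree.ElementTree as ET


def parse_search_result(result_xml):
    """Return (summary, records) from the XML string returned by the search.

    Hypothetical helper; assumes the structure of the example response.
    """
    root = ET.fromstring(result_xml)
    doc = root.find(".//documentresult")
    if doc is None:
        raise ValueError("no documentresult element in response")
    summary = {
        "query": doc.get("query"),
        "start_index": int(doc.get("startIndex")),
        "max_records": int(doc.get("maxRecords")),
        "hit_count": int(doc.get("hitCount")),
    }
    records = []
    for rec in doc.findall("record"):
        # The parser already unescapes the shortformat contents.
        fields = {f.get("name"): f.text or "" for f in rec.findall("field")}
        records.append({
            "score": float(rec.get("score")),
            "pid": fields.get("recordID"),
            "shortformat": fields.get("shortformat"),
        })
    return summary, records
```

Because the XML parser unescapes the field contents, the shortformat value comes back as a plain XML string that can be parsed in a second step.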
The format of the "shortformat" field is (decoded version of contents above):
<shortrecord>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description>
      <dc:title>Fagekspert i Datalogi</dc:title>
      <dc:creator>Hans Jensen</dc:creator>
      <dc:type xml:lang="da">person</dc:type>
      <dc:type xml:lang="en">person</dc:type>
      <dc:identifier>hj@example.com</dc:identifier>
    </rdf:Description>
  </rdf:RDF>
</shortrecord>
The important elements are the "dc" fields, which contain the actual results.
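A minimal sketch of pulling the "dc" fields out of a decoded shortformat string is given below. Note that the example record does not declare the dc prefix, so a namespace-aware parser would reject it as written; the sketch works around this by binding the prefix to the Dublin Core element namespace, which is an assumption about what "dc" stands for here. The function name extract_dc_fields is likewise only illustrative.

```python
import xml.etree.ElementTree as ET

# Assumption: "dc" refers to the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def extract_dc_fields(shortformat):
    """Return a list of (name, lang, value) tuples for the dc elements."""
    if "xmlns:dc=" not in shortformat:
        # Bind the undeclared dc prefix on the root element so the
        # document can be parsed.
        shortformat = shortformat.replace(
            "<shortrecord", f'<shortrecord xmlns:dc="{DC_NS}"', 1)
    root = ET.fromstring(shortformat)
    prefix = f"{{{DC_NS}}}"
    return [
        (el.tag[len(prefix):], el.get(XML_LANG), (el.text or "").strip())
        for el in root.iter()
        if el.tag.startswith(prefix)
    ]
```

Returning a list of tuples rather than a dictionary keeps repeated elements, such as the two dc:type entries that differ only in xml:lang.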