QueryFilterTransformer Design

Problem Statement

Currently, the CSW Endpoint supports ingesting data in multiple schemas (typeNames). The query definition allows a typeNames attribute on the query, but the service does not honor it; by default it attempts to convert the data using csw:Record or urn:ddf:metacard taxonomy attributes.

The CSW Endpoint should support queries in other formats so that external systems can query data in the same formats in which data can be ingested or transformed for display.


Design Approach

Create a new QueryFilterTransformer interface as shown below.

An instance of the QueryFilterTransformer interface will be created for each supported typeName.

QueryFilterTransformer instances will be registered as services. The id property will be set to the same name used by the corresponding InputTransformer:

<entry key="id" value="gmd:MD_Metadata"/>

The CSW Endpoint (and other endpoints as required) will be modified to look up the QueryFilterTransformer by id, using the typeNames value from the query. The endpoint will call the existing filterBuilder logic to build a QueryRequest from the input criteria. The endpoint will then call the QueryFilterTransformer to transform the external attribute names and values into taxonomy names and values.
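
The lookup-then-transform flow described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the actual endpoint code: the QueryRequest class here is a simplified stand-in for ddf.catalog.operation.QueryRequest, the map stands in for the service registry, and TransformerRegistrySketch, register, and buildAndTransform are hypothetical names. Passing the request through unchanged when no transformer is registered is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class TransformerRegistrySketch {

    /** Simplified stand-in for the catalog QueryRequest; it only carries a filter expression. */
    public static final class QueryRequest {
        public final String filter;

        public QueryRequest(String filter) {
            this.filter = filter;
        }
    }

    /** Mirrors the interface proposed in this design. */
    public interface QueryFilterTransformer {
        QueryRequest transform(QueryRequest request);
    }

    // Stand-in for the service registry: key is the "id" service property.
    private final Map<String, QueryFilterTransformer> transformersById = new HashMap<>();

    public void register(String id, QueryFilterTransformer transformer) {
        transformersById.put(id, transformer);
    }

    public Optional<QueryFilterTransformer> lookup(String typeName) {
        return Optional.ofNullable(transformersById.get(typeName));
    }

    /**
     * Takes the request already built by the endpoint and hands it to the
     * transformer registered for the typeName. Assumption: when no transformer
     * is registered, the request is returned unchanged.
     */
    public QueryRequest buildAndTransform(String typeName, QueryRequest built) {
        return lookup(typeName)
                .map(transformer -> transformer.transform(built))
                .orElse(built);
    }
}
```

In this sketch the endpoint never inspects the external attribute names itself; it only selects a transformer by id, which keeps schema-specific knowledge inside each transformer.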


QueryFilterTransformer
QueryRequest transform(QueryRequest)

Transforms a ddf.catalog.operation.QueryRequest containing external attribute names and values into a QueryRequest normalized to the taxonomy. A QueryRequest is used so that the transformer can directly modify other properties (e.g. sourceId) based on the criteria, without requiring the endpoint to modify the request after conversion.

The transform method will use the FilterVisitor pattern and build the resulting Filter as the query is parsed.
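
As an illustration of the visitor approach, the sketch below rebuilds a filter tree while renaming external property names to taxonomy names. It is a simplified model under stated assumptions: the Filter, Visitor, PropertyEquals, and And types are hypothetical stand-ins written for this example (a real implementation would work against the catalog's own Filter types), and the name mapping is supplied by the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class FilterVisitorSketch {

    /** Hypothetical minimal filter node; accepts a visitor that produces a result. */
    public interface Filter {
        <R> R accept(Visitor<R> visitor);
    }

    /** Hypothetical visitor with one method per node type. */
    public interface Visitor<R> {
        R visitEquals(PropertyEquals filter);
        R visitAnd(And filter);
    }

    public static final class PropertyEquals implements Filter {
        public final String property;
        public final String value;

        public PropertyEquals(String property, String value) {
            this.property = property;
            this.value = value;
        }

        @Override
        public <R> R accept(Visitor<R> visitor) {
            return visitor.visitEquals(this);
        }

        @Override
        public String toString() {
            return property + " = '" + value + "'";
        }
    }

    public static final class And implements Filter {
        public final List<Filter> children;

        public And(List<Filter> children) {
            this.children = children;
        }

        @Override
        public <R> R accept(Visitor<R> visitor) {
            return visitor.visitAnd(this);
        }

        @Override
        public String toString() {
            return children.stream()
                    .map(Object::toString)
                    .collect(Collectors.joining(" AND ", "(", ")"));
        }
    }

    /** Builds a new filter tree during traversal, renaming mapped properties. */
    public static final class RenamingVisitor implements Visitor<Filter> {
        private final Map<String, String> externalToTaxonomy;

        public RenamingVisitor(Map<String, String> externalToTaxonomy) {
            this.externalToTaxonomy = externalToTaxonomy;
        }

        @Override
        public Filter visitEquals(PropertyEquals filter) {
            // Unmapped names pass through unchanged (an assumption of this sketch).
            String name = externalToTaxonomy.getOrDefault(filter.property, filter.property);
            return new PropertyEquals(name, filter.value);
        }

        @Override
        public Filter visitAnd(And filter) {
            List<Filter> rebuilt = new ArrayList<>();
            for (Filter child : filter.children) {
                rebuilt.add(child.accept(this));
            }
            return new And(rebuilt);
        }
    }
}
```

Because the visitor returns a new tree rather than mutating the input, the original filter remains available to the caller, for example for logging.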


Multiple typeNames

The CSW 2.0.2 specification describes typeNames as the following:

The typeNames parameter is a list of one or more names of queryable entities in the
catalogue's information model that may be constrained in the predicate of the query. In
the case of XML realization of the OGC core metadata properties (Subclause 10.2.5), the
element csw:Record is the only queryable entity. Other information models may include
more than one queryable component. For example, queryable components for the XML
realization of the ebRIM include rim:Service, rim:ExtrinsicObject and rim:Association.
In such cases the application profile shall describe how multiple typeNames values
should be processed.
In addition, all or some of the these queryable entity names may be specified in the query
to define which metadata record elements the query should present in the response to the
GetRecords operation.
NOTE The typeNames parameter is different from the Type core queryable metadata property defined in
Subclause 6.3. The typeNames parameter is composed of one or more names of queryable entities in the information
model of the catalogue. The core queryable Type is used to indicate the type or class of a resource being described by
the catalogue. Typically the value of the Type property is taken from some controlled vocabulary.


Is typeNames the correct field to use to specify the query format? Yes.

If typeNames is used to determine the query format, should more than one be supported? No; throw an UnsupportedQueryException in this case.

If more than one is present, what is the expected behavior? Throw an UnsupportedQueryException.
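
The single-typeName rule could be enforced with a check along these lines. This is a sketch only: the nested UnsupportedQueryException is a local stand-in so the example compiles on its own, TypeNamesCheckSketch and resolveTransformerId are hypothetical names, and rejecting an empty list is an assumption that the decisions above do not cover.

```java
import java.util.List;

public class TypeNamesCheckSketch {

    /** Local stand-in for the catalog's UnsupportedQueryException. */
    public static class UnsupportedQueryException extends Exception {
        public UnsupportedQueryException(String message) {
            super(message);
        }
    }

    /**
     * Returns the single typeName to use as the transformer id.
     * More than one typeName is rejected, per the decision above.
     * Assumption: an empty or missing list is rejected as well.
     */
    public static String resolveTransformerId(List<String> typeNames)
            throws UnsupportedQueryException {
        if (typeNames == null || typeNames.size() != 1) {
            throw new UnsupportedQueryException(
                    "Exactly one typeName is supported, got: " + typeNames);
        }
        return typeNames.get(0);
    }
}
```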


Reference Documents