Person Directory 2.0 Design Notes

Here are my notes from some internal design work on PD2.0. The primary goals are:

  • Simplify configuration. This will likely involve a custom Spring namespace handler to provide a more complete XML configuration language.
  • Improve lookup speed by adding an ExecutorService to allow parallel lookup of attributes from various sources.
  • Simplify the API: provide a true criteria API for complex searches in addition to the ability to look up attributes for a single user.
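As a rough illustration of the parallel-lookup goal, here is a minimal sketch built on `ExecutorService.invokeAll` with a timeout. The `AttributeSource` interface, the `lookupAll` method and the "first value wins, ignore failures" policy are assumptions made for this example, not the actual PD2.0 API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: none of these names come from the PD2.0 code base.
class ParallelLookupSketch {

    /** Stand-in for an attribute source: given a username, return that user's attributes. */
    interface AttributeSource {
        Map<String, Object> lookup(String username) throws Exception;
    }

    /** Queries every source in parallel and merges whatever finished within the timeout. */
    static Map<String, Object> lookupAll(ExecutorService executor,
                                         List<AttributeSource> sources,
                                         String username,
                                         long timeoutMillis) throws InterruptedException {
        List<Callable<Map<String, Object>>> tasks = new ArrayList<>();
        for (AttributeSource source : sources) {
            tasks.add(() -> source.lookup(username));
        }

        // invokeAll blocks until all tasks complete or the timeout expires;
        // tasks that have not completed by then are cancelled.
        List<Future<Map<String, Object>>> futures =
                executor.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);

        Map<String, Object> merged = new LinkedHashMap<>();
        for (Future<Map<String, Object>> future : futures) {
            if (future.isCancelled()) {
                continue; // timed out: this sketch simply ignores the source
            }
            try {
                // Sources are merged in list order; the first value for an attribute wins.
                for (Map.Entry<String, Object> e : future.get().entrySet()) {
                    merged.putIfAbsent(e.getKey(), e.getValue());
                }
            } catch (ExecutionException e) {
                // Source failed: ignored here; a real implementation would log or count it.
            }
        }
        return merged;
    }
}
```

Whether a timed-out or failed source should be ignored or should fail the whole lookup is a policy decision; the sketch hard-codes "ignore" only to stay short.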

Secondary goals:

  • Add JMX monitoring of performance of each attribute source.

 

QUESTIONS

  1. Are attribute names case-insensitive? YES, according to PD1.5 behavior.


api - public interface

Complex queries and multiple attribute sources

  • default root query object ORs its parts together?
  • break root query object up by OR clause?
  • maxResults?
  • the problem:
    • Given a query like (firstName=Jane && (isStudent=Y || lastName=Doe))
    • How do we handle sources that do not support all of the attributes in the query?
      • do a multi-pass query: first, query sources that support all of the attributes
      • second, query sources that support a subset of the attributes; during the merge, filter these results in code using the attributes that were not passed to the source
      • finally, query non-searchable sources
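To make the multi-pass idea concrete, here is a small sketch that decides which pass a source belongs to and which attributes would have to be filtered in code afterwards. It only looks at attribute names; the `Pass` enum and both method names are hypothetical, and a source that supports none of the query's attributes is treated as non-searchable for that query.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the multi-pass planning step; names are illustrative only.
class MultiPassSketch {

    enum Pass { ALL_ATTRIBUTES, SUBSET_OF_ATTRIBUTES, NON_SEARCHABLE }

    /** Decides which pass a source runs in, given the attributes it can search on. */
    static Pass classify(Set<String> sourceSearchable, Set<String> queryAttributes) {
        if (sourceSearchable.containsAll(queryAttributes)) {
            return Pass.ALL_ATTRIBUTES;        // pass 1: the source can evaluate the whole query
        }
        if (Collections.disjoint(sourceSearchable, queryAttributes)) {
            return Pass.NON_SEARCHABLE;        // pass 3: nothing in the query is usable here
        }
        return Pass.SUBSET_OF_ATTRIBUTES;      // pass 2: query with what it supports
    }

    /** Attributes that were not passed to the source and must be applied during the merge. */
    static Set<String> filterInCode(Set<String> sourceSearchable, Set<String> queryAttributes) {
        Set<String> leftover = new TreeSet<>(queryAttributes);
        leftover.removeAll(sourceSearchable);
        return leftover;
    }
}
```

For the example query above, a source that can only search on firstName and lastName would land in the second pass, with isStudent left to be checked in code.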

General Query Logic

  • attribute query
    • ex: by username, [foo=bar, name=smith, ....]
    • Run MS & PS sources
      • turn map into OR() criteria for MS
    • Run S sources once per existing result
  • criteria query
    • ex: (firstName=jane && (lastName=smith || lastName=doe))
    • Run MS sources
      • merge results
    • Run PS sources
      • merge results
    • Run S sources once per existing result
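The two query styles above can share one representation. The following is a minimal sketch, assuming a tiny `Criteria` interface with `eq`, `and` and `or` builders (all hypothetical names), that shows a criteria query and the "turn map into OR() criteria" step for an attribute query.

```java
import java.util.Map;

// Hypothetical sketch of a criteria tree; not the actual PD2.0 criteria API.
class CriteriaSketch {

    /** A criteria node that can be evaluated in code against one person's attributes. */
    interface Criteria {
        boolean matches(Map<String, String> person);
    }

    static Criteria eq(String attribute, String value) {
        return person -> value.equals(person.get(attribute));
    }

    static Criteria and(Criteria... parts) {
        return person -> {
            for (Criteria part : parts) {
                if (!part.matches(person)) {
                    return false;
                }
            }
            return true;
        };
    }

    static Criteria or(Criteria... parts) {
        return person -> {
            for (Criteria part : parts) {
                if (part.matches(person)) {
                    return true;
                }
            }
            return false;
        };
    }

    /** Turns an attribute-query map into an OR() of equality checks, one per entry. */
    static Criteria fromAttributeMap(Map<String, String> attributes) {
        Criteria[] parts = attributes.entrySet().stream()
                .map(e -> eq(e.getKey(), e.getValue()))
                .toArray(Criteria[]::new);
        return or(parts);
    }
}
```

Evaluating the tree in code like this is only one use; a fully searchable source would more likely translate the same tree into its own query language.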

attribute source classes - how do we tell/config the difference?

  • fully searchable (MS) - CriteriaSearchAttributeSource
    • uses a query template (supports arbitrary logic)
    • LDAP or primary-use directories go here
  • partially searchable (PS) - SimpleSearchAttributeSource
    • uses named placeholders but still can return multiple people for one query
    • small associated sources go here
  • single-person only (S)
    • will only ever return a single result ... is this useful?
    • in-memory sources, such as those for Shibboleth (shib), go here
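One possible answer to "how do we tell the difference" is to let the interface a source implements decide its class. The sketch below reuses the two interface names from the notes; the method signatures, the `SinglePersonAttributeSource` name and the `classify` helper are assumptions for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: interface names for MS and PS come from the notes,
// everything else (signatures, the S interface name, classify) is assumed.
class SourceTypesSketch {

    /** Fully searchable (MS): accepts arbitrary criteria logic. */
    interface CriteriaSearchAttributeSource {
        List<Map<String, String>> search(Predicate<Map<String, String>> criteria);
    }

    /** Partially searchable (PS): fills named placeholders, may return several people. */
    interface SimpleSearchAttributeSource {
        List<Map<String, String>> search(Map<String, String> namedValues);
    }

    /** Single-person only (S): run once per existing result, returns extra attributes. */
    interface SinglePersonAttributeSource {
        Map<String, String> lookup(Map<String, String> existingResult);
    }

    /** Maps a configured source to the class abbreviation used in these notes. */
    static String classify(Object source) {
        if (source instanceof CriteriaSearchAttributeSource) {
            return "MS";
        }
        if (source instanceof SimpleSearchAttributeSource) {
            return "PS";
        }
        if (source instanceof SinglePersonAttributeSource) {
            return "S";
        }
        throw new IllegalArgumentException("Not an attribute source: " + source);
    }
}
```

With this approach no extra configuration flag is needed, at the cost of requiring each source implementation to pick exactly one interface.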


spi - what code in support implements to provide data


core - big ugly guts

  • core code that does
    • dependency tree calc of sources
    • determine query order and the potential for parallelism; it is probably better to assume everything runs in parallel and add "block" points that wait for other sources to complete
    • caching of results from each source
    • handling of query timeouts
    • merging results from various sources
    • mapping attribute names from the API side to the SPI side
      • TODO move this into a transformation API
    • jmx metrics for per-source usage & performance
    • primaryId
      • Used when a find person by primary id query is run
      • Used to merge data from multiple sources (each result must have a primaryId set)
  • add a list of AttributeSourceFilter
    • these are called in order (sorted by their configured order)
    • if any filter returns false the filtered source is not executed
    • filterchain style API that allows for modification of search?
  • dependency tree calculation on configured attribute sources
    • needs to fail to init if something is wrong with the tree
    • this probably needs to be calculated and cached for each query since the tree will look different every time based on the input
  • caching of results - part of XML config support
    • for each configured source, set cache name or reference to Ehcache bean
    • optional cache name/ref for misses
    • optional cache name/ref for exceptions
  • query timeout - part of XML config support
    • set maximum wait for query result
    • set behavior on timeout? (ignore, fail)
  • merge behavior
    • if two sources return different values for the same attribute, ignore the second and log an error
  • attribute name mapping - part of XML config support
    • for each configured source, option to allow for saying api attr "username" is actually "uid" in this spi
    • which direction does this mapping work?
  • attribute lists
    • in the config are these the PD side or the source side of the attr mapping?
    • at least one required or optional search attribute must be specified
    • required search
      • ALL of these attributes must be included in a query for this source to be able to run the query
    • optional search
      • This plus the required set make up the collection of attributes that can be used to search; attributes outside this set are ignored
    • available return
      • The list of attributes the source returns. This is a best-effort set, and the source may return more attributes than are named in the set
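The merge behavior described above (results joined on primaryId, the first value for an attribute wins, conflicts are logged) could look roughly like the sketch below. The data shapes and the `merge` method are assumptions; attribute names are compared case-insensitively, following the answer in QUESTIONS.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.logging.Logger;

// Hypothetical sketch of the merge step; the data shapes are assumptions.
class MergeSketch {

    private static final Logger LOG = Logger.getLogger(MergeSketch.class.getName());

    /**
     * Merges per-source results keyed by primaryId. Sources are processed in list
     * order; when two sources disagree on an attribute, the first value is kept and
     * an error is logged.
     */
    static Map<String, Map<String, String>> merge(
            List<Map<String, Map<String, String>>> resultsPerSource) {
        Map<String, Map<String, String>> merged = new LinkedHashMap<>();
        for (Map<String, Map<String, String>> sourceResult : resultsPerSource) {
            for (Map.Entry<String, Map<String, String>> person : sourceResult.entrySet()) {
                // Case-insensitive attribute names, per the PD1.5 behavior noted above.
                Map<String, String> target = merged.computeIfAbsent(
                        person.getKey(), id -> new TreeMap<>(String.CASE_INSENSITIVE_ORDER));
                for (Map.Entry<String, String> attr : person.getValue().entrySet()) {
                    String existing = target.putIfAbsent(attr.getKey(), attr.getValue());
                    if (existing != null && !existing.equals(attr.getValue())) {
                        LOG.severe("Conflicting values for '" + attr.getKey() + "' on "
                                + person.getKey() + ": keeping '" + existing
                                + "', ignoring '" + attr.getValue() + "'");
                    }
                }
            }
        }
        return merged;
    }
}
```

This sketch treats attributes as single-valued strings; multi-valued attributes would need a decision on whether differing value lists count as a conflict.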

support

  • attribute sources
    • jdbc (MS,PS,S)
      • single row
      • multi row
    • ldap (MS,PS,S)
    • xml (MS)
    • request attribute (S)
  • filters
    • regex
    • spel

 

 

