Advanced Data Filters

What are Advanced Data Filters?

Advanced Data Filters are the JSON filters used in Advanced Configuration to choose what data should be included in reports. They're also used in Data Permissions to choose what data a user can see.

Who can use this feature?
 User Types
Any user with access to Explore (Global Admins, Area Admins, and some Users).
 Pricing 
Available on paid plans (Analyst, CLO, and Enterprise).
 Expertise
Advanced Data Filters are designed to be used by expert users who understand JSON and the properties of xAPI statements.

Getting Started with Advanced Data Filters 

The easiest way to get started with Advanced Data Filters is to create a report using Simple Configuration in Explore and then switch to Advanced Configuration to see how the JSON has been created and formatted.

In Simple Configuration, use the main filters (Activities, Verbs, Dates, and People) to create a report. The filters you choose are used to populate the filter property of the report configuration object in Advanced Configuration. For example, a filter that includes an activity, a person, and a date range might look like this:

"filter": {
  "activityIds": {
    "ids": [
      "https://twitter.com/6209872/status/65472397"
    ],
    "regExp": false
  },
  "personCustomIds": [
    "alberta.dyonnaire@example.com"
  ],
  "groupCustomIds": null,
  "dateFilter": {
    "dateType": "trailing",
    "trailingAmount": "6",
    "trailingType": "months",
    "customDateFrom": null,
    "customDateTo": null
  }
}
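If you generate report configurations with a script rather than by hand, the same structure can be assembled as a plain dictionary and serialized to JSON. The following is a minimal Python sketch. The helper name build_trailing_filter is hypothetical and not part of Watershed; the property names, and the fact that trailingAmount is stored as a string, are copied from the example above.

```python
import json

def build_trailing_filter(activity_ids, person_ids, amount, unit="months"):
    """Hypothetical helper: mirror the filter object shown in the example above."""
    return {
        "activityIds": {"ids": list(activity_ids), "regExp": False},
        "personCustomIds": list(person_ids),
        "groupCustomIds": None,
        "dateFilter": {
            "dateType": "trailing",
            "trailingAmount": str(amount),  # the example above stores this as a string
            "trailingType": unit,
            "customDateFrom": None,
            "customDateTo": None,
        },
    }

report_filter = build_trailing_filter(
    ["https://twitter.com/6209872/status/65472397"],
    ["alberta.dyonnaire@example.com"],
    6,
)

# json.dumps turns None into null and False into false.
filter_json = json.dumps({"filter": report_filter}, indent=2)
print(filter_json)
```

The printed JSON should have the same shape as the example above, so you can compare it against what Advanced Configuration shows before using it.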

Advanced Data Filter Options

There are some additional filters and properties that are not accessible in Simple Configuration.

Filter by child groups

childGroupsOfCustomId will filter all groups that are direct children of the selected group. This is useful where a group has a large number of direct children that you want to compare. For example, if an organization contains 200 departments, you can use the childGroupsOfCustomId filter with the name of the organization group that contains those departments. The report will then display all departments within the organization.

"childGroupsOfCustomId": [
  "Company"
]

Filter by group type

The group type filter enables you to filter for groups of a particular type, such as departments or teams. It filters out any people who do not belong to a group of the configured types and, when the report is organized by group, ensures that only groups of the configured types are shown.

"groupTypeNames": [
  "team"
]

 Please note: group type names displayed in Settings / Your Organization are all plural (e.g. "teams", "departments"). If you are copying the group type name from that page, you need to remove the 's' before using it in this filter.

Filter by group without adding that group to the list of items

When a report is organized by group, the default behavior is to add all groups included in the group filter to the list of items. But what if you want to filter a report by a group without adding that group to the list of items? For example, what if you want to report on data by job role, but only show data for a particular region? In this case, you can use the excludeFromOutput flag. For example:

"groupTypeNames": [
  "Job Role"
],
"and": [
  {
    "groupCustomIds": [
      "Sales"
    ],
    "excludeFromOutput": true
  },
  {
    "groupCustomIds": [
      "Operations"
    ],
    "excludeFromOutput": false
  }
]

This will show every job role, plus the Operations group, in the list of items. It will only show data for people who belong to one of those groups and to the Sales group. The Sales group will not appear in the list of items.

Filter people by interactions

The people filter enables you to filter a population of people based on their interactions. For example, you might filter all people who have completed a particular course:

"peopleFilter": {
  "activityIds": {
    "ids": ["http://leadership.assessment.example.com"]
  }
}

When used inside a people filter, the not filter has a special behavior of filtering all people who do not match the filter:

"peopleFilter": {
  "not": {
    "activityIds": {
      "ids": ["http://leadership.assessment.example.com"]
    }
  }
}
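The difference between excluding people and excluding statements is easy to miss, so here is a small Python sketch of the idea. It is a simplified model under our own assumptions, not Watershed's implementation: statements are reduced to hypothetical (person, activity id) pairs, and the names alice and bob are made up for illustration.

```python
COURSE = "http://leadership.assessment.example.com"

# Hypothetical, heavily simplified statements: (person, activity id).
statements = [
    ("alice", COURSE),
    ("alice", "http://example.com/other-course"),
    ("bob", "http://example.com/other-course"),
]

everyone = {person for person, _ in statements}
matched = {person for person, activity in statements if activity == COURSE}

# "not" inside a people filter: people with no matching statement at all.
people_not_matching = everyone - matched

# Contrast: negating statement by statement would keep alice, because one of
# her statements is about a different activity.
people_with_a_non_matching_statement = {
    person for person, activity in statements if activity != COURSE
}

print(sorted(people_not_matching))                   # ['bob']
print(sorted(people_with_a_non_matching_statement))  # ['alice', 'bob']
```

In this model, alice is excluded by the people-level not because she has at least one statement about the course, even though she also has statements about other activities.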

Filter people by measure values

The measure filter is a special type of people filter that enables you to filter a population of people based on the value of a particular measure. For example, include only people with an average score above a certain value, include only people who logged in less than a certain number of times in the last week, or exclude the highest and/or lowest performers by a given metric to exclude outliers.

A simple measure filter is shown below. In this case, the filter will include only people whose most recent score was 100%.

 Please note: In practice, the measure filter is likely to be more complex and filter only people whose most recent score was 100% in a particular assessment. The filter below will look at all scores from all sources.

"peopleFilter": {
  "measureFilter": {
    "measure": {
      "name": "Last Score",
      "aggregation": {
        "type": "LAST"
      },
      "valueProducer": {
        "type": "STATEMENT_PROPERTY",
        "statementProperty": "result.score.scaled",
        "caseSensitive": true
      }
    },
    "equals": {
      "values": {
        "ids": [1.0]
      }
    }
  }
}

The measure filter contains two properties: measure and one of either equals, range, or percentileRange. The measure property contains configuration for the measure to compare against. The definition of measures is explained in the Measure Editor help guide.

The equals property defines a value or list of values to match against. This uses the same syntax as the equals filter outlined below. See an example of an equals measure filter above.

The range property defines a range of values to compare against. For example, you might want to filter people who scored on average between 50% and 75%. The includeUpper and includeLower properties define whether each bound is applied inclusively, so in this example a score of exactly 75% will be included.

"peopleFilter": {
  "measureFilter": {
    "measure": {
      "name": "Average Score",
      "aggregation": {
        "type": "AVERAGE"
      },
      "valueProducer": {
        "type": "STATEMENT_PROPERTY",
        "statementProperty": "result.score.scaled",
        "caseSensitive": true
      },
      "id": 103
    },
    "range": [{
      "from": 0.50,
      "to": 0.75,
      "includeUpper": true,
      "includeLower": true
    }]
  }
}

 Please note: the range filter is applied before values are rounded for display, so the range filter may filter out some values that you don’t expect it to. In the example above, an average score of 75.01% might be displayed in a report as 75% but would still be excluded by this measure filter.
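The arithmetic behind this note can be checked with a few lines of Python. This is only an illustration: it assumes the report displays scores rounded to a whole percent, and it models the inclusive range as a plain comparison.

```python
average_score = 0.7501  # stored value, before any rounding for display

# What a report might show if it rounds to a whole percent (an assumption).
displayed = f"{average_score:.0%}"

# The range from the example, with includeLower and includeUpper both true.
in_range = 0.50 <= average_score <= 0.75

print(displayed, in_range)  # 75% False
```

If you need values that display as 75% to be included, one option is to set the upper bound slightly higher than the displayed value.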

The percentileRange property is used to exclude people with the top and bottom values for a given metric. This can be used if you have a few exceptionally high and low performers who are skewing the average results. The config below will exclude the top and bottom 1% in terms of points scored, so if people score between 0 and 10,000 points, this config would include only people who scored between 100 and 9,900. If the config were changed to make includeLower and includeUpper false, it would show people who scored between 101 and 9,899.

"peopleFilter": {
  "includeParentFilter": true,
  "measureFilter": {
    "measure": {
      "name": "Points Scored",
      "aggregation": {
        "type": "SUM"
      },
      "valueProducer": {
        "type": "STATEMENT_PROPERTY",
        "statementProperty": "result.score.raw",
        "caseSensitive": true
      }
    },
    "percentileRange": [
      {
        "from": 1,
        "to": 99,
        "includeUpper": true,
        "includeLower": true
      }
    ]
  }
}

Filter by verb

"verbIds": {
  "ids": [
    "http://id.tincanapi.com/verb/viewed"
  ],
  "regExp": false
}

Filter by context activity

It's also possible to filter by context activities using the 'parentActivityIds', 'groupingActivityIds', and 'contextActivityIds' properties. For example:

"contextActivityIds": {
  "ids": [
    "https://twitter.com/"
  ],
  "regExp": false
}

 Please note: the contextActivityIds filter property will match activity ids whichever collection of context activities they are found in (parent, grouping, category or other).

Filter by dates in fields other than timestamp

By default, the date filter is applied to the 'timestamp' property that represents when the interaction happened. You can also filter by dates in other statement properties (such as 'stored' and extensions) using the 'fieldName' property in a date filter. 

"dateFilter": {
    "dateType": "trailing",
    "trailingAmount": "6",
    "trailingType": "months",
    "fieldName": "stored"
}

Hint: to filter by multiple date-based properties, use an 'and' filter. See the "And, or and not" section below.

Filter by any statement property

In fact, it’s possible to filter by any statement property using the required and equals filter properties.

Use required to filter all statements that have a certain property with any value. For example, if you are reporting on score data, you might want to filter only statements that contain a score:

"required": "result.score.scaled"

Use equals to specify that a certain value is required. For example, you might filter only statements where the actor's account has a specific home page:

"equals": [{
  "fieldName": "actor.account.homePage",
  "values": {
    "ids": [
      "http://watershedlrs.com"
    ],
    "regExp": false
  }
}]

When equals is used with a numerical value, you should supply a string value that includes a decimal point. For example, if you wanted to filter all statements with a raw score of 5, you would use the following syntax:

"equals": [{
  "fieldName": "result.score.raw",
  "values": {
    "ids": [
      "5.0"
    ],
    "regExp": false
  }
}]
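If you build these filters in a script, the decimal-point rule is easy to apply automatically. The sketch below is a hedged Python illustration; the helper names numeric_id and equals_filter are hypothetical, and the output structure is copied from the example above.

```python
import json

def numeric_id(value):
    """Format a number as a string that always contains a decimal point."""
    return str(float(value))

def equals_filter(field_name, value):
    """Hypothetical helper: build an equals filter for one numerical value."""
    return {
        "equals": [{
            "fieldName": field_name,
            "values": {"ids": [numeric_id(value)], "regExp": False},
        }]
    }

raw_score_filter = equals_filter("result.score.raw", 5)
print(json.dumps(raw_score_filter, indent=2))
```

Here numeric_id(5) gives "5.0" and numeric_id(7.25) gives "7.25". Very large or very small numbers would come out in exponent notation, so this approach only suits ordinary score-sized values.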

When using filters with extensions, you may need some extra syntax to help Watershed understand the structure of the extension. The fieldType property tells Watershed what type of value you are filtering by. Possible values are string, number, boolean, null or array.

For example, consider the following result extension:

"https://example.com/fruits" : [
  "apple",
  "pear",
  "orange", 
  "bear"
]

To filter just statements containing apples, you could use the following filter:

"equals": [{
  "fieldName": "result.extensions.[https://example.com/fruits]",
  "fieldType": "array",
  "values": {
    "ids": [
      "apple"
    ],
    "regExp": false
  }
}]

If your extension includes an array of objects and you want to filter by a property of those objects, you need to tell Watershed about the array using the __arr__ syntax. For example, let's say you have a context extension with the following value:

"https://example.com/shopping-list" : {
  "fruits": [
    {
      "type": "apple",
      "color": "green"
    },
    {
      "type": "apple",
      "color": "red"
    }
  ]
}

To filter just statements containing green fruit, you'd use the following filter:

"equals": [{
  "fieldName": "context.extensions.[https://example.com/shopping-list].fruits.__arr__.color",
  "fieldType": "string",
  "values": {
    "ids": [
      "green"
    ],
    "regExp": false
  }
}]
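To see how a fieldName with __arr__ lines up with the extension value, it can help to walk the path by hand. The Python sketch below is our own illustrative model of that walk, not Watershed's code: the functions split_field_name and resolve are hypothetical, and we assume that __arr__ means "look inside every element of the array at this point".

```python
import re

ARRAY_MARKER = "__arr__"

def split_field_name(field_name):
    """Split a dotted path, keeping bracketed extension ids (which contain dots) whole."""
    pattern = r"\[([^\]]+)\]|([^.\[\]]+)"
    return [m.group(1) or m.group(2) for m in re.finditer(pattern, field_name)]

def resolve(value, parts):
    """Return every value found at the path, fanning out over arrays at __arr__."""
    if not parts:
        return [value]
    head, rest = parts[0], parts[1:]
    if head == ARRAY_MARKER:
        if not isinstance(value, list):
            return []
        found = []
        for item in value:
            found.extend(resolve(item, rest))
        return found
    if isinstance(value, dict) and head in value:
        return resolve(value[head], rest)
    return []

statement = {
    "context": {
        "extensions": {
            "https://example.com/shopping-list": {
                "fruits": [
                    {"type": "apple", "color": "green"},
                    {"type": "apple", "color": "red"},
                ]
            }
        }
    }
}

FIELD = "context.extensions.[https://example.com/shopping-list].fruits.__arr__.color"
colors = resolve(statement, split_field_name(FIELD))
print(colors)             # ['green', 'red']
print("green" in colors)  # True
```

Under this model, the statement matches the filter above because at least one element of the fruits array has the color green.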

Regex

Advanced Data Filters support regex to filter matching activities, context activities and verbs. For example, the following would filter all statements where the activity id started with "https://twitter.com/":

"activityIds": {
  "ids": [
    "https://twitter.com/.*"
  ],
  "regExp": true
}

Another more complex example filters all activity ids starting with "http://example.com/assessments", except those that contain the string "question":

"activityIds": {
  "ids": [
    "(http://example.com/assessments.*)&~(.*question.*)"
  ],
  "regExp": true
}

Not all regex syntax elements are supported by Watershed. The table below lists some of the unsupported syntax elements and alternatives you can use.

Regex | What it does | Alternatives
^ | Anchors the expression at the start of the string. | Watershed uses regex to match the whole string only (not partial strings), so anchors are not required.
$ | Anchors the expression at the end of the string. | Watershed uses regex to match the whole string only (not partial strings), so anchors are not required.
?: | Non-capturing group. Used to group part of an expression without capturing the text it matches. | Watershed uses regex to match the whole string only (not partial strings), so non-capturing groups are not required.
\d | Matches any numerical character. | Use [0-9] instead.

If you are not familiar with regex, please speak to us for help with your activity filter.

 Please note: Regex filters are case sensitive unless specified within the regex otherwise. 
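If you want to try out a pattern before adding it to a filter, whole-string matching can be approximated in Python with re.fullmatch. This is only a rough stand-in: Python's regex engine is not the one Watershed uses, it accepts syntax that Watershed does not (such as \d), and it does not support the &~ syntax shown in the example above.

```python
import re

def whole_string_match(pattern, activity_id):
    """Approximate whole-string matching with re.fullmatch (not Watershed's engine)."""
    return re.fullmatch(pattern, activity_id) is not None

tweet = "https://twitter.com/6209872/status/65472397"

results = [
    # The trailing .* covers the rest of the id.
    whole_string_match("https://twitter.com/.*", tweet),
    # Without it, the pattern only matches that exact string.
    whole_string_match("https://twitter.com/", tweet),
    # [0-9] stands in for the unsupported digit shorthand.
    whole_string_match("https://twitter.com/[0-9]+/status/[0-9]+", tweet),
    # Matching is case sensitive by default.
    whole_string_match("https://TWITTER.com/.*", tweet),
]
print(results)  # [True, False, True, False]
```

The second result shows why anchors are unnecessary: a pattern that covers only the start of an id does not match, so you need .* wherever the id may continue.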

And, or and not

By default, filter properties are added together so that if you include a verb filter and an activity id filter, statements must match both that verb and activity id. On the other hand, lists are interpreted as an or filter, so if you specify a list of verb ids, statements matching any verb on the list are included. And, or and not filters give you the power to change that and, for example, match statements that either use a particular verb or a particular activity id. 

You can combine multiple filter properties using and, or and not to craft very complex filters. The following contrived example includes all three properties nested together.

"or": [
  {
    "verbIds": {
      "ids": [
        "http://id.tincanapi.com/verb/tweeted"
      ],
      "regExp": false
    }
  },
  {
    "and": [
      {
        "contextActivityIds": {
          "ids": [
            "https://twitter.com/"
          ],
          "regExp": false
        }
      },
      {
        "equals": [{
          "fieldName": "object.definition.type",
          "values": {
            "ids": [
              "http://id.tincanapi.com/activitytype/tweet"
            ],
            "regExp": false
          }
        }]
      }
    ],
    "not": {
      "required": "actor.mbox"
    }
  }
]

In the example above, either the verb must be ‘tweeted’ or twitter must be a context activity and the activity type must be a tweet, but the actor must not be identified by email.

 Please note: and and or contain arrays of filter objects, whereas not contains a single filter object.
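The combination rules described in this section can be modelled in a few lines of Python. This is a sketch of the logic only, under our own simplifying assumptions: the matches function is hypothetical, statements are reduced to a verb and an activity id, only five filter properties are handled, and regExp is ignored.

```python
def matches(statement, flt):
    """Evaluate a small subset of filter properties against a simplified statement."""
    checks = []
    for key, value in flt.items():
        if key == "verbIds":
            checks.append(statement["verb"] in value["ids"])          # list of ids acts as "or"
        elif key == "activityIds":
            checks.append(statement["activity"] in value["ids"])
        elif key == "and":
            checks.append(all(matches(statement, f) for f in value))  # array of filters
        elif key == "or":
            checks.append(any(matches(statement, f) for f in value))  # array of filters
        elif key == "not":
            checks.append(not matches(statement, value))              # single filter object
        else:
            raise ValueError(f"unsupported property in this sketch: {key}")
    return all(checks)  # sibling properties must all hold

COMPLETED = "http://adlnet.gov/expapi/verbs/completed"
VIEWED = "http://id.tincanapi.com/verb/viewed"
COURSE = "http://example.com"

viewed_course = {"verb": VIEWED, "activity": COURSE}

both_required = {"verbIds": {"ids": [COMPLETED]}, "activityIds": {"ids": [COURSE]}}
either_one = {"or": [{"verbIds": {"ids": [COMPLETED]}}, {"activityIds": {"ids": [COURSE]}}]}
not_viewed = {"not": {"verbIds": {"ids": [VIEWED]}}}

print(matches(viewed_course, both_required))  # False
print(matches(viewed_course, either_one))     # True
print(matches(viewed_course, not_viewed))     # False
```

The same statement fails when the verb and activity are sibling properties, but passes once they are wrapped in or.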

In the example below, the last 2 months are filtered, then the last 1 month is removed giving only statements from 'last month':

"filter": {
  "dateFilter": {
    "dateType": "trailing",
    "trailingAmount": "2",
    "trailingType": "months"
  },
  "not" :{
    "dateFilter": {
      "dateType": "trailing",
      "trailingAmount": "1",
      "trailingType": "months"
    }
  }
}
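The effect of this filter can be worked through with dates. The Python sketch below assumes that a trailing period is a rolling window ending at the current moment and that both ends of the window are inclusive; how Watershed treats the exact boundaries is not covered here, so treat those details as assumptions. The months_before helper and the fixed value of now are made up for illustration.

```python
import calendar
from datetime import datetime

def months_before(moment, months):
    """Step back a number of calendar months, clamping the day if needed."""
    total = moment.year * 12 + (moment.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(moment.day, calendar.monthrange(year, month)[1])
    return moment.replace(year=year, month=month, day=day)

now = datetime(2024, 3, 15)            # fixed so the example is repeatable
two_months_ago = months_before(now, 2)  # 2024-01-15
one_month_ago = months_before(now, 1)   # 2024-02-15

def kept(timestamp):
    """In the trailing 2 months, but not in the trailing 1 month."""
    in_last_two = two_months_ago <= timestamp <= now
    in_last_one = one_month_ago <= timestamp <= now
    return in_last_two and not in_last_one

print(kept(datetime(2024, 2, 1)))    # True
print(kept(datetime(2024, 3, 1)))    # False
print(kept(datetime(2023, 12, 31)))  # False
```

Only timestamps in the older of the two months survive, which is the behavior the filter above is designed to produce.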

And, or and not filters are applied in addition to other properties in the filter. This means that the filter below would include statements where the actor is 'alberta.dyonnaire@example.com' and either the verb is 'completed' or the activity id is 'http://example.com'.

"filter": {
  "personCustomIds": [
    "alberta.dyonnaire@example.com"
  ],
  "or": [
    {
      "activityIds": {
        "ids": [
          "http://example.com"
        ],
        "regExp": false
      }
    },
    {
      "verbIds": {
        "ids": [
          "http://adlnet.gov/expapi/verbs/completed"
        ],
        "regExp": false
      }
    }
  ]
}
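A missing comma or bracket is an easy mistake to make when editing nested filters by hand. One way to catch it before pasting a filter into Advanced Configuration is to run the text through a JSON parser. The sketch below uses Python's standard json module; the check_fragment helper is hypothetical, and it wraps the fragment in braces because a "filter": {...} fragment is not a complete JSON document on its own. It checks syntax only, not whether the property names are ones Watershed understands.

```python
import json

def check_fragment(fragment):
    """Wrap a '"filter": {...}' fragment in braces and try to parse it."""
    try:
        json.loads("{" + fragment + "}")
    except json.JSONDecodeError as error:
        return f"line {error.lineno}: {error.msg}"
    return "ok"

good = """
"filter": {
  "personCustomIds": ["alberta.dyonnaire@example.com"],
  "verbIds": {"ids": ["http://adlnet.gov/expapi/verbs/completed"], "regExp": false}
}
"""

# Same fragment with the comma after the first property removed.
bad = good.replace('"],\n', '"]\n')

print(check_fragment(good))
print(check_fragment(bad))
```

The first call reports ok, and the second reports the line where the parser gave up, which is usually at or just after the place where the comma is missing.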

Using Advanced Data Filters in Data Permissions

The same filters that are used in Advanced Configuration can be used when setting up a Watershed user's Data Permissions. We recommend using Explore to set up the filters first, and then bringing the filters into the user's Data Permissions.
