Security analysts continuously face the challenge of prioritizing and investigating indicators with incomplete information. Limited resources and an increasing number of events to triage make this task even more difficult.
In this blog we describe the indicator rating model TruSTAR is introducing to assist security analysts in their investigation workflow. We fully realize that there will never be a perfect scoring or rating model, and we are not claiming to have created the proverbial perpetual motion machine. All scoring and rating models have their pitfalls; read on if you want to learn more about our views on how to avoid them. Our intent is to assist overworked analysts with evidence-based insights from our platform that help them make educated decisions. Our rating does not represent the absolute risk posed by an indicator at a global level; users still need to rationalize it in the context of their own infrastructure and the safeguards they already have in place.
The inputs to our model are a mix of automated and human assessments (whitelisted indicators, indicator categorization). While we make every attempt to limit the effect of individual human biases on the rating model, we accept that different analysts have different perspectives, which could affect the generalizability of the eventual rating. We have incorporated Machine Learning (ML) models into our overall rating model to lessen the impact of any individual analyst. Read on to learn more about the specifics of our rating model.
How Is the Rating Derived?
At a high level our rating model captures the insights available from the following:
1. Observed maliciousness
We have developed ML models trained to assess the maliciousness of indicators based on historical observations. We use lexical features, features from intelligence sources, and analysis provided by TruSTAR users to determine the maliciousness of an indicator. See the table below for a full list of features.
| Indicator Type | Lexical Features | External Intel Sources | User Features |
|---|---|---|---|
| URL | Length, number-to-letter ratio, etc. | Historical behavior of the URL provided by 3rd-party sources, etc. | Whitelist |
| IPv4, IPv6 | Length, octet analysis, etc. | IP reputation, activity associated with the IP, location, etc. | Whitelist |
| MD5, SHA1, SHA256 | N/A | Suspiciousness of files; features from dynamic and static analysis of files. | Whitelist |
| Email Addresses | Length of the username, number-to-letter ratio, etc. | 3rd-party sources providing information about the domain name. | Whitelist |
| Bitcoin Addresses | N/A | Transaction activity and the amount of BTC held in the wallet. | Whitelist |
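To make the lexical features concrete, here is a minimal sketch of how features like URL length and number-to-letter ratio might be computed. The exact feature set and any helper names here are our illustration, not TruSTAR's actual implementation.

```python
from urllib.parse import urlparse

def lexical_features(url: str) -> dict:
    """Compute simple lexical features for a URL (illustrative only)."""
    parsed = urlparse(url)
    host = parsed.netloc or url
    letters = sum(c.isalpha() for c in url)
    digits = sum(c.isdigit() for c in url)
    return {
        "length": len(url),
        "number_to_letter_ratio": digits / letters if letters else 0.0,
        "tld": host.rsplit(".", 1)[-1] if "." in host else "",
        "path_depth": parsed.path.count("/"),
    }

features = lexical_features("http://example.sa.tn/new1.php")
```

Features like these are cheap to compute at scale and require no external lookups, which is why they complement (rather than replace) 3rd-party intelligence sources.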
2. Total sightings among peers on TruSTAR
Historical sightings of an indicator within the TruSTAR platform. The rating weighs the frequency of daily sightings among peers, which makes the raw sighting count more meaningful. Users can obtain sighting information without publicly leaking the fact that indicators associated with an attacker's infrastructure are being investigated.
3. Timeliness of sightings on TruSTAR
Timeliness reflects whether indicators have become stale. Analyzing indicators that have not been active for a while wastes time, and taking action on stale indicators is ineffective and can actually lead to negative repercussions. The number of daily sightings is therefore weighted down by the indicator's age.
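One common way to down-weight stale sightings is exponential decay by age. This is a sketch of the general technique; the specific decay function and half-life TruSTAR uses are not disclosed, so the values below are assumptions.

```python
import math

def weighted_sightings(daily_counts, half_life_days=7.0):
    """Sum daily sighting counts, discounting older days exponentially.

    daily_counts: list of (age_in_days, count) pairs, where age 0 = today.
    half_life_days: age at which a sighting counts half as much (assumed value).
    """
    decay = math.log(2) / half_life_days
    return sum(count * math.exp(-decay * age) for age, count in daily_counts)

# Two sightings today outweigh four sightings from three weeks ago.
recent = weighted_sightings([(0, 2)])
stale = weighted_sightings([(21, 4)])
```

Under this scheme, an indicator sighted heavily long ago scores lower than one sighted a few times in the last day or two, matching the intuition above.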
For example, “hxxp://jacobswees[.]sa[.]tn/new1[.]php” is a URL that was categorized as a HIGH PRIORITY indicator.
Our model arrived at this decision using the following rationale: First, our ML model rated this as an indicator with a high level of maliciousness, based on a combination of lexical features (such as the length of the URL and information about the TLD) and features from 3rd-party intelligence sources.
Second, this indicator had a number of historical sightings, but more importantly it was sighted in two separate reports in the last 48 hours. The predicted maliciousness combined with the recent sightings led this indicator to be rated HIGH PRIORITY. This should not be interpreted as a level of riskiness; risk is determined by further contextualizing the impact of the indicator against your own security posture and controls.
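The example above combines two signals: a predicted maliciousness score and recent sightings. A toy rule capturing that logic might look like the following. The thresholds and label set are entirely hypothetical, chosen only to illustrate how the two signals interact, and do not reflect TruSTAR's actual model.

```python
def priority_label(maliciousness: float, recent_sightings: int,
                   score_threshold: float = 0.7,
                   sighting_threshold: int = 2) -> str:
    """Toy rating rule: HIGH PRIORITY only when the ML maliciousness score
    is high AND the indicator was recently sighted. Thresholds are assumed."""
    high_score = maliciousness >= score_threshold
    recently_seen = recent_sightings >= sighting_threshold
    if high_score and recently_seen:
        return "HIGH PRIORITY"
    if high_score or recently_seen:
        return "MEDIUM"
    return "LOW"
```

Requiring both signals before assigning HIGH PRIORITY keeps a stale-but-malicious indicator, or a popular-but-benign one, from crowding out indicators that warrant immediate attention.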
We have designed the model to provide a consistent way of making evidence-based decisions. We also want to be transparent about our methodology so users can objectively determine whether the model is positively impacting their analysis and triage workflow.
We will be making updates to the model when we see room for improvement and look forward to your feedback. Don’t hesitate to contact us for a deeper discussion on the model.