Which Domains are Good?

I’ve occasionally done research on domain names. The goal of a lot of that research is to find a method that can tell whether a domain is malicious or not. It’s a good goal: we’d like to be able to look at domains and say ‘bad, good, bad, good, good, good’ and perhaps block users or at least label traffic.

If we could keep our users away from the malicious domains, we could probably prevent a lot of problems.

I’m going to sound a little philosophical here.  If we can label something as ‘bad’, we must be able to compare it to something that isn’t bad.  In other words, bad can’t exist without good.

Which leads me to my question:  How do I create a set of good domains?  And what do I mean by ‘good’? Do I mean domains that have never hosted malware?  Or domains that have never been used maliciously?  

I need to be able to define what I mean by ‘good’ before I can define that list of domains.  Once I decide that, I have another problem.

Any domain can be hijacked and misused. A good security team will do its best to stop that, but if defense is your only play, the other side only has to get it right once. Given a set of domains, I can’t say for sure that none of them has ever been used for malicious ends. The most I can say is that I haven’t seen it happen.
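To make the definition problem concrete, here is a minimal Python sketch of how the two candidate definitions of ‘good’ can disagree. Everything in it is hypothetical: the `DomainHistory` record, its two flags, and the three `.example` domains are made up for illustration, not taken from any real feed or dataset. Note that both functions really return ‘not observed to be bad’, which is weaker than ‘good’.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHistory:
    """Hypothetical record of what has been observed for one domain."""
    name: str
    hosted_malware: bool  # assumption: ever seen serving malware
    other_abuse: bool     # assumption: ever seen in phishing, spam, C2, etc.


def never_hosted_malware(history):
    """Definition 1: domains never observed hosting malware."""
    return {d.name for d in history if not d.hosted_malware}


def never_used_maliciously(history):
    """Definition 2: domains never observed in any malicious use."""
    return {
        d.name
        for d in history
        if not d.hosted_malware and not d.other_abuse
    }


# Made-up observations; .example is a reserved TLD, so these are not real sites.
observations = [
    DomainHistory("a.example", hosted_malware=False, other_abuse=False),
    DomainHistory("b.example", hosted_malware=False, other_abuse=True),
    DomainHistory("c.example", hosted_malware=True, other_abuse=False),
]

print(sorted(never_hosted_malware(observations)))    # ['a.example', 'b.example']
print(sorted(never_used_maliciously(observations)))  # ['a.example']
```

The same three domains give two different ‘good’ lists depending on the definition, and either list can shrink the moment a new observation comes in.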

Which takes me back to my original question:  How do you create a set of good domains? 

This is an important step in domain research, and one that should be tackled. A set of ground-truth good domains lets us say ‘these aren’t those’. In other words, it gives us something to compare against when we go looking for the malicious domains.
