Tuesday, October 25, 2011

Remediation Strategies for Sensitive Data Discovery

In the past few years, sensitive data discovery - the ability to find sensitive data in structured and unstructured data stores (not legal eDiscovery!) - has become more advanced and more accessible. Data Leak Prevention suites - the bulk of which have been snapped up by the big boys (Symantec, McAfee/Intel, RSA, etc.) - have extended their functionality to find sensitive data on servers. The ability to find sensitive data is one thing, but without a solid remediation plan, many organizations fear that while sensitive data discovery is the right thing to do from a security perspective, their legal risk can increase if they know where the data is but haven't protected it.

So finding the data is now possible, but what do you do with it once you've found it? The fear of a large number of false positives, the lack of context on how the data is actually being used, and uncertainty about how to protect the data and prioritize the whole process when large amounts of it are discovered can be overwhelming. I've been doing some research in this area, and while there is a lot more underneath, here are some of the major findings on how to handle it.

Know that sensitive data discovery is a discrete event and use it to control chaos: Sensitive data discovery is typically a scheduled process. This allows you to take bite-sized chunks out of the problem. Since it is a discrete event and not a continuous process, you can perform a scan on a limited set of servers, remediate the issues that were found, and then move on to the next group of servers. Of course, scanning for sensitive data, while a discrete event, should never be a one-time event. Scans of each segment should be repeated at regular intervals. That sensitive data creep can just keep on going!

Prioritize based on risk: Sensitive data discovery tools will typically tell you the type, logical location, and amount of sensitive data found, but that doesn't complete the risk picture. If you find or expect to find a lot of it, remediating the risk to that data can look overwhelming. Creating a risk-based procedure will go a long way toward an organized remediation process. This should depend on your environment, the types of data you collect, and your risk and regulatory priorities. This could mean prioritizing based on physical location (e.g. branch offices with weak physical security), on a specific regulated or high-risk data type (e.g. credit card numbers), on the amount of sensitive data, or on the type of server (file shares prioritized over databases or vice versa). If there are a number of factors that affect the priority, consider developing a scoring algorithm relative to your business.
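As a rough illustration of what such a scoring algorithm might look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Finding` fields, the weight tables, and the volume thresholds are hypothetical and would need to be replaced with values that reflect your own environment and regulatory priorities.

```python
from dataclasses import dataclass

# Hypothetical weights, for illustration only. Tune these to your own
# risk and regulatory priorities.
DATA_TYPE_WEIGHT = {"credit_card": 5, "customer_data": 3}
SERVER_TYPE_WEIGHT = {"file_share": 3, "database": 2}
WEAK_PHYSICAL_SECURITY_POINTS = 4


@dataclass
class Finding:
    """One result from a discovery scan (field names are assumptions)."""
    server: str
    data_type: str
    server_type: str
    record_count: int
    weak_physical_security: bool


def volume_points(record_count: int) -> int:
    """Bucket the amount of data found so huge counts don't swamp the score."""
    if record_count >= 10_000:
        return 3
    if record_count >= 100:
        return 2
    return 1


def risk_score(finding: Finding) -> int:
    """Add up the points for each risk factor; unknown types score 1."""
    score = DATA_TYPE_WEIGHT.get(finding.data_type, 1)
    score += SERVER_TYPE_WEIGHT.get(finding.server_type, 1)
    score += volume_points(finding.record_count)
    if finding.weak_physical_security:
        score += WEAK_PHYSICAL_SECURITY_POINTS
    return score


def prioritize(findings):
    """Return findings ordered from highest to lowest risk score."""
    return sorted(findings, key=risk_score, reverse=True)
```

A simple additive score like this is easy to explain to the business and easy to adjust. Whether the factors should be added, multiplied, or weighted differently is a judgment call for your own organization.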

Create a remediation plan: A lot of companies can get stuck here because there isn't a uniform method to remediate the data you will find. Some sensitive data files will require monitoring, some will need encryption and access control, some will need to be removed and/or quarantined, and some will require review to understand if the access is appropriate. Understand the business needs of the servers before you scan them, both to ensure critical business processes aren't interrupted and to know what types of remediation would be appropriate. For example, if you find sensitive customer data in a file share that the business uses on a regular basis, monitoring access and creating management reports to ensure the access is appropriate is a good first step toward slowly locking it down based on business need. If sensitive data is found in a physically insecure location, encrypting the information to protect it from physical theft is important.
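To make the idea of matching remediation to context concrete, here is a small Python sketch of a decision helper. The function name, its parameters, and the exact rules are hypothetical. They simply encode the examples above and are not a complete or authoritative policy.

```python
from typing import List


def suggest_remediation(in_active_business_use: bool,
                        physically_insecure: bool,
                        still_needed: bool = True) -> List[str]:
    """Suggest remediation steps for one finding (illustrative rules only)."""
    # Data nobody needs any more is the simplest case: get it out of the way.
    if not still_needed:
        return ["quarantine or remove"]

    actions = []
    # Protect against physical theft first.
    if physically_insecure:
        actions.append("encrypt")
    # Data the business relies on gets monitored before it gets locked down.
    if in_active_business_use:
        actions.append("monitor access and report to management")
    else:
        actions.append("review whether access is appropriate")
    return actions
```

For instance, `suggest_remediation(in_active_business_use=True, physically_insecure=True)` suggests both encryption and access monitoring. In practice the inputs would come from your conversations with the business owners of each server, which is why that homework needs to happen before the scan.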

If you run into too many false positives - take a step back: The types of data you are looking for may be too broad, or the keywords may need to be tuned. If trying to find all sensitive data creates so many false positives that you can't pinpoint the real issues, it's not going to do you any good.
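As one example of tightening a pattern, a scan for credit card numbers that matches any long run of digits will flag order numbers, phone numbers and the like. Card numbers carry a Luhn check digit, so validating each candidate against it discards many of those hits. The Python sketch below is only an illustration of that idea; the regular expression and function names are my own assumptions, not how any particular discovery product works.

```python
import re

# Candidate pattern: 13 to 19 digits, optionally separated by spaces or
# hyphens. Deliberately broad; the Luhn check does the narrowing.
CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str):
    """Return candidate matches in `text` that also pass the Luhn check."""
    return [m.group() for m in CANDIDATE.finditer(text)
            if luhn_valid(m.group())]
```

Passing the Luhn check does not prove that a number is a real card, so some false positives will remain, but the pile that needs human review should be noticeably smaller.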

In summary, while sensitive data discovery combined with remediation isn't an easy, cut-and-dried process for a complex business, it's definitely achievable if it's done in an organized and prioritized way.