Verifying PCI DSS Scope: Hunting for Credit Card Numbers


PCI DSS requires that the scope of an assessment is checked to make sure it is accurate, and that this check is repeated every year. Even if the documented scope states that no cardholder data is stored, some cardholder details may still have been inadvertently left in documents. These credit card details may be left over from activities prior to working towards PCI DSS compliance, or company credit card procedures may have been breached.

There are some good tools out there that search for cardholder data on PCs and networks. However, I was looking for something that could easily run standalone on individual PCs and search a hard drive relatively quickly. Dionach developed PANhunt, which does exactly that. It is a Python script that is easily converted to a standalone executable, which can then be run off a USB stick.

PANhunt uses simple regular expressions to look for Visa, MasterCard and American Express card numbers in document and email files such as Word documents, Excel spreadsheets, TXT files, XML and PST files. PANhunt also searches ZIP files recursively, and creates a report listing the masked PANs found. Some system files do generate false positives, but Windows system folders are excluded by default. The current release will not search Access databases, but will list where they are located. The scripts and instructions can be found at https://github.com/Dionach/PANhunt.

Technically, searching across a C:\ drive for files with a certain extension is straightforward in Python. PANhunt treats documents as text files or, in the case of DOCX and XLSX, as ZIP files. Text files can be easily searched using regular expressions to match the different credit card types.

The PST format was more challenging. Microsoft has published the PST file format as an open specification here: http://msdn.microsoft.com/en-us/library/ff385210.
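Before going further into PST files, the text and ZIP searching described above can be illustrated with a minimal Python 3 sketch. This is not PANhunt's actual code: the regular expressions, the masking format (first six and last four digits kept) and the names `PAN_PATTERNS`, `mask_pan`, `find_pans` and `search_zip` are all assumptions made for illustration.

```python
import io
import re
import zipfile

# Illustrative patterns only (assumption, not PANhunt's own expressions):
# 16-digit Visa and MasterCard numbers and 15-digit American Express numbers,
# with optional space or hyphen separators between digit groups.
PAN_PATTERNS = {
    "Visa": re.compile(r"(?<!\d)4\d{3}(?:[ -]?\d{4}){3}(?!\d)"),
    "MasterCard": re.compile(r"(?<!\d)5[1-5]\d{2}(?:[ -]?\d{4}){3}(?!\d)"),
    "AMEX": re.compile(r"(?<!\d)3[47]\d{2}[ -]?\d{6}[ -]?\d{5}(?!\d)"),
}

# DOCX and XLSX files are ZIP containers, so they are opened the same way.
ZIP_LIKE = (".zip", ".docx", ".xlsx")


def mask_pan(pan):
    """Keep the first six and last four digits and mask the rest."""
    digits = re.sub(r"\D", "", pan)
    return digits[:6] + "*" * (len(digits) - 10) + digits[-4:]


def find_pans(text):
    """Return (brand, masked PAN) pairs for every match in a string."""
    return [
        (brand, mask_pan(match.group()))
        for brand, pattern in PAN_PATTERNS.items()
        for match in pattern.finditer(text)
    ]


def search_zip(data, name=""):
    """Search a ZIP archive held in memory, recursing into nested archives."""
    hits = []
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return hits  # not a readable archive, so nothing to report
    with archive:
        for member in archive.namelist():
            content = archive.read(member)
            path = f"{name}/{member}"
            if member.lower().endswith(ZIP_LIKE):
                hits += search_zip(content, path)
            else:
                text = content.decode("utf-8", errors="ignore")
                hits += [(path, brand, pan) for brand, pan in find_pans(text)]
    return hits
```

Calling `find_pans` on the contents of a text file, or `search_zip` on the raw bytes of a ZIP, DOCX or XLSX file, returns masked matches only. Patterns this simple will produce false positives, and a number split across XML elements inside a DOCX file could be missed.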
There are some code libraries out there in different languages such as Java and C#; however, they didn't provide everything needed, weren't in Python, or just didn't work. So I developed pst.py to parse PST files and provide access to the emails and attachments contained in them. The script supports both ANSI and Unicode PST formats. A few interesting things I learnt about PST files:

  • The published specification is wrong in at least two places about the ANSI format.
  • Setting a PST password does not encrypt anything; it just sets a password property in the PST file, which a parser can simply ignore.
  • Recent Microsoft Outlook client versions seem to encode PST data sections by default, so you can't easily see email text in the raw PST file. The decoding algorithm looks complicated in the specification but reduces to quite a simple substitution algorithm.

Hopefully PANhunt will be useful for people who want to easily check if there are any credit card numbers stored on local PCs.
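
The point above about PST data encoding reducing to a simple substitution can be illustrated with a short Python 3 sketch. The table used here is a made-up permutation of byte values, chosen only so the example runs; the real table is defined in Microsoft's PST specification and is not reproduced here. The names `ENCODE_TABLE`, `invert_table`, `encode` and `decode` are likewise hypothetical and are not taken from pst.py.

```python
# Hypothetical 256-entry substitution table (assumption): any fixed permutation
# of the byte values 0..255 would do for this illustration. It is NOT the table
# from the PST specification.
ENCODE_TABLE = bytes((value * 7 + 13) % 256 for value in range(256))


def invert_table(table):
    """Build the reverse lookup table for a byte permutation."""
    inverse = bytearray(256)
    for plain, encoded in enumerate(table):
        inverse[encoded] = plain
    return bytes(inverse)


DECODE_TABLE = invert_table(ENCODE_TABLE)


def encode(data):
    """Replace every byte with its entry in the encoding table."""
    return data.translate(ENCODE_TABLE)


def decode(data):
    """Undo encode() by looking each byte up in the inverse table."""
    return data.translate(DECODE_TABLE)
```

With a scheme of this shape, decoding a data block is a single table lookup per byte, which is why encoded email text can be recovered without any key or password.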

Posted by Bil

4 Comments - Verifying PCI DSS Scope: Hunting for Credit Card Numbers

Stevie (not verified) March 24, 2014

Hi Bil,

Nice thoughts, very useful to me. Will try out PANhunt. I'm trying to create a PowerShell script for the same task, and will share it if/when completed.

How about this scenario on scope; would you agree or disagree with the following logic?

Premise: PCI-DSS requirements must apply to all PAN-handling systems, and any systems supporting their segregation (IF SEGREGATION IS EMPLOYED TO LIMIT SCOPE; NOT a requirement).

By implication:

  • When network segregation is not in place, all systems are in scope - and more importantly -
  • When network segregation is used, all systems which can still access segregated CDE systems are also in scope

With me so far? So, if:

  • System A is handling PAN data, and is behind a network firewall AND on a specific network subnet with similar hosts
  • System B is allowed to connect to System A, through the firewall, and is on a remote subnet containing hosts which are NOT allowed through the firewall
  • System X is on the same network subnet as System B, with no network filtering between System B & System X

I conclude that System X is "in scope" as a result of this.

Would love to hear experienced thought on this - thanks for reading this far!

Bil March 27, 2014

Hi Stevie,

In principle, system X could be in scope, depending on whether or not it influences the cardholder data environment (CDE). For example, is X a completely standalone host, or is X the domain controller for B? X itself is in theory isolated from the CDE, but as you can see, it depends what X does. It may be easier to consider that if X can be easily isolated from B, then just isolate it; if it can't, it probably needs to be in scope, as it will have some trust relationship with B.

A useful tool when thinking about scoping is the PCI Scoping Toolkit (http://itrevolution.com/pci-scoping-toolkit/). This toolkit is not endorsed in any way by PCI SSC, and it does not take into account whether one system may influence another with access to the CDE, but I have found it useful for breaking down network segregation and isolation.

greystoke (not verified) April 06, 2014

Hi Bil, is there any chance the Windows executable has been created and is available to download? I don't have any familiarity with Python, and my attempts to set up the environment have been fruitless so far. Thanks

Bil April 10, 2014

Hi, I've added panhunt.exe to the GitHub repository at https://github.com/Dionach/PANhunt for you. I've also added .msg file support.
