Free Advice to the NSA: How To Pursue Terrorists And Protect Civil Liberties

Phone Link Analysis

 
The latest revelation that the National Security Agency has gathered phone records of millions of ordinary Americans has generated outrage and controversy across the political spectrum. The NSA has gathered phone records apparently without court orders in violation of existing statutes. It appears that the NSA is attempting to use this vast database of phone records to connect the dots between known terrorists by using software to look for links and patterns in the records. Unfortunately, the fact that the phone records contain the phone numbers of millions of ordinary and innocent Americans opens the door to abuse of the database and guilt by association.
 
The NSA is likely using link analysis techniques in an attempt to connect known targets separated by multiple degrees of separation. Link analysis is a simple yet powerful tool that can be used very effectively on structured relational data. Link analysis is nothing but the high tech equivalent of the "Kevin Bacon Game".
 
The image above shows an example of how the NSA would connect Bad Guy #1 with Bad Guy #2. To do so, the NSA would need the phone records of Bad Guy #1, Person A, Person D, Person G and Bad Guy #2. By traversing the phone record tree from both directions, the NSA could connect Bad Guy #1 and Bad Guy #2 by finding that both are linked through the intermediate Persons A, D and G.
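The traversal from both directions can be sketched as a bidirectional breadth-first search. This is only an illustration, not a description of any real NSA system: the call records, the names `CALLS`, `build_graph` and `find_link`, and the extra Person C are assumptions made up for the example.

```python
from collections import deque

# Hypothetical call records: each pair means the two parties exchanged a call.
CALLS = [
    ("BadGuy1", "A"), ("BadGuy1", "B"),
    ("A", "D"), ("D", "G"), ("G", "BadGuy2"),
    ("B", "C"),
]

def build_graph(calls):
    """Turn call pairs into an undirected adjacency map."""
    graph = {}
    for a, b in calls:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def _join(meet, seen_fwd, seen_bwd):
    """Stitch the two half-paths together at the meeting node."""
    path = []
    node = meet
    while node is not None:          # walk back to the start
        path.append(node)
        node = seen_fwd[node]
    path.reverse()
    node = seen_bwd[meet]
    while node is not None:          # walk forward to the goal
        path.append(node)
        node = seen_bwd[node]
    return path

def find_link(graph, start, goal):
    """Return a chain of contacts linking start and goal, or None."""
    if start not in graph or goal not in graph:
        return None
    if start == goal:
        return [start]
    seen_fwd = {start: None}         # node -> neighbour one step nearer to start
    seen_bwd = {goal: None}          # node -> neighbour one step nearer to goal
    frontier_fwd, frontier_bwd = deque([start]), deque([goal])
    while frontier_fwd and frontier_bwd:
        # Expand whichever side currently has the smaller frontier.
        if len(frontier_fwd) <= len(frontier_bwd):
            frontier, seen, other = frontier_fwd, seen_fwd, seen_bwd
        else:
            frontier, seen, other = frontier_bwd, seen_bwd, seen_fwd
        for _ in range(len(frontier)):
            node = frontier.popleft()
            for nxt in sorted(graph[node]):
                if nxt in seen:
                    continue
                seen[nxt] = node
                if nxt in other:     # the two searches have met
                    return _join(nxt, seen_fwd, seen_bwd)
                frontier.append(nxt)
    return None

print(find_link(build_graph(CALLS), "BadGuy1", "BadGuy2"))
```

Note that every record touched by either search, including those of Persons B and C, has to be available to the analyst, which is exactly where the warrant problem discussed next comes from.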
 
In order for the NSA to do link analysis with a court order, the NSA would have to first get a warrant for the phone records of Bad Guy #1. It would then have to get a warrant for phone records for each person on Bad Guy #1’s phone record (i.e., persons A and B) and then get warrants for the persons on the phone records of the next set of people and so on. At some point, the NSA would have a difficult case to make that one of these intervening people was legitimately connected to an ongoing investigation. Even if it succeeded in making the case for the warrant, the logistics of getting a warrant at every step of the process would make this kind of link analysis cumbersome and nearly impossible to perform in real time. I suspect that is why the NSA and the President decided to go around the law. When faced with a question of law, instead of asking Congress to update the law, the Government chose to ignore the law.
 
The problem with this approach for the NSA is that getting the phone records of the intervening persons between two known bad guys requires court orders. There is perhaps a simple way to achieve the goals of the NSA without the court orders and without the violations of privacy that result if the court orders are not sought. I propose that instead of seeking the actual phone numbers from the phone companies, the NSA should seek secure hashed equivalents of the phone numbers. That is, all phone records handed over to the NSA should contain secure hashed ids instead of the actual phone numbers of American citizens. The phone company would keep the actual phone records and the mappings between the phone numbers and their hashed equivalents. This would ensure that the NSA does not have a database of phone numbers of ordinary Americans. I also believe there is no law that would be violated by the phone companies turning over this data to the NSA.
 
Briefly, secure hashing is a technique that is commonly used to store passwords and to digitally sign electronic messages. The power of secure hashing lies in the fact that when a number or string is hashed to produce a message digest, there is no practical way to get back from the digest to the original number or string. However, the same number, if secure hashed repeatedly, will always result in the same message digest. This feature allows one to store data, a password or phone record for example, in a database without the original password or phone record being compromised. Given the original phone number or password, one can secure hash it and then compare it to the data in the database to find its matching hash. SHA-1, the most commonly used secure hashing algorithm, was designed by none other than the National Security Agency.
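The two properties described above (same input, same digest; digest stored in place of the original) can be seen in a few lines using Python's standard `hashlib` module. This is a minimal sketch: it uses SHA-256 rather than the SHA-1 mentioned in the text, and the phone numbers and the helper name `digest` are made up for illustration.

```python
import hashlib

def digest(phone_number: str) -> str:
    """Return the hex SHA-256 message digest of a phone number string."""
    return hashlib.sha256(phone_number.encode("utf-8")).hexdigest()

# What the database would hold instead of the number itself.
stored = digest("202-555-0143")

# Hashing the same number again reproduces the stored digest ...
print(stored == digest("202-555-0143"))   # True
# ... while a different number produces a different digest.
print(stored == digest("202-555-0144"))   # False
# The digest has a fixed length regardless of the input.
print(len(stored))                        # 64
```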
 
This new database maintained at the NSA, using secure hashed ids in lieu of phone numbers, would be just as effective for data mining and link analysis. If the NSA knows the phone number(s) of a known target or targets, it can simply convert each phone number to its secure hashed equivalent (or "message digest"). These message digests can then be used to perform link analysis on the database. Using the example in the image, the NSA would secure hash the phone number of Bad Guy #1 and look up the matching phone records in the database. It would find the message digests representing Persons A and B. When it looks up the records for the message digest of Person A, it would find the message digest of Person D. Similarly, coming from the other side, the NSA would secure hash the actual number for Bad Guy #2 and find the message digest of Person G. In looking at the records of Person G, the NSA would find the message digest of Person D. Then, Voila!, the NSA will have connected Bad Guy #1 to Bad Guy #2 without knowing the phone numbers of Persons A, D and G. Armed with the message digests of Persons A, D and G, the NSA can now approach the court for a warrant based on probable cause. The phone companies can then provide the NSA with the actual numbers and identities of Persons A, D and G by mapping the message digests back to the original phone numbers that the companies would keep in their own databases. The phone records of all other persons not on the chain between Bad Guy #1 and Bad Guy #2 will remain unknown to the NSA.
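The whole workflow can be sketched end to end under some assumptions: SHA-256 as the hash, made-up phone numbers, a plain breadth-first search in place of the two-sided traversal, and hypothetical names (`h`, `RAW_CALLS`, `telco_lookup`, `hashed_calls`, `chain`). It illustrates the proposal and nothing more.

```python
import hashlib
from collections import deque

def h(number):
    """Secure hash of a phone number (SHA-256 chosen for this sketch)."""
    return hashlib.sha256(number.encode("utf-8")).hexdigest()

# --- Phone company side (hypothetical numbers) ---------------------------
BAD1, A, B, D, G, BAD2 = ("202-555-0100", "202-555-0101", "202-555-0102",
                          "202-555-0104", "202-555-0107", "202-555-0199")
RAW_CALLS = [(BAD1, A), (BAD1, B), (A, D), (D, G), (G, BAD2)]

# The company keeps the digest -> number mapping to itself ...
telco_lookup = {h(n): n for pair in RAW_CALLS for n in pair}
# ... and hands over only records expressed as digests.
hashed_calls = [(h(a), h(b)) for a, b in RAW_CALLS]

# --- Analyst side: works with digests only -------------------------------
def chain(calls, src, dst):
    """Breadth-first search for a chain of digests linking src to dst."""
    graph = {}
    for a, b in calls:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None

# The analyst knows the targets' real numbers and hashes them.
path = chain(hashed_calls, h(BAD1), h(BAD2))
intermediates = path[1:-1]            # digests of Persons A, D and G

# --- After a warrant: the company resolves only those digests ------------
resolved = [telco_lookup[d] for d in intermediates]
print(resolved)
```

Person B's digest appears in the hashed records but never on the chain, so it is never submitted for resolution.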
 
This simple use of existing cryptographic techniques may eliminate the need for the massive intrusion into the privacy of ordinary Americans that is currently occurring. This solution allows the NSA to troll and mine to its heart's content in an attempt to keep us safe without violating our hard-earned civil liberties. Who knows, with any luck it will come to light that the NSA is already doing this and all this fuss will have been about nothing. However, the fact that Qwest balked at handing over phone records to the NSA suggests to me that the NSA is not using this simple but effective technique.

10 Responses to Free Advice to the NSA: How To Pursue Terrorists And Protect Civil Liberties

  1. Ingrid says:

    Wow, I did not know about this. Every time I end up here (which is practically everyday when I do my blogger rounds) I learn something new if not something more detailed.
    So Mash, do you think you could email this to the NSA people? someone might remind them that it’s better pr…well in that case, cc the White House too..
    Ingrid

  2. Pingback: Or How I Learned to Stop Worrying » The Effects Of Drinking Too Much Kool-Aid

  3. Mash says:

    Ingrid, thanks, I’m thinking the NSA has already read this by now :)

  4. John says:

    Well, I guess one needn’t try to figure out how you got your name! You definitely MASHED the administration! Two great articles in two days. I have bookmarked your blog! Thanks for the post, albeit disturbing.

  5. dont underestimate the NSA says:

    Mash,
    Using a cryptographic hash function to allow the NSA to anonymously mine call trees does not really solve the problem. The space of all valid phone numbers in the USA is very small (on the order of a few billion). It would take the NSA less than a few hours to build a complete table mapping every phone number in the USA to its hash value. The hash-to-phone-number mapping is available to anyone, not just the telcos.
    A better approach would be for the telcos to encrypt the phone numbers with a secret key, rather than hash them. Of course, it won’t take much to force the telcos to hand over this secret key to the NSA.
    Cryptography cannot help if the telcos are willing to sacrifice their customers’ privacy at the altar of government contracts.

  6. Mash says:

    The secure hash was a baseline to start from. It can of course be brute forced, as you state. Someone on DKos suggested, much like you, that the telcos should keep the numbers and a mapped GUID corresponding to each number. That way, every time the NSA needed a number mapped, they would go to the telcos through the court. This makes some sense and avoids the brute force problem, because the full space of GUIDs could make brute forcing insurmountable. The problem here is that all the telcos would have to jointly set up a database of maps, since each telco would need to know what a phone number in another telco maps to. This, however, could be easily overcome by Congress mandating it.

    Even though the secure hash set can be fully mapped, I still think it is a plausible middle ground. This way the NSA does not have the phone numbers and would have to actively and willfully violate the law by creating the maps. This leaves no ambiguity if the law is broken. That was the appeal of the secure hash to me. Plus it has the added benefit of not having to change existing legislation.

  7. Mash says:

    Oh, almost forgot to mention some nifty ideas about hashing. You could make the secure hashes that much stronger and much more difficult (nearly impossible) to brute force by hashing not only the number but also the phone owner’s name and perhaps account number together into a hash.

    This makes it very very difficult to brute force since you now are not limited by the total number of phone numbers in creating your maps.

    Come to think of it, I like this idea a lot \:d/

  8. Pingback: graphpaper.com - NSA Data Mining 4: Total Information Awareness, Resurrected

  9. Pingback: Or How I Learned to Stop Worrying » General Hayden Moves The Line And Spends A Bundle
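The brute-force concern raised in comment 5, and the keyed and GUID alternatives discussed in comments 5 and 6, can be sketched as follows. The assumptions are significant: the "number space" here is a single made-up exchange of 10,000 numbers rather than the roughly 10^10 possible ten-digit numbers, HMAC-SHA-256 (a keyed hash) stands in for the "encrypt with a secret key" suggestion, and all names (`plain_hash`, `keyed_hash`, `build_table`, `telco_key`, `guid_map`) are hypothetical.

```python
import hashlib
import hmac
import secrets
import uuid

def plain_hash(number):
    """Unkeyed secure hash, as in the original proposal."""
    return hashlib.sha256(number.encode("utf-8")).hexdigest()

def keyed_hash(number, key):
    """Keyed hash (HMAC-SHA-256): useless to anyone who lacks the key."""
    return hmac.new(key, number.encode("utf-8"), hashlib.sha256).hexdigest()

def all_numbers():
    """Toy number space: one hypothetical exchange, 202-555-0000..9999."""
    return (f"202-555-{i:04d}" for i in range(10_000))

def build_table(hash_fn):
    """Enumerate every number and record digest -> number."""
    return {hash_fn(n): n for n in all_numbers()}

target = "202-555-0143"

# 1. Plain hash: anyone can enumerate the space and invert the digest.
table = build_table(plain_hash)
recovered = table.get(plain_hash(target))
print(recovered)                      # the original number comes back

# 2. Keyed hash: a table built with the wrong key does not match.
telco_key = secrets.token_bytes(32)   # held by the phone company only
guess_key = secrets.token_bytes(32)   # an outsider's guess
outsider_table = build_table(lambda n: keyed_hash(n, guess_key))
keyed_recovered = outsider_table.get(keyed_hash(target, telco_key))
print(keyed_recovered)                # None, barring an astronomically unlikely collision

# 3. GUID mapping: ids are random, so nothing can be recomputed from
#    the number at all; only the company's map links the two.
guid_map = {n: str(uuid.uuid4()) for n in all_numbers()}
print(guid_map[target] != guid_map["202-555-0144"])
```

In both the keyed and the GUID variants the protection rests entirely on the phone companies keeping the key or the map to themselves, which is the point comment 5 ends on.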
