“PhotoDNA” – How reliable is it?

No one can doubt that sexual crimes against children are among the most horrific imaginable. And given the obsession that lawmakers have with sex crimes and the punishment of anyone associated in any way with them, it’s not surprising that the market has responded with the creation of new technologies designed to help catch the people involved in Internet child pornography. A Swedish company, NetClean, has been quietly amassing a huge database of over 400,000 images and videos of what purports to be child pornography. They don’t clarify how they define this or how they determine if the original photos are of real children or simply digital creations, but they have combined forces with Microsoft’s “Digital Crimes Against Children” group (did you know that Microsoft had such a group?) and produced something they call “PhotoDNA”, designed to help law enforcement catch the bad guys creating, distributing and viewing this stuff.

The goals are laudable, but because this is a one-sided effort (Microsoft makes this available to law enforcement only, and not to the defense), there is no way for an objective observer to determine the answer to many critical questions. Capitalizing on public perceptions that DNA is infallible and scientific, the name alone creates presumptions that cannot be tested by the defense. The technology purports to identify a photograph’s “digital fingerprint” based on a hash algorithm, which can then be applied to other suspect images to see if the suspect image is derived from, or identical to, the original image. Here’s what Microsoft says about the process:

“PhotoDNA uses a mathematical technique known as robust hashing that works by calculating a unique signature into a ‘hash’ that represents the essence of a particular photo. In the same way that the characteristics of every person’s DNA are different, the signature or ‘hash value’ for every photo is different, enabling the creation of a hash that can identify an image based on its unique characteristics or its ‘digital DNA.’ Although a photo’s hash cannot be used to re-create an image or identify people or items within an image, it can be compared with hashes of other photos as a reliable way to match two different copies of the same image”.
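To make the idea of “robust hashing” concrete, here is a toy sketch. It is emphatically not PhotoDNA's algorithm, which, as noted above, is not available for outside inspection. It swaps in a much simpler, well-known technique (an “average hash”) applied to made-up 4x4 grayscale grids. The function names, the pixel values, and the match threshold are all assumptions chosen for illustration.

```python
import hashlib

def average_hash(pixels):
    """Toy robust hash: one bit per pixel, 1 if brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)

def hamming(a, b):
    """Number of bit positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))

def sha256_of(pixels):
    """Ordinary cryptographic hash of the raw pixel bytes, for contrast."""
    return hashlib.sha256(bytes(p for row in pixels for p in row)).hexdigest()

# Made-up 4x4 grayscale "images" (values 0-255); purely illustrative.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [220, 230, 30, 40],
    [225, 235, 35, 45],
]
# A copy of the same picture, slightly brightened.
brightened = [[min(255, p + 5) for p in row] for row in original]
# A different picture altogether.
unrelated = [
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
]

# Hypothetical cutoff: distances at or below this count as a "match".
THRESHOLD = 2

dist_copy = hamming(average_hash(original), average_hash(brightened))
dist_other = hamming(average_hash(original), average_hash(unrelated))

print("altered copy distance:", dist_copy, "match:", dist_copy <= THRESHOLD)
print("unrelated distance:", dist_other, "match:", dist_other <= THRESHOLD)
print("SHA-256 identical for altered copy:",
      sha256_of(original) == sha256_of(brightened))
```

The altered copy still matches under the toy robust hash even though its ordinary cryptographic hash changes completely. The flip side is that a “match” here means “close enough under some threshold,” and someone has to choose that threshold. Wherever a tolerance exists, so does the possibility of a false match, which is why the error-rate question matters.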

Several important questions arise. First, have scientific validation studies been performed that have produced a reliable error rate for this technology? Even a DNA scientist will tell you about error rates. It is impossible to judge the value of a given technology without knowing the error rate produced when that technology is used in the field. Second, the technology claims to identify an image as originating with another image based on the similar hash value, but how do we know that the original image is authentic? What if that original image was photoshopped to begin with? And since this whole thing is driven by Microsoft software, does anyone really believe that the software implementation of the hash algorithms is 100% bug-free?
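Why the error rate matters so much can be shown with simple arithmetic. The sketch below uses entirely hypothetical numbers (no published PhotoDNA error rate is cited in this post, and none is assumed here): ten million images scanned, one in 100,000 of them actually contraband, a tool that catches every real one, and a false-match rate of one in 100,000. The function and parameter names are illustrative only.

```python
def flagged_counts(images_scanned, prevalence, hit_rate, false_match_rate):
    """Expected true and false matches under hypothetical rates."""
    illegal = images_scanned * prevalence
    innocent = images_scanned - illegal
    true_hits = illegal * hit_rate               # real images correctly flagged
    false_hits = innocent * false_match_rate     # innocent images wrongly flagged
    return true_hits, false_hits

# All four inputs are made-up values for illustration, not measured figures.
true_hits, false_hits = flagged_counts(
    images_scanned=10_000_000,
    prevalence=1e-5,
    hit_rate=1.0,
    false_match_rate=1e-5,
)
wrong_share = false_hits / (true_hits + false_hits)

print(f"expected true matches:  {true_hits:.0f}")
print(f"expected false matches: {false_hits:.0f}")
print(f"share of matches that are wrong: {wrong_share:.1%}")
```

Under these assumed numbers, a tool that sounds 99.999% accurate would still be wrong about roughly half of the images it flags, because innocent images vastly outnumber guilty ones. Whether the real figure is better or worse is exactly what cannot be known without published validation data.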

The problem is compounded by the fact that neither Microsoft nor NetClean makes any of this technology available to the defense, just to law enforcement. Why on earth would anyone interested in getting at the TRUTH want to withhold it from the only community that is charged with keeping the government honest? It’s shameful when companies discriminate against the defense and demonstrate their indifference to the well-documented reality that innocent people ARE convicted of crimes they did not commit. That’s what real DNA has taught us. This faux DNA, which seeks to ride the coattails of real science, is a good start at helping eradicate a very real problem in our society, but it becomes dangerous when its advocates refuse to shine the harsh light of scientific scrutiny on it. What are they afraid of?

See the joint Microsoft-NetClean video on PhotoDNA here.

– RP


Still in trial

The defense will be resting this week (Wed or Thu) and I expect a verdict will be returned shortly after the jury gets the case, probably by end of day Friday. We are out of session today and resuming tomorrow. Until the verdict is in, you know where my thoughts will be. Thanks.

– RP

Out to Trial …

Just a quick note to let folks know that I’ve started a four-week murder trial that will occupy every moment of my spare time between now and the verdict. Until then, thanks for reading and I’ll see you on the other side.

– RP

“Predictive Policing” – Geographical Profiling?


They’re calling it Predictive Policing, but is it really just another form of statistical profiling? Police in New York and Los Angeles are using software called (predictably) “PredPol” as part of their daily briefing to help locate crime “hot spots” for the day. The software creates a map of the city being watched, marking it up with small red 500-by-500-foot squares where crimes are “likely” to happen. The software claims to use more than just a database of past crimes and adds what it calls “sociological information” to help forecast likely spots where cars may be stolen, houses burglarized or people mugged. The company does not elaborate on what kind of “sociological information” is worked into the algorithms, but the term “sociological” implies that assumptions about the behaviors of people are being woven into the mix at some point.

At first blush, this sounds like a reasonable way to gain insights from past data about where crimes are “likely” to happen. And to the extent that police presence deters crime, this could have beneficial effects. But it’s also true that arrests can only occur where police are present, which suggests that any area where police are concentrated is going to have more arrest activity. Over time, will this become a self-fulfilling prophecy as police make more and more arrests in the areas that have been targeted based on “sociological information”, which will in turn give more statistical weight to these areas as crime “hot spots”?
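The self-fulfilling prophecy can be sketched with a toy model. This is not how PredPol works internally (the company does not disclose that); it is a deliberately simple simulation under stated assumptions: two areas with identical underlying crime, a small chance difference in past recorded arrests, and a rule that sends most patrols to whichever area currently shows more recorded arrests. The function name, patrol counts, and starting figures are all hypothetical.

```python
def simulate(recorded, rounds, patrols_hot=8, patrols_other=2,
             arrests_per_patrol=1):
    """Toy feedback loop: the area with more recorded arrests gets more patrols.

    Both areas are assumed to have the SAME true crime rate, so each patrol
    produces the same number of arrests wherever it is sent.
    """
    recorded = dict(recorded)  # copy so the caller's data is not modified
    history = []
    for _ in range(rounds):
        hot = max(recorded, key=recorded.get)  # current "hot spot"
        for area in recorded:
            patrols = patrols_hot if area == hot else patrols_other
            recorded[area] += patrols * arrests_per_patrol
        history.append(dict(recorded))
    return history

# Hypothetical starting data: area A happens to have two more past arrests.
start = {"A": 12, "B": 10}
history = simulate(start, rounds=5)
final = history[-1]

start_share = start["A"] / sum(start.values())
final_share = final["A"] / sum(final.values())
print(f"A's share of recorded arrests: {start_share:.0%} -> {final_share:.0%}")
```

In this model, area A's share of recorded arrests grows round after round even though nothing about the underlying crime differs between the two areas. The data ends up measuring where the police were, not where the crime was.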

When applying for search warrants (which must be based on “probable cause”), police often cite as a basis that the area to be searched is in a “high crime area”. Don’t be surprised to see search warrants start reciting that the area to be searched was targeted by PredPol as an area where crimes are “likely” to occur. When that happens, we will be one step closer to diluting the notion of a “detached magistrate” and relying on computers to make the independent judgments that need to be made in these highly subjective calls.

And just wait till real estate agents start using this data to steer people away from crime “hot spots”. Think about that next time you consider reporting a crime to police.

For more, see this story.


Hello, World!


Robert Perez, Esq.

Defensology was created in July of 2006 as a forum for the discussion of issues related to the intersection of technology and criminal defense.  My name is Robert Perez. I’m a trial lawyer and I practice criminal law in the greater Seattle region along with my daughter, Sarah J. Perez. You can read about our practice on our website. As a software engineer and trial lawyer, I have a great interest in this topic and have always enjoyed writing about it. The blog was a great outlet for highlighting trends of interest and the competing fears and “ooooos and ahhhhs” over the emerging capabilities of science (and junk science) as applied to law.

Due to the heavy demands of a growing practice, the site had to be retired after a brief but flamboyant lifetime. I’m pleased to be back writing again (no, the business hasn’t slowed, but I have better help now). Thanks to all of you who have supported the effort and bugged me to get back at it. It’s going to be fun and there is even more to talk about than ever.

If you wish to contact me directly, please leave a message at www.RobertPerezLaw.com on the Contacts page.


– RP