"Disclaimer Explainer" -- The Discovery Files

The Discovery Files
The Discovery Files podcast is available through iTunes or you can add the RSS feed to your podcast receiver. You can also access the series via AudioNow® by calling 641-552-8180 on any telephone.

A team of scientists, led by Carnegie Mellon University, used artificial intelligence (AI) algorithms to crawl the privacy policies of 7,000 of the most popular websites and identify those that contain language about data collection and use, third-party sharing, data retention and user choice, among other privacy issues. The project website lets people navigate machine-annotated privacy policies and jump directly to the statements that interest them, including those often buried deep in the text.

Credit: NSF/Karson Productions

Audio Transcript:

Long story short.

I'm Bob Karson with the Discovery Files, from the National Science Foundation.

"Click if you have read and understand our privacy policy."
OK, I admit I have played "privacy policy roulette" (Sound effect: roulette wheel) by clicking through with no idea what permissions I have just given some company. Even if I tried to be good and actually read these sometimes vague and hard-to-understand legal disclaimers, (Sound effect: stopwatch ticks) experts estimate it would take me 244 hours a year.

There's a new interactive website -- part of the Usable Privacy Policy Project -- that uses crowd-sourcing, machine learning, and natural language processing techniques to separate the mumbo (Sound effect: sound) from the jumbo (Sound effect: elephant), and show just the things users want to know about. The site's called

It's part of a collaborative effort led by a team at Carnegie Mellon University that trained artificial intelligence (AI) by having human law students manually annotate 115 privacy policies. The AI learned from that, then crawled 7,000 of the most popular sites' policies for language on data collection and use, third-party sharing and other privacy issues.

The team hopes to turn this all into a simple browser plug-in that would provide users with a personalized summary of issues they're most likely to care about. (Sound effect: shower curtain, shower in bg) Now, if you don't mind, I'd like some privacy.

"The discovery files" covers projects funded by the government's National Science Foundation. Federally sponsored research -- brought to you, by you! Learn more at or on our podcast.

General Restrictions:
Images and other media in the National Science Foundation Multimedia Gallery are available for use in print and electronic material by NSF employees, members of the media, university staff, teachers and the general public. All media in the gallery are intended for personal, educational and nonprofit/non-commercial use only.

Images credited to the National Science Foundation, a federal agency, are in the public domain. The images were created by employees of the United States Government as part of their official duties or prepared by contractors as "works for hire" for NSF. You may freely use NSF-credited images and, at your discretion, credit NSF with a "Courtesy: National Science Foundation" notation. Additional information about general usage can be found in Conditions.

NSF podcasts are in MP3 format for easy download to desktops and laptops, as well as mobile devices capable of playing them.