Euro Security Watch with Mathew J. Schwartz

Data Breach Collection Contains 773 Million Unique Emails

2.7 Billion Email/Password Combo List Available for Credential Stuffing, Troy Hunt Warns
Troy Hunt's free Have I Been Pwned breach notification service has been updated with 22 million new, unique email addresses contained in a massive collection of 773 million email/password combinations now in circulation.

Editor's Note: This blog has been updated with comments from Alex Holden.

Is there any email address left that hasn't been leaked in a data breach?

On Thursday, Australian information security expert Troy Hunt warned that a collection of email address and password combinations currently in circulation contains 2.7 billion rows.

He says the massive collection of breached data, called "Collection #1," appears to have been compiled from a hodgepodge of sources, and contains 773 million unique email addresses.

"It's made up of many different individual data breaches from literally thousands of different sources," Hunt writes in a blog post.

Hunt runs the free Have I Been Pwned service, which enables users to register their email address and receive an alert anytime the email shows up in a data dump that Hunt loads into the service. He says that of the 2.2 million email addresses that users have registered with Have I Been Pwned, about 768,000 of them appear in the Collection #1 breach, and thus his service is sending out that many notifications to affected users.

The name for the collection comes from the name of the root folder storing all of the data, which is contained in more than 12,000 files and totals 87 GB of data. Hunt says he was alerted to the existence of the collection, which was available via the MEGA file-sharing service - it's been removed - and which has since been shared on at least one hacking forum.

Hunt says a contact shared the details of a hacking forum where the collected data, first disseminated via MEGA, was being shared. (Source: Troy Hunt)

Hunt uploaded to Pastebin the names of the more than 2,000 databases that attackers allegedly hacked to compile Collection #1.

"I need to stress 'allegedly,'" Hunt says. "I've written before about what's involved in verifying data breaches and it's often a nontrivial exercise."

While Collection #1 contains 773 million unique email addresses, Hunt says most of them have already appeared in past data breaches. After analyzing all of the data in Collection #1, he says only about 22 million of the email addresses had not already been added to Have I Been Pwned from earlier large data dumps that appeared online.
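To put those figures in proportion, here is a small arithmetic sketch in Python. It uses only the rounded numbers quoted above, so the results are approximate; the variable and function names are illustrative, not anything from Hunt's tooling.

```python
# Rounded figures as reported in this article.
total_rows = 2_700_000_000      # email/password rows in Collection #1
unique_emails = 773_000_000     # unique email addresses in the collection
new_emails = 22_000_000         # addresses not previously in Have I Been Pwned
hibp_subscribers = 2_200_000    # addresses registered for notifications
subscribers_hit = 768_000       # registered addresses found in the collection


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


new_share = share(new_emails, unique_emails)
subscriber_share = share(subscribers_hit, hibp_subscribers)
rows_per_email = round(total_rows / unique_emails, 1)

print(f"New to Have I Been Pwned: about {new_share}% of unique addresses")
print(f"Registered users affected: about {subscriber_share}%")
print(f"Average rows per unique address: about {rows_per_email}")
```

On these rounded inputs, roughly 3 percent of the addresses are new to the service, while about one in three registered users appears in the collection.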

Old Data Still Works for Phishing, Spam

The data in Collection #1 was originally put up for sale in October 2018 for about $60, says Alex Holden, CISO at Hold Security, a Wisconsin-based consultancy. It was being offered alongside four more collections, which combined with Collection #1 comprised a staggering 993 GB of data, he says.

Holden says his company analyzed all five sets and concluded that most of the data is old. For example, more than 99 percent of the credentials within Collection #1 have been seen before, he says, which makes the collection less useful for attacks.

Large companies, such as Facebook and LinkedIn, will have tuned their intrusion prevention systems to detect login attempts based on the credentials on the lists, Holden says. Lesser-skilled hackers may embrace the credentials in Collection #1, but that's a risky gambit.

"Using bad data is basically a good way to get caught," he says. "Amateurs may be using it, but professionals are staying away from it."

Unfortunately, the hundreds of millions of unique email addresses remain very usable for running spam campaigns, phishing attacks or even so-called "sextortion" campaigns.

"This is a phishing paradise," Holden says.

Credential-Stuffing Toolkit

Another likely use for all of this data is credential stuffing, which is the practice of taking username/password combinations obtained from one source and trying them on other websites to see where they work.

If an individual reuses the same email address and password combination on multiple sites, so can attackers.

Last week, for example, many people suspected that streaming service Spotify had suffered a breach, because of lists of "Spotify" usernames and passwords that were being published to text-sharing sites such as Pastebin.

But Hunt said that based on the passwords being shared, it was unlikely Spotify had been hacked. Rather, it appeared that attackers had been using lists of email addresses obtained elsewhere and testing various passwords - especially weak ones - to see which ones gave them access to accounts.

Billions and Billions of Combos

Top 10 biggest data breaches for which leaked usernames and passwords have been loaded onto Have I Been Pwned as of Jan. 17, 2019

In an interview last summer at the Infosecurity Europe conference in London, Hunt said the problem was that too many people pick a favorite password and then reuse it across multiple sites. As a result, if site A got hacked, attackers could take the stolen credentials and use them to access someone's account at site B, even though site B was never hacked (see: Credential Stuffing Attacks: How to Combat Reused Passwords).

"This is where I'm a little bit sympathetic," Hunt told me. "This website B didn't necessarily do anything wrong, but now they've got to deal with the risk of ... an attacker logging in with a victim's credentials, and that's a really hard problem."

Billions of these email/password combos are now in circulation.

To help, Hunt has created a service called Pwned Passwords, which can be used via API or downloaded. Sites can use it to see if a password that a user is trying to pick has been seen in previous breaches. If so, the site or service - such as a password manager - can recommend that a user pick a different password.
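As a rough illustration of how a site might run such a check, here is a minimal Python sketch. It assumes the publicly documented Pwned Passwords "range" endpoint at api.pwnedpasswords.com, which takes the first five hex characters of a password's SHA-1 hash and returns matching hash suffixes with counts, one `SUFFIX:COUNT` pair per line. That endpoint and response format are not described in this article, so verify them against the service's current documentation. The function names are hypothetical.

```python
import hashlib
import urllib.request

# Assumed endpoint, taken from the service's public API documentation.
RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password: str) -> tuple[str, str]:
    """Return the SHA-1 hex digest split into a 5-char prefix and the rest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(body: str, suffix: str) -> int:
    """Find our suffix in a 'SUFFIX:COUNT' response body; 0 if absent."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix and count.strip().isdigit():
            return int(count)
    return 0


def pwned_count(password: str, timeout: float = 10.0) -> int:
    """Ask the range endpoint how often this password appears in breaches.

    Only the 5-character hash prefix leaves the machine; the password and
    the full hash are never sent.
    """
    prefix, suffix = split_hash(password)
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "pwned-check-sketch"},  # illustrative value
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return count_in_range(body, suffix)
```

A signup form could call `pwned_count(candidate)` and, if the result is greater than zero, suggest that the user pick a different password. A production version would also need error handling and sensible behavior when the service is unreachable.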

Solution: Password Managers

Hunt says the obvious takeaway from the Collection #1 data breach is that everyone should be using a different password for every different site or service they use. That way, if one of those sites gets breached - and they get a notification that their username/password combo was pwned - they need only change that one password.

"If you're in this breach and not already using a dedicated password manager, the best thing you can do right now is go out and get one," Hunt says. "A password manager provides you with a secure vault for all your secrets to be stored in - not just passwords; I store things like credit card and banking info in mine too - and its sole purpose is to focus on keeping them safe and secure."

He's not alone in recommending that everyone use a password manager to create and store unique passwords for every different service.

"I also personally recommend using a password manager - it means you can have truly strong passwords that you'll never remember, which are always the best passwords," cybercrime expert Alan Woodward, a computer science professor at the University of Surrey, told me last week. "Yes, password managers can be a single point of failure, but choose a really strong password for that and you have to remember only the one."

Woodward offered his advice in the wake of a report that many German politicians and celebrities regularly used weak passwords for their email accounts, social networks and cloud service subscriptions (see: Why Are We So Stupid About Passwords? German Edition).
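The "truly strong passwords that you'll never remember" that Woodward describes are straightforward to generate. The following Python sketch uses the standard library's `secrets` module to create a different random password for each site. It is only an illustration of the idea, not how any particular password manager works: the function names, the 20-character default and the 12-character floor are assumptions, and a real manager would encrypt the stored vault under the master password rather than hold it in plain memory.

```python
import secrets
import string

# Character pool: letters, digits and punctuation. Some sites restrict
# which symbols they accept, so this pool is an assumption.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    if length < 12:
        raise ValueError("use at least 12 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def build_vault(sites: list[str], length: int = 20) -> dict[str, str]:
    """Map each site name to its own independently generated password."""
    return {site: generate_password(length) for site in sites}


vault = build_vault(["site-a.example", "site-b.example"])
for site, password in vault.items():
    print(site, len(password))
```

Because each password is generated independently, a breach at site A reveals nothing that works at site B, which is the property credential stuffing depends on breaking.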

Or At Least Keep a Notebook

Even if you're not a politician or celebrity, however, you're still at risk. That's why Hunt says the need to use unique, relatively strong passwords trumps almost all other online security considerations.

"If a digital password manager is too big a leap to take, go old school and get an analogue one (aka, a notebook)," he says. "It might be contrary to traditional thinking, but writing unique passwords down in a book and keeping them inside your physically locked house is a damn sight better than reusing the same one all over the web."

Managing Editor Jeremy Kirk also contributed to this blog.

About the Author

Mathew J. Schwartz

Executive Editor, DataBreachToday & Europe, ISMG

Schwartz is an award-winning journalist with two decades of experience in magazines, newspapers and electronic media. He has covered the information security and privacy sector throughout his career. Before joining Information Security Media Group in 2014, where he now serves as executive editor of DataBreachToday and for European news coverage, Schwartz was the information security beat reporter for InformationWeek and a frequent contributor to DarkReading, among other publications. He lives in Scotland.
