Cloudflare & You: A Little Known Website May Have Exposed Your Data

As an IT person, there’s a point when it seems like there isn’t much that can really shake up the industry. And then something comes along that does. From exciting new hardware to frightening new viruses, there does seem to occasionally be that thing. The current thing has been dubbed Cloudbleed, in reference to a memory leak in Cloudflare’s code that hemorrhaged out data.

Who is Cloudflare? What do they do?

Cloudflare is a Content Delivery Network (CDN) service provider.

Cloudflare CDN vs without CDN comparison map showing 31ms vs 9ms averaged page load times for users in various locations in the US.

Cloudflare’s description of how their CDN service works and the improvement it makes in website load times.

CDNs are used to help websites operate globally by replicating a site to servers all over the world. Essentially, you pay a CDN to copy your website so that a user in Dubai has the same experience as a user in New York, NY or someone three blocks from your server. One CDN server may host multiple sites, and there’s no real division between the sites on that server. They don’t have a “kids site” server, a “porn site” server, a “business site” server, a “wholesome non-profit” server, etc. All of it is stored together.

What happened?

The cause was a memory leak that affected as many as 1 in about 3.3M requests to their service. When it triggered, the response to that request included whatever happened to sit in the neighboring server memory at that moment. So usernames, passwords, dirty pictures, Social Security and credit card numbers, anything the server was handling at that moment could have been passed back as part of the response to someone else’s request.

This is one of the oldest coding problems there is. What is being called a memory leak here is, more precisely, a buffer overrun: a process reads past the bounds of its allocated memory space, not unlike a child climbing over a child gate. People put up child gates to keep the child out of danger or to keep things, like pets, away from the child, and memory is handled the same way. Memory space for one process (or one request) should never touch another unless you specifically intend for that to happen. Otherwise, the results can be catastrophic, as seen in this issue.
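To make the child gate idea concrete, here is a minimal sketch in Python. It is a toy model, not Cloudflare’s code: the memory contents, the read_page function, and the skip-a-byte rule are all made up for illustration. It is loosely modeled on the publicly reported root cause, where the end of a buffer was checked with “equal to” rather than “greater than or equal to”, so a parser that jumped over the end never noticed.

```python
# Toy model: one flat block of "server memory" holding two unrelated requests.
# Everything here (contents, names, the skip rule) is hypothetical.
PAGE = b"<b>hi</b><"                      # request 1: a page ending in a stray '<'
SECRET = b"user=alice&password=hunter2"   # request 2: someone else's login
MEMORY = PAGE + SECRET
PAGE_END = len(PAGE)                      # the "child gate" for request 1


def read_page(memory: bytes, start: int, end: int, strict: bool) -> bytes:
    """Copy bytes from start toward end; a '<' makes the parser skip one extra byte."""
    out = bytearray()
    pos = start
    while pos < len(memory):
        # strict=True stops once pos reaches OR passes the end of the buffer.
        # strict=False stops only if pos lands exactly on the end.
        reached_end = (pos >= end) if strict else (pos == end)
        if reached_end:
            break
        out.append(memory[pos])
        # The final '<' sits on the last byte, so this jump steps over the end.
        pos += 2 if memory[pos] == ord("<") else 1
    return bytes(out)


leaky = read_page(MEMORY, 0, PAGE_END, strict=False)
safe = read_page(MEMORY, 0, PAGE_END, strict=True)
print("buggy check returned:", leaky)
print("safe check returned: ", safe)
```

With the buggy check, the response for the first request runs on into the second request’s data. With the safe check, it stops at the gate.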

1 in 3.3M doesn’t sound catastrophic, what’s the big deal?

Loading a webpage can easily involve hundreds of requests, and Cloudflare has LOTS of users. A request is the term that describes every single time your computer connects to a website to obtain content or perform an action. Just loading a website involves a request for the page, a request for every image on the page, etc. A request also occurs every time that you submit data (fill out a form, make a post, etc.) on a website. Now multiply that by millions and millions of requests over millions of websites, and add the fact that this issue has been happening since September 2016. YIKES!
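A quick back-of-the-envelope calculation shows how fast a 1 in 3.3M rate adds up. Only the 3.3M figure comes from the reporting above; the traffic numbers below are hypothetical placeholders, not Cloudflare statistics, so swap in your own guesses.

```python
def expected_leaky_responses(total_requests: float, one_in: float = 3_300_000) -> float:
    """Expected number of affected responses at a rate of 1 in `one_in` requests."""
    return total_requests / one_in


# Hypothetical traffic assumptions, for illustration only.
requests_per_page = 100          # "hundreds of requests" per page load, rounded down
page_views_per_day = 50_000_000  # assumed page views per day across all sites
days = 150                       # assumed length of the exposure window

per_day = expected_leaky_responses(requests_per_page * page_views_per_day)
total = per_day * days
print(f"about {per_day:,.0f} leaky responses per day, {total:,.0f} over {days} days")
```

Even with these modest made-up numbers, a "rare" event happens more than a thousand times a day.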

Google presentation on how search engines work.

Additionally, the internet is a collection of millions of sites and trillions of pages, and search engines are constantly crawling through that data to index it and make it searchable. For a more in-depth explanation, Google’s Inside Search – How Search Works gives a good overview of how search engines help you find the things you are looking for online.

CDNs serve the pages that search engines index, which means search engines have also stored copies of responses that contained leaked memory. Currently search companies like Google, Yahoo, Bing, DuckDuckGo, etc. are working to find and remove leaked data.

Do I have your attention? Good!

You may remember the big news about Heartland’s data breach. A company that no one had really heard of, one that processed credit card transactions, was suddenly major news due to a data breach. Hackers had accessed and stolen credit card data from the company, and their clients included TJ Maxx and Marshall’s. This is like that, only several orders of magnitude larger. Cloudflare does for webpages what Heartland did for credit cards: it handles the data. What makes them different is that Heartland’s issue allowed hackers to access only their database, a neat and tidy intrusion that could be quantified and traced. Cloudflare’s issue was not an intrusion. It was an error in their systems that randomly spewed data like a busted water main. It’s going to be next to impossible to nail down exactly whose data is affected, what was exposed, how much of that has been stored, and how much has been, can be, or will be used.

The big deal is that search engines index site content. They index this content by sending requests to servers, storing the results, and then parsing them to make them searchable. That, in a nutshell, is a “search engine”. Is a search engine more complicated than that? ABSOLUTELY. But that’s the gist of what one does. Search engines generate TONS of requests and store TONS of data. This bug has potentially compromised TONS of data, including pictures, emails, posts, messages, etc. that were being handled for sites by Cloudflare. The really big problem is in not knowing what data was exposed, so one has to assume (for data security purposes) that anything these sites hosted has potentially been exposed. And depending on the purpose of these sites, the potential exposure could be massive.
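The request, store, parse loop described above can be sketched in a few lines of Python. This is a deliberately tiny toy: FAKE_WEB and fetch are hypothetical stand-ins for real sites and real HTTP requests, and no real search engine works this simply. The point is only that whatever comes back in a response gets stored and becomes searchable, leaked data included.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for the web: URL -> response body.
# The second response pretends to contain leaked memory from another request.
FAKE_WEB = {
    "https://example.com/club": "Welcome to our knitting club",
    "https://example.com/schedule": "Club schedule for March password=hunter2",
}


def fetch(url: str) -> str:
    """Pretend to send a request; a real crawler would use HTTP here."""
    return FAKE_WEB[url]


def build_index(urls):
    """Request each page, store the response, and index its words."""
    stored = {}                # cached copies of every response
    index = defaultdict(set)   # word -> set of URLs containing it
    for url in urls:
        body = fetch(url)
        stored[url] = body
        for word in re.findall(r"[a-z0-9]+", body.lower()):
            index[word].add(url)
    return stored, index


stored, index = build_index(FAKE_WEB)
print(sorted(index["club"]))     # both pages mention the club
print(sorted(index["hunter2"]))  # the leaked value is now searchable too
```

This is why the cleanup falls on the search companies as well: the leaked bytes live on in their stored copies until those copies are found and purged.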

How Do You Use the Affected Site?

What you need to consider, in addition to the effect on your personal information and your personal online security, is what you use the affected site for. If you use an affected site for sending emails or letters on your behalf to a large group of users, if you use it to store information regarding other users within your organization, if you use that site to store information about clients or people you serve, that data may have been exposed. Perhaps you keep a list of donors for a charity. Maybe you use it to track information for an upcoming project for your company, something that isn’t public knowledge yet. There are lots of ways in which compromising private data can have a reach much further than the individual.

For example, let’s say you use an affected site to store a contact list (name, address, phone numbers, emails, etc.) for a group event or club. Not only is your personal information, like your email and password, potentially exposed, so is everything you were doing on the site. That entire list may have been compromised, and you may need to warn the people on it of the exposure (much like TJ Maxx and Marshall’s had to alert customers after the Heartland data breach).

What you should expect:

Expect emails referencing Cloudflare from sites that you use. You may also see things that you did on these sites exposed to the world. At this point, what is known is how little is known about the size and scope of the data affected.

What you should do:

CHANGE YOUR PASSWORDS!

I can’t say this enough. While this won’t stop the problem with the data that has gotten out, it will prevent any username/email address and password combinations from being used to compromise other sites you use. As many people use the same username/email address and password combination (meusername@mydomain.com or meusername in combination with mytotallyunguessablepassword1234) on multiple sites, this means any one exposure of a known good combination is a wide-open door for hackers. If this is the same username/email address and password combination that you use on major sites (Facebook, Twitter, Instagram, etc.) these sites are not only at risk, but are valuable targets for hackers. Any known good combination, in the wrong hands, will be tried on these sites.
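If you want to stop reusing one password everywhere, the simplest fix is a different random password for every site. Here is a minimal sketch using Python’s standard secrets module. The site list and the 20-character length are just example choices, and in practice a password manager does this for you and remembers the results.

```python
import secrets
import string

# Characters to draw from. Some sites reject certain punctuation,
# so trim this set if a site complains.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def new_password(length: int = 20) -> str:
    """Return a random password built with a cryptographically secure generator."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# Example sites only; use your own list.
sites = ["facebook", "twitter", "instagram"]
passwords = {site: new_password() for site in sites}

for site, password in passwords.items():
    print(site, password)
```

Because every site gets its own password, a combination exposed on one site opens no doors on the others.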
