Mining publicly accessible online data—a tactic used by data aggregators to compile information and by sourcers to unearth candidate leads—is legal, a U.S. appeals court has ruled for the second time.
The 9th U.S. Circuit Court of Appeals on April 18 reaffirmed its 2019 decision that the Computer Fraud and Abuse Act (CFAA), a federal anti-hacking law that prohibits unauthorized access to computer systems, does not apply to websites that are open to the public.
Specifically, the court ruled in hiQ Labs, Inc. v. LinkedIn Corp. that global professional networking site LinkedIn could not block San Francisco-based data analytics firm hiQ Labs from accessing publicly visible LinkedIn member profiles in order to analyze them and prepare employee attrition reports.
The ruling is the latest in a long-running legal battle going back to 2017 when LinkedIn sent hiQ a cease-and-desist letter to prevent it from accessing the site, arguing that scraping data would violate users' privacy and the terms of the site. LinkedIn has previously sought to stop other companies from mining data from its site.
"We're disappointed in the court's decision," said LinkedIn spokesperson Greg Snapper in a statement. "On LinkedIn, our members trust us with their information, which is why we prohibit unauthorized scraping on our platform. When your data is taken without permission and used in ways you haven't agreed to, that's not okay."
At a district court hearing in 2017, counsel for hiQ implied that one reason LinkedIn moved against the web scraper was that LinkedIn was launching its Talent Insights product, which harvests and analyzes workforce data from member profiles, including retention data, which "is essentially the same or very similar to [hiQ's product]."
"As a reminder, … there is no federal law that expressly prohibits the practice of data scraping, or extracting data from websites," said Shing Tse, an attorney in the Houston office of Squire Patton Boggs. "As such, parties seeking to challenge the practice rely on statutes that predate the prevalence of data scraping. One such statute is the CFAA, which forbids individuals from intentionally accessing a protected computer without authorization or exceeding authorized access."
The applicability of the CFAA is the central issue in the case, Tse said. "The crux of hiQ's complaint was that LinkedIn did not have monopoly rights to personal data made publicly available by its users, and that by scraping its website, hiQ did not violate users' privacy rights as LinkedIn alleges," he said.
Legal Background
The 9th Circuit first ruled against LinkedIn in 2019, finding that the CFAA does not bar anyone from scraping data that's publicly accessible. The U.S. Supreme Court ordered the appeals court to vacate its ruling and reconsider the case in light of its own June 2021 CFAA ruling in Van Buren v. United States.
In its ruling, the Supreme Court narrowed what constitutes a violation of the CFAA, limiting it to those who gain unauthorized access to a computer system and rejecting a broader interpretation of "exceeding existing authorization" that would include breaking website terms-of-service agreements.
"The Supreme Court held in Van Buren that an individual who has legitimate access to a computer network but accesses it for an improper or unauthorized purpose does not violate the CFAA," Tse said. "Prior to Van Buren, several circuit courts held that terms-of-service violations could implicate the CFAA. In rejecting this broad interpretation of the CFAA, the court noted that such an interpretation 'would attach criminal penalties to a breathtaking amount of commonplace computer activity.' "
The 9th Circuit's latest decision relied on the Supreme Court's determination in Van Buren that when information is publicly accessible, no authorization to use that data is required.
The appellate court distinguished between access to publicly available profile information on LinkedIn, which cannot be "unauthorized," and access to information on sites that are restricted to users who sign in with a username and password.
Tse said that what it boils down to is that companies that maintain publicly available information on their websites cannot rely on the CFAA to prohibit others from scraping that data, even if the companies subsequently revoke access to the information, or if data scraping is a violation of the websites' terms of use. "Companies must require prior authorization, such as a username and password—to access the data in the first instance—in order for scraping of that data to be actionable under the CFAA," he said.