LinkedIn is the world’s largest professional network with over 800 million members. With so much publicly available information on LinkedIn, it can be tempting for companies or individuals to scrape data from LinkedIn profiles and use it for their own purposes.
What is web scraping?
Web scraping is the automated collection of data from websites using bots, scripts or dedicated scraping tools that can extract large amounts of data quickly. The data may include text, images, videos, PDFs and more. Scrapers typically target data that is publicly visible on a website, in which case no login credentials are needed. The scraped data is then exported into a usable format such as a spreadsheet.
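As a rough illustration of the extract-and-export flow described above, here is a minimal Python sketch that parses a small, hard-coded HTML snippet and writes the result as CSV. The sample markup, the class names and the ProductParser and scrape_to_csv helpers are all hypothetical, chosen only for this example; a real scraper would fetch pages over HTTP and would need to respect the legal limits discussed in the rest of this article.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would fetch HTML over HTTP instead.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Widget</span><span class="price">9.99</span></li>
  <li class="product"><span class="name">Gadget</span><span class="price">24.50</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collects the text of <span class="name"> and <span class="price"> elements."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one dict per product
        self._field = None  # name of the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "product":
            self.rows.append({})
        elif tag == "span" and attrs.get("class") in ("name", "price"):
            self._field = attrs["class"]

    def handle_data(self, data):
        # Only keep text that sits inside one of the tracked spans.
        if self._field and self.rows:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape_to_csv(html):
    """Extract product rows from the HTML and return them as CSV text."""
    parser = ProductParser()
    parser.feed(html)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["name", "price"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(parser.rows)
    return out.getvalue()


print(scrape_to_csv(SAMPLE_HTML))
```

The CSV text can be saved to a file and opened in any spreadsheet application.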
Some common uses of web scraping include:
- Price monitoring – Track prices for products or services on ecommerce sites.
- Lead generation – Build marketing and sales lead lists from business directories or listings.
- Market research – Gather data to analyze consumer trends, demographics, online behavior etc.
- News monitoring – Track mentions of keywords, brands, people etc. across the web.
- Recruitment research – Search profiles on professional networks like LinkedIn.
Web scraping can be done manually, but it is extremely labor-intensive and time-consuming. Automated scraping using bots allows large volumes of data to be extracted faster and more efficiently. However, companies and individuals should be aware of the legal considerations before scraping data, especially from sources like LinkedIn.
Is web scraping legal?
The legality of web scraping depends on how the data is accessed and how it is used. Here are some key factors:
Public vs private data
If a website requires registration, login or authentication to access certain content or data, that content is generally treated as private. Scraping private user data by bypassing those access controls may violate anti-hacking laws like the Computer Fraud and Abuse Act (CFAA).
However, if the data is accessible without logging in, it is generally considered public: public data is whatever any unauthenticated internet user can view on a site. Scraping it is much less likely to breach anti-hacking laws, although the other factors below still apply.
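The "any unauthenticated user can view it" test can be approximated in code. The sketch below is only a heuristic, not a legal determination: it assumes the caller has already made a request without cookies or credentials and passes in the resulting status code and final URL. The looks_public function and the LOGIN_HINTS path fragments are hypothetical names introduced for this illustration.

```python
from urllib.parse import urlparse

# Assumed path fragments that suggest a redirect to a login wall (illustrative only).
LOGIN_HINTS = ("login", "signin", "authwall")


def looks_public(status_code, final_url):
    """Return True if an unauthenticated request appears to have reached the content.

    status_code: HTTP status of the final response.
    final_url:   URL of the final response after any redirects.
    """
    # 401 and 403 mean the server refused access without credentials.
    if status_code in (401, 403):
        return False
    # Being redirected to a login page also indicates gated content.
    path = urlparse(final_url).path.lower()
    if any(hint in path for hint in LOGIN_HINTS):
        return False
    return status_code == 200


print(looks_public(200, "https://example.com/directory/page1"))
print(looks_public(200, "https://example.com/login?next=/directory/page1"))
```

A page that fails this check sits behind some form of access control, which is where the private-data concerns described above begin.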
Website terms of use
Many websites prohibit scraping in their terms of service or terms of use. Site operators generally take the position that accessing or using the website constitutes agreement to those terms. If the terms specifically forbid data collection via automated scraping, bots or other means, then scraping the site can be treated as a breach of the terms.
Data protection laws
Depending on the type of data, laws like the GDPR may regulate how personal data can be processed and used. Even when public, details such as individuals’ names, email addresses and locations may fall under data protection rules.
Copyrighted content
Websites usually own the copyright in their content, design and databases. Scraping substantial amounts of copyrighted content from a site without permission could constitute copyright infringement.
So in summary, scraping public data that does not include private user info, personal data or copyrighted content, from sites without clear anti-scraping terms, is generally legal.
Is scraping LinkedIn legal?
The most important things that determine if LinkedIn scraping is legal are:
- Whether the profile data being scraped is public or restricted to viewing only by direct connections.
- Whether LinkedIn’s Terms of Service permit scraping.
- Whether the scraped profile data contains personally identifiable information.
- How the scraped data will be used – for commercial purposes or non-commercial research etc.
Let’s look at each of these factors:
Public vs private profile data
LinkedIn profiles have a visibility setting that allows users to specify whether their profile is visible to the general public or only to their direct connections.
Scraping only publicly visible profile data stands on firmer legal ground, while scraping private profile data that is restricted to connections would clearly be off limits.
LinkedIn’s terms of service
LinkedIn’s Terms of Service explicitly prohibit scraping. The list of prohibited conduct states:
“You agree that you will not under any circumstances:
(…)
– Copy, use, disclose or distribute any information obtained from the Services, whether directly or through third parties (such as search engines), without the consent of LinkedIn.”
So according to the terms, even public information on LinkedIn profiles may not be scraped without permission. Violating the terms could result in LinkedIn sending a cease-and-desist notice or blocking the scraper’s IP address.
Personally identifiable information
Basic profile information like name, job title, company and location may be technically public but still counts as personally identifiable information.
Scraping and republishing such data about people without their consent could violate privacy laws. Even de-identified or aggregated data could be problematic if it reveals personal data when combined with other information.
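To see how combining datasets can undo de-identification, consider the following Python sketch. All records, names and field choices here are invented for illustration, and the reidentify helper is hypothetical. It links a "de-identified" dataset back to named individuals whenever a combination of job title, company and city is unique in both datasets.

```python
from collections import defaultdict

# Hypothetical "de-identified" records: names removed, other attributes kept.
deidentified = [
    {"title": "CFO", "company": "Acme Corp", "city": "Austin", "job_seeking": True},
    {"title": "Engineer", "company": "Acme Corp", "city": "Austin", "job_seeking": False},
    {"title": "Engineer", "company": "Acme Corp", "city": "Austin", "job_seeking": True},
]

# Hypothetical public directory that still contains names.
public_directory = [
    {"name": "Pat Example", "title": "CFO", "company": "Acme Corp", "city": "Austin"},
    {"name": "Sam Sample", "title": "Engineer", "company": "Acme Corp", "city": "Austin"},
    {"name": "Lee Placeholder", "title": "Engineer", "company": "Acme Corp", "city": "Austin"},
]

# Attributes that are not names but can single out a person in combination.
QUASI_IDENTIFIERS = ("title", "company", "city")


def quasi_key(record):
    """Build the tuple of quasi-identifier values for a record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(anonymous_records, directory):
    """Return (name, record) pairs where the quasi-identifiers match uniquely."""
    people_by_key = defaultdict(list)
    for person in directory:
        people_by_key[quasi_key(person)].append(person["name"])

    records_by_key = defaultdict(list)
    for record in anonymous_records:
        records_by_key[quasi_key(record)].append(record)

    matches = []
    for key, records in records_by_key.items():
        names = people_by_key.get(key, [])
        # A link is unambiguous only if one person and one record share the key.
        if len(names) == 1 and len(records) == 1:
            matches.append((names[0], records[0]))
    return matches


for name, record in reidentify(deidentified, public_directory):
    print(name, "-> job_seeking:", record["job_seeking"])
```

In this toy data only the CFO is re-identified, because that combination of attributes is unique. The two engineers remain ambiguous, which is the intuition behind requiring that each combination of attributes be shared by several people before data is released.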
Commercial vs non-commercial use
If scraped LinkedIn data is used for commercial purposes like targeted marketing, lead generation or recruitment, it could lead to legal issues or accusations of unfair use, unless the individuals expressly allowed it.
However, scrapers may try to argue fair use or similar defenses when public profile data is used for non-commercial purposes like academic studies, research or journalism, although even these uses may still violate LinkedIn’s terms.
Legal cases involving LinkedIn scraping
There have been a few notable legal cases in the past that dealt with accusations of unauthorized scraping of user data from LinkedIn:
hiQ Labs vs. LinkedIn
In 2017, LinkedIn sent a cease-and-desist letter to data analytics startup hiQ Labs demanding that it stop scraping public LinkedIn profile data. LinkedIn claimed the scraping violated its terms of service and anti-hacking laws such as the CFAA.
hiQ Labs responded by suing LinkedIn and seeking an injunction, arguing that the scraped data was publicly accessible. hiQ used the data in its commercial analytics products, for example to help employers predict employee attrition.
In 2019, the 9th U.S. Circuit Court of Appeals upheld a preliminary injunction in hiQ’s favor, finding that scraping publicly viewable pages likely did not violate the CFAA. That was not the end of the dispute: in 2022, a district court found that hiQ had breached LinkedIn’s User Agreement, and the case ended in a settlement.
Facebook vs. BrandTotal
In 2020, Facebook filed a lawsuit against start-up BrandTotal, alleging that it violated Facebook’s Terms of Service by scraping personal data of Facebook users for marketing purposes.
BrandTotal claimed fair use but a judge granted Facebook’s request for a preliminary injunction to stop BrandTotal. The case was eventually settled in 2021, with BrandTotal agreeing to stop accessing Facebook services and to destroy all the Facebook user data it had collected.
FOX News vs. TVEyes
Media monitoring service TVEyes captured and indexed TV and radio clips to provide searchable transcripts and clips to customers. FOX News sued the company in 2013, alleging copyright infringement.
Though TVEyes claimed fair use, in 2018 the appeals court ruled that its redistribution of FOX News clips was not fair use, largely because the service made nearly all of FOX’s programming available to subscribers and undercut FOX’s ability to license that content itself.
Conclusion
LinkedIn’s terms expressly prohibit scraping, the hiQ v. LinkedIn litigation shows how vigorously LinkedIn enforces them, and privacy rules can apply even to public personal data. Taken together, these factors suggest that most scraping of user profiles on LinkedIn, even when restricted to publicly viewable data, is prohibited without explicit permission.
Scrapers could try to mount fair use defenses in some specific non-commercial research or free speech cases, but commercial usage would be difficult to justify legally. Hence, consult an attorney before embarking on any web scraping project targeting LinkedIn.
Overall, while a lot of data on LinkedIn profiles is technically public, the combination of Terms of Service violations, privacy laws, and precedent cases does not bode well for most scrapers. Proceed with caution.