Scraping Linkedin Data with Proxycurl, Python Program, and Nodejs
Today, I want to show you how to scrape data from LinkedIn using the Proxycurl API, Python, and Node.js.
Let's start with Python and the requests library.
I am going to use the Employee Count Endpoint of the Proxycurl Company API.
Install the requests package:
!pip install requests
Next, create an account with Proxycurl and generate your API key.
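Rather than pasting the key straight into a script, one option is to read it from an environment variable. This is only a minimal sketch: the variable name `PROXYCURL_API_KEY` and the helpers `load_api_key` and `auth_header` are assumptions made for illustration, not something Proxycurl prescribes.

```python
import os


def load_api_key(var_name="PROXYCURL_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is a hypothetical choice for this sketch.
    """
    api_key = os.environ.get(var_name, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return api_key


def auth_header(api_key):
    """Build a Bearer-token Authorization header from the key."""
    return {"Authorization": "Bearer " + api_key}
```

Keeping the key out of the source file also makes it harder to publish it by accident when sharing code.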
Let's count the number of employees working at Apple Inc.
Using the requests library:
import requests

api_endpoint = 'https://nubela.co/proxycurl/api/linkedin/company/employees/count/'
api_key = 'YOUR_API_KEY_HERE'
header_dic = {'Authorization': 'Bearer ' + api_key}
params = {
    'linkedin_employee_count': 'include',
    'employment_status': 'current',
    'url': 'https://www.linkedin.com/company/apple/',
}
response = requests.get(api_endpoint, params=params, headers=header_dic)
print(response.json())
The response is:
{
'total_employee': 94262,
'linkedin_employee_count': 567686,
'linkdb_employee_count': 94262
}
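The two counts in the response above differ a lot, so it can help to compare them. The sketch below assumes, as the field names suggest, that `linkedin_employee_count` is the figure shown on the company's LinkedIn page and `linkdb_employee_count` is the number of profiles Proxycurl has in its own dataset. The helper name `linkdb_coverage` is hypothetical.

```python
def linkdb_coverage(payload):
    """Return linkdb_employee_count / linkedin_employee_count, or None.

    None is returned when either field is missing or the LinkedIn
    count is zero, so the caller never divides by zero.
    """
    linkedin_count = payload.get("linkedin_employee_count")
    linkdb_count = payload.get("linkdb_employee_count")
    if not linkedin_count or linkdb_count is None:
        return None
    return linkdb_count / linkedin_count


# The Apple response shown above, used as sample input.
apple = {
    "total_employee": 94262,
    "linkedin_employee_count": 567686,
    "linkdb_employee_count": 94262,
}
coverage = linkdb_coverage(apple)
print(f"{coverage:.1%}")  # 16.6%
```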
Let's try counting the number of employees working at Twitter:
import requests

api_endpoint = 'https://nubela.co/proxycurl/api/linkedin/company/employees/count/'
api_key = 'YOUR_API_KEY_HERE'
header_dic = {'Authorization': 'Bearer ' + api_key}
params = {
    'linkedin_employee_count': 'include',
    'employment_status': 'current',
    'url': 'https://www.linkedin.com/company/twitter/',
}
response = requests.get(api_endpoint, params=params, headers=header_dic)
print(response.json())
The output is:
{'total_employee': 7472,
'linkedin_employee_count': 7992,
'linkdb_employee_count': 7472
}
You can try this with as many companies as you like.
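Checking several companies can be wrapped in a small loop. The sketch below reuses the endpoint and parameters from the two examples; the function names and the `http_get` argument are assumptions for illustration. `http_get` defaults to `requests.get` and exists only so the helper can be exercised without network access.

```python
API_ENDPOINT = "https://nubela.co/proxycurl/api/linkedin/company/employees/count/"


def build_params(company_url):
    """Query parameters for one company, as used in the examples above."""
    return {
        "linkedin_employee_count": "include",
        "employment_status": "current",
        "url": company_url,
    }


def fetch_counts(company_urls, api_key, http_get=None):
    """Query the endpoint once per company URL and collect the JSON results."""
    if http_get is None:
        import requests  # imported lazily so the helper can be tested offline
        http_get = requests.get
    headers = {"Authorization": "Bearer " + api_key}
    results = {}
    for company_url in company_urls:
        response = http_get(
            API_ENDPOINT, params=build_params(company_url), headers=headers
        )
        response.raise_for_status()  # stop on 4xx/5xx instead of parsing an error body
        results[company_url] = response.json()
    return results
```

A call such as `fetch_counts(['https://www.linkedin.com/company/apple/', 'https://www.linkedin.com/company/twitter/'], api_key)` would then return one dictionary per company URL. Each company is a separate API request.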
Next, let's try scraping data from LinkedIn using Proxycurl and Node.js.
- Create a project folder and change into it
cd c:\\User\user\Folder name
- Install the packages
npm install express axios dotenv
or with Yarn:
yarn add express axios dotenv
- Generate an API key from Proxycurl and save it in a .env file in the project folder
API_KEY = 'YOUR_API_KEY_HERE'
- Code snippet
import express from 'express';
import axios from 'axios';
import dotenv from 'dotenv';

const app = express();
dotenv.config();

app.listen(8000, () => {
  console.log('App connected successfully!');
});

// Getting Company's job listing
const TWITTER_URL = 'https://www.linkedin.com/company/twitter/';
const COMPANY_PROFILE_ENDPOINT = 'https://nubela.co/proxycurl/api/linkedin/company';
const JOBS_LISTING_ENDPOINT = 'https://nubela.co/proxycurl/api/v2/linkedin/company/job';
const JOB_PROFILE_ENDPOINT = 'https://nubela.co/proxycurl/api/linkedin/job';

const companyProfileConfig = {
  url: COMPANY_PROFILE_ENDPOINT,
  method: 'get',
  headers: { 'Authorization': 'Bearer ' + process.env.API_KEY },
  params: {
    url: TWITTER_URL
  }
};

const getTwitterProfile = async () => {
  return await axios(companyProfileConfig);
};

const profile = await getTwitterProfile();
const twitterID = profile.data.search_id;
console.log('Twitter ID:', twitterID);

const jobListingsConfig = {
  url: JOBS_LISTING_ENDPOINT,
  method: 'get',
  headers: { 'Authorization': 'Bearer ' + process.env.API_KEY },
  params: {
    search_id: twitterID
  }
};

const getTwitterListings = async () => {
  return await axios(jobListingsConfig);
};

const jobListings = await getTwitterListings();
const jobs = jobListings.data.job;
console.log(jobs);

// Specific Job listing code snippet
const jobProfileConfig = {
  url: JOB_PROFILE_ENDPOINT,
  method: 'get',
  headers: { 'Authorization': 'Bearer ' + process.env.API_KEY },
  params: {
    url: jobs[0].job_url
  }
};

const getJobDetails = async () => {
  return await axios(jobProfileConfig);
};

const jobDetails = await getJobDetails();
console.log(jobDetails.data);
What the package.json should look like:
{
  "name": "nubela",
  "version": "1.0.0",
  "type": "module",
  "description": "",
  "main": "proxycurl.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "axios": "^1.1.3",
    "dotenv": "^16.0.3",
    "express": "^4.18.2"
  }
}
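For readers who would rather stay in Python, the three-step flow from the Node.js snippet (company profile, then job listings, then one job's details) can be sketched with requests. The endpoints and the field names `search_id`, `job` and `job_url` are taken from the snippet above; the function name `first_job_details` and the `http_get` argument are assumptions added for illustration, the latter only to allow testing without network access.

```python
COMPANY_PROFILE_ENDPOINT = "https://nubela.co/proxycurl/api/linkedin/company"
JOBS_LISTING_ENDPOINT = "https://nubela.co/proxycurl/api/v2/linkedin/company/job"
JOB_PROFILE_ENDPOINT = "https://nubela.co/proxycurl/api/linkedin/job"


def first_job_details(company_url, api_key, http_get=None):
    """Return details of a company's first listed job, or None if there are no jobs."""
    if http_get is None:
        import requests  # imported lazily so the helper can be tested offline
        http_get = requests.get
    headers = {"Authorization": "Bearer " + api_key}

    def get_json(endpoint, params):
        response = http_get(endpoint, params=params, headers=headers)
        response.raise_for_status()
        return response.json()

    # Step 1: the company profile carries the search_id used for job search.
    profile = get_json(COMPANY_PROFILE_ENDPOINT, {"url": company_url})

    # Step 2: list the jobs posted under that search_id.
    listings = get_json(JOBS_LISTING_ENDPOINT, {"search_id": profile["search_id"]})
    jobs = listings.get("job") or []
    if not jobs:
        return None

    # Step 3: fetch the full profile of the first job in the list.
    return get_json(JOB_PROFILE_ENDPOINT, {"url": jobs[0]["job_url"]})
```

Unlike the Node.js version, this sketch returns None when the listing is empty instead of failing on the first index.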
You can try scraping any data of your choice from LinkedIn using the Proxycurl API.
Original Link: https://dev.to/anuoluwapods/scraping-linkedin-data-with-proxycurl-python-program-nodejs-and-4a9e