Want Tencent's Job Listings at Your Fingertips? Scrape Them with Python


2023-01-21 01:26:52

Getting Tencent's job data the easy way, with a Python scraper

1. Analyzing the Target URL

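In a competitive job market, timely and accurate information about open positions matters. Tencent is one of China's largest internet companies, and its job listings attract a lot of attention. This article walks through scraping those listings with Python, step by step.

Start with the URL structure of the Tencent careers site. Type a keyword into the search box (for example "Python工程师", meaning "Python engineer") and you will see that every results URL begins with "https://careers.tencent.com/search", followed by a query string whose parameters carry the search criteria.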
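To make the URL structure concrete, here is a minimal sketch that builds and takes apart such a search URL with the standard library. It assumes the keyword travels in a parameter named `k`, as in the example URL used later in this article; the helper names `build_search_url` and `read_search_params` are made up for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = 'https://careers.tencent.com/search'

def build_search_url(keyword):
    """Return the search URL for a keyword, percent-encoding non-ASCII text."""
    # Assumption: the keyword parameter is called 'k', as in this article's example
    return f"{BASE_URL}?{urlencode({'k': keyword})}"

def read_search_params(url):
    """Return the query parameters of a URL as a dict mapping names to value lists."""
    return parse_qs(urlsplit(url).query)

url = build_search_url('Python工程师')
print(url)                      # the keyword appears percent-encoded
print(read_search_params(url))  # decoded back to the original text
```

Encoding the keyword explicitly makes it obvious what is actually sent over the wire, which helps when comparing your request with the one the browser makes.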

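2. Implementation

With the target URL in hand, we can write the scraper. The requests library fetches the page and BeautifulSoup parses it.

First, import the required libraries: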

import requests
from bs4 import BeautifulSoup

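Next, define a function that fetches the HTML of the search results page:

def get_html(url):
    # A timeout keeps the script from hanging on an unresponsive server
    response = requests.get(url, timeout=10)
    # Raise an exception on 4xx/5xx responses instead of parsing an error page
    response.raise_for_status()
    return response.text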
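Network requests fail now and then, so a fetch function is often wrapped in a retry loop. The following is only a sketch under that assumption: `fetch_with_retry` is a hypothetical helper, not part of requests, and it accepts any callable that takes a URL and returns text, such as `get_html`.

```python
import time

def fetch_with_retry(fetch, url, attempts=3, delay=1.0):
    """Call fetch(url), retrying after a failure, at most `attempts` times in total."""
    if attempts < 1:
        raise ValueError('attempts must be at least 1')
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except Exception as error:  # with requests, narrow this to requests.RequestException
            last_error = error
            if attempt < attempts:
                # Wait a little longer after each failed attempt
                time.sleep(delay * attempt)
    # Every attempt failed: re-raise the most recent error
    raise last_error

# Usage (not executed here because it needs network access):
# html = fetch_with_retry(get_html, 'https://careers.tencent.com/search?k=Python工程师')
```

Passing the fetch function in as an argument keeps the retry logic independent of the HTTP library and easy to try out with a fake fetcher.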

Then define a function that parses the HTML and extracts the fields of each job. The class names below are the ones this article assumes; check them against the live page in your browser's developer tools. If the listings are rendered by JavaScript, they will not be present in the HTML that requests receives.

def parse_html(html):
    """Return one row of field values per job card found in the page."""
    soup = BeautifulSoup(html, 'html.parser')
    # (tag, CSS class) of each field, in output column order
    fields = [
        ('h3', 'job-primary__title'),
        ('span', 'job-primary__company-name'),
        ('span', 'job-primary__location'),
        ('span', 'job-primary__salary'),
        ('span', 'job-primary__experience'),
        ('span', 'job-primary__education'),
        ('ul', 'job-primary__benefits'),
        ('div', 'job-primary__description'),
    ]
    rows = []
    for job in soup.find_all('div', class_='job-primary'):
        row = []
        for tag, cls in fields:
            node = job.find(tag, class_=cls)
            # A missing element becomes an empty string instead of raising AttributeError
            row.append(node.get_text(strip=True) if node else '')
        rows.append(row)
    return rows

Finally, fetch the page, parse it, and print each job:

url = 'https://careers.tencent.com/search?k=Python工程师'
html = get_html(url)
jobs = parse_html(html)
for job in jobs:
    print(*job)

Run the script and the console shows the details of each Python engineer position found on the fetched results page.

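3. Saving the Data

The scraped data can be saved to a local file for later use. The csv module from the standard library writes it out in CSV format: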

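import csv

# jobs must be a list of rows (one list of field values per job), in header order
# utf-8 keeps Chinese text intact regardless of the platform's default encoding
with open('tencent_jobs.csv', 'w', newline='', encoding='utf-8') as csvfile:
    csv_writer = csv.writer(csvfile)
    csv_writer.writerow(['Title', 'Company', 'Location', 'Salary', 'Experience', 'Education', 'Benefits', 'Description'])
    csv_writer.writerows(jobs)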
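Positional rows are easy to misalign with the header when a field is missing. One possible alternative, sketched here, is to store each job as a dict and let csv.DictWriter match values to columns by name. The `write_jobs` helper, the shortened column list, and the two sample rows are made up for illustration, and the output goes to an in-memory buffer so the result can be inspected without touching the disk.

```python
import csv
import io

# Hypothetical subset of the columns, kept short for the example
FIELDS = ['Title', 'Location', 'Description']

def write_jobs(rows, out):
    """Write job dicts to an open text stream; missing keys become empty cells."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, restval='', extrasaction='ignore')
    writer.writeheader()
    writer.writerows(rows)

# Made-up sample rows, not real listings
sample = [
    {'Title': 'Python Engineer', 'Location': 'Shenzhen', 'Description': 'Builds, tests, ships'},
    {'Title': 'Data Engineer', 'Location': 'Beijing'},  # no description available
]

buffer = io.StringIO(newline='')
write_jobs(sample, buffer)
print(buffer.getvalue())
```

To write a real file, pass a handle opened with `open('tencent_jobs.csv', 'w', newline='', encoding='utf-8')` in place of the buffer. Note that the description containing commas is quoted automatically by the csv module.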