Python for Marketers: Making a Selenium web scraper click on links
- What this is for: Telling a Selenium web scraper to click on a link or enter basic information into a form
- Requirements: Python Anaconda distribution, basic knowledge of HTML structure and the Chrome Inspector tool
- Concepts covered: Selenium
Selenium is an incredibly useful tool for scraping websites with Python, but occasionally your scraper may need to interact with a page before you can access the data you need.
For example, new visitors to a website may have to dismiss a modal popup before the page renders, or you may need to enter a zip code to make a query relevant before scraping data.
With Selenium, there are a few simple steps you can add to your script to make the scraper interact with the web page.
Clicking a link
Suppose we need Selenium to click on a link with this markup:
<a id="consent">I agree</a>
First, we’ll import our libraries and launch the webdriver (this example uses Chrome):
# import libraries
from selenium import webdriver
# open webdriver
driver = webdriver.Chrome()
# open URL
driver.get('https://www.example.com/example')
Next, we’ll find our link element and use the click() method to simulate a user click.
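# import the locator helper (find_element_by_id was removed in recent Selenium 4 releases)
from selenium.webdriver.common.by import By
# find link and click
link = driver.find_element(By.ID, 'consent')
link.click()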
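A popup like this often appears a moment after the page loads, and returning visitors may never see it at all. As a rough sketch of one way to handle both cases, the hypothetical helper below (`click_when_present` is our own name, not part of Selenium) polls for the element and clicks it once it shows up. It assumes the driver is passed in as a parameter and uses the plain locator string "id", which is the value behind Selenium's `By.ID`. Selenium also ships its own `WebDriverWait` helper for waiting; the loop here is a hand-rolled stand-in to keep the idea visible.

```python
import time

# Plain locator strategy string; this is the same value as selenium's By.ID.
BY_ID = "id"


def click_when_present(driver, element_id, timeout=10.0, poll=0.5):
    """Poll for an element by id and click it once it appears.

    Returns True if the element was clicked, or False if it never
    appeared before the timeout (in seconds) ran out.
    """
    deadline = time.monotonic() + timeout
    while True:
        # find_elements (plural) returns an empty list when nothing matches,
        # so no exception handling is needed for a missing element.
        matches = driver.find_elements(BY_ID, element_id)
        if matches:
            matches[0].click()
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

With the driver from the example above, `click_when_present(driver, 'consent')` would click the "I agree" link if it appears within ten seconds, and otherwise return False so the script can carry on.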
Filling in form data
Sometimes, a page may require user input before clicking. For example, you may need to enter a zip code into a form to display results.
In this instance, the page may have an input element like this:
<input type="text" id="zip_code" class="form_text">
The setup is very similar, but we’ll also import Keys from Selenium, which represents special keys such as Enter:
# import libraries
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
# open webdriver
driver = webdriver.Chrome()
# open URL
driver.get('https://www.example.com/example')
Next, we’ll create a variable that stores the zip code, type it into the form field, and submit the form.
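# import the locator helper (find_element_by_id was removed in recent Selenium 4 releases)
from selenium.webdriver.common.by import By
# define input string (named zip_code so it does not shadow Python's built-in zip)
zip_code = '90210'
# input zip code into form
form_field = driver.find_element(By.ID, 'zip_code')
form_field.send_keys(zip_code)
# submit the form (form_field.send_keys(Keys.RETURN) would simulate pressing Enter instead)
form_field.submit()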
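If a script fills in more than one field, the same three steps can be wrapped in a small function. The sketch below is one possible shape, not a Selenium feature: `fill_and_submit` is a hypothetical name, the driver is passed in as a parameter, and the locator string "id" is the value behind Selenium's `By.ID`. It also clears the field first, on the assumption that the page might pre-fill it, and it assumes the field sits inside a form so that `submit()` has something to submit.

```python
# Plain locator strategy string; this is the same value as selenium's By.ID.
BY_ID = "id"


def fill_and_submit(driver, field_id, text):
    """Type text into the input with the given id, then submit its form.

    Returns the field element so the caller can inspect it afterwards.
    """
    field = driver.find_element(BY_ID, field_id)
    # Remove any pre-filled value so the new text is not appended to it.
    field.clear()
    field.send_keys(text)
    # Submits the form that contains this field.
    field.submit()
    return field
```

With the driver from the example above, the zip code step becomes a single call: `fill_and_submit(driver, 'zip_code', '90210')`.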