In this article, we explain how to rotate proxies for web scraping. Rotating proxies spreads your requests across several IP addresses, which makes blocks and rate limits less likely and helps you reach your desired targets without issues.
To get started, you can create a virtual environment by running the following command:
This command creates a venv folder that contains its own Python interpreter and pip, so any libraries you install stay isolated from the rest of your system.
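The usual terminal command for this step is `python3 -m venv venv`. As a rough alternative, the same environment can be created from Python with the standard-library venv module. The helper name create_environment and the default folder name are assumptions made for this sketch.

```python
import venv
from pathlib import Path


def create_environment(target="venv", with_pip=True):
    """Create a virtual environment in `target` and return its absolute path."""
    env_dir = Path(target).resolve()
    # with_pip=True asks the venv module to bootstrap pip into the new environment.
    venv.create(env_dir, with_pip=with_pip)
    return env_dir
```

Calling create_environment() from the project folder should leave you with the same venv folder the command produces.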
Use the source command to activate your environment:
Install the requests module in the virtual environment you have just activated:
Congratulations! You have finished all the steps for the installation of the requests module!
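If you want to double-check the installation, a small sketch like the one below reports which version of requests the active environment can see. It relies only on the standard library; the function name installed_version is an assumption for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("requests")
print(f"requests {version}" if version else "requests is not installed")
```

If the script reports that requests is missing, the environment is most likely not activated.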
Now, let’s start with the basics. In some cases you might need to connect through one single IP address or proxy. How do we use a single proxy? These are the essential things that you will need:
Here is an example of how the proxy request should look in this case:
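A minimal sketch of a single-proxy request, assuming the essentials are the proxy's scheme, address, and port, plus optional credentials. The proxy address (taken from an IP range reserved for documentation), the test URL https://httpbin.org/ip, and the helper names are placeholders chosen for this example, not values from the article.

```python
from urllib.parse import quote

# Assumed test endpoint that returns the caller's IP address as JSON.
TARGET_URL = "https://httpbin.org/ip"


def build_proxy_url(scheme, host, port, username=None, password=None):
    """Assemble a proxy URL such as http://user:pass@203.0.113.10:8080."""
    credentials = ""
    if username and password:
        # Percent-encode credentials so characters like "@" or ":" stay unambiguous.
        credentials = f"{quote(username, safe='')}:{quote(password, safe='')}@"
    return f"{scheme}://{credentials}{host}:{port}"


def fetch_through_proxy(url, proxy_url, timeout=10):
    """Send one GET request through `proxy_url` and return the response text."""
    import requests  # third-party; imported here so build_proxy_url works without it

    # The same proxy handles both plain HTTP and HTTPS traffic.
    proxies = {"http": proxy_url, "https": proxy_url}
    response = requests.get(url, proxies=proxies, timeout=timeout)
    response.raise_for_status()
    return response.text


# Placeholder proxy: replace it with one you actually have access to.
PROXY = build_proxy_url("http", "203.0.113.10", 8080)
```

Calling fetch_through_proxy(TARGET_URL, PROXY) only succeeds once PROXY points at a live proxy server.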
You can also set a different proxy for each protocol, as well as specify domains where you would like to use a separate proxy.
Replace PROXY1, PROXY2, PROXY3 with your proxy format as shown in the example below:
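One way this configuration might look, assuming proxies written as scheme://host:port with optional user:password@ credentials. The addresses and the example.org domain are placeholders. The scheme://hostname key form is how requests selects a proxy for one specific host.

```python
# Placeholder proxies; 203.0.113.0/24 is reserved for documentation,
# so swap in the addresses supplied by your proxy provider.
PROXY1 = "http://203.0.113.10:8080"
PROXY2 = "http://203.0.113.11:8080"
PROXY3 = "http://user:password@203.0.113.12:8080"  # with credentials

scheme_proxy_map = {
    "http": PROXY1,   # used for plain HTTP requests
    "https": PROXY2,  # used for HTTPS requests
    # A scheme://hostname key applies only to that host (hypothetical domain).
    "https://example.org": PROXY3,
}
```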
Make a request using requests.get while providing the variables we created previously:
Your full script should look like this:
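The sketch below puts the pieces together under stated assumptions: the proxy addresses are placeholders, https://httpbin.org/ip stands in for whatever IP-echo service you prefer, and the names TARGET_URL, TIMEOUT_IN_SECONDS, and fetch_visible_ip are introduced here for illustration only.

```python
# Placeholder proxies from a documentation-only IP range; replace with your own.
PROXY1 = "http://203.0.113.10:8080"
PROXY2 = "http://203.0.113.11:8080"
PROXY3 = "http://203.0.113.12:8080"

scheme_proxy_map = {
    "http": PROXY1,
    "https": PROXY2,
    "https://example.org": PROXY3,  # hypothetical per-domain override
}

TARGET_URL = "https://httpbin.org/ip"  # assumed endpoint that echoes your IP
TIMEOUT_IN_SECONDS = 10                # assumed timeout; tune it to your proxies


def fetch_visible_ip(url=TARGET_URL, proxies=None, timeout=TIMEOUT_IN_SECONDS):
    """Return the response body, which for TARGET_URL reports the caller's IP."""
    import requests  # third-party; imported when the request is actually made

    response = requests.get(
        url,
        proxies=proxies or scheme_proxy_map,
        timeout=timeout,
    )
    response.raise_for_status()  # turn 4xx/5xx answers into exceptions
    return response.text


def main():
    print(fetch_visible_ip())
```

Run main() once the placeholders point at real proxies. The request goes through PROXY2 here, because the target URL uses HTTPS and is not example.org.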
The result of this script will provide you with the IP address of your proxy:
Your Python script now sends its requests through a proxy instead of your own IP address.
Let's learn how to rotate through a list of proxies instead of just using one.
You will work with a list of proxy servers saved as a CSV file called proxies.csv, in which you need to list proxy servers as shown below:
If you want to add more proxies to the file, add each of them on a separate line.
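As an illustration of that layout, the sketch below writes a small proxies.csv with one proxy per line. It assumes each row holds a single proxy URL in its first column; the addresses are placeholders and write_proxy_csv is a hypothetical helper.

```python
import csv

# Placeholder entries; replace them with your own proxy servers.
SAMPLE_PROXIES = [
    "http://203.0.113.10:8080",
    "http://203.0.113.11:8080",
    "http://203.0.113.12:8080",
]


def write_proxy_csv(filename, proxies):
    """Write one proxy URL per row, in a single column."""
    # newline="" lets the csv module control line endings itself.
    with open(filename, "w", newline="") as csv_file:
        writer = csv.writer(csv_file)
        for proxy in proxies:
            writer.writerow([proxy])
```

Calling write_proxy_csv("proxies.csv", SAMPLE_PROXIES) produces a three-line file; you can just as well create it by hand in a text editor.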
After that, create a Python file and specify the CSV file name and the timeout, in seconds, to allow for each proxy response.
Using the code provided, open the CSV file, read each line of proxy servers into the csv_row variable, and build the scheme_proxy_map configuration.
This is an example of how it should look:
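A possible version of this step, assuming one proxy URL per row in the first column of the file. The names csv_row and scheme_proxy_map follow the text; CSV_FILENAME, TIMEOUT_IN_SECONDS, its value of 10 seconds, and the generator iter_scheme_proxy_maps are assumptions made for this sketch.

```python
import csv

CSV_FILENAME = "proxies.csv"
TIMEOUT_IN_SECONDS = 10  # assumed value, passed to each request later on


def iter_scheme_proxy_maps(filename=CSV_FILENAME):
    """Yield one proxies mapping per non-empty row of the CSV file."""
    with open(filename, newline="") as csv_file:
        for csv_row in csv.reader(csv_file):
            if not csv_row or not csv_row[0].strip():
                continue  # skip blank lines
            proxy = csv_row[0].strip()
            # Use the same proxy for HTTP and HTTPS traffic.
            scheme_proxy_map = {"http": proxy, "https": proxy}
            yield scheme_proxy_map
```

Each mapping produced here can be passed directly as the proxies argument of requests.get.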
To check that every proxy works, we'll reuse the same scraping code as before and access the site through each proxy in the list, skipping any proxy that fails or times out.
If you want to scrape content using any working proxy from the list, just add a break after print to stop going through the proxies in the CSV file:
Your full code should look like this:
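Here is a hedged sketch of the complete rotation loop. It assumes proxies.csv holds one proxy URL per row, uses https://httpbin.org/ip as a stand-in target, and treats any requests exception as a failed proxy. The function rotate_proxies and its stop_at_first flag are illustrative names, not part of the original code.

```python
import csv

CSV_FILENAME = "proxies.csv"
TIMEOUT_IN_SECONDS = 10                # assumed timeout per proxy
TARGET_URL = "https://httpbin.org/ip"  # assumed endpoint that echoes your IP


def rotate_proxies(filename=CSV_FILENAME, url=TARGET_URL, stop_at_first=True):
    """Try each proxy in the CSV file and return the (proxy, body) pairs that worked."""
    import requests  # third-party; imported when the loop actually runs

    results = []
    with open(filename, newline="") as csv_file:
        for csv_row in csv.reader(csv_file):
            if not csv_row or not csv_row[0].strip():
                continue  # skip blank lines
            proxy = csv_row[0].strip()
            scheme_proxy_map = {"http": proxy, "https": proxy}
            try:
                response = requests.get(
                    url,
                    proxies=scheme_proxy_map,
                    timeout=TIMEOUT_IN_SECONDS,
                )
                response.raise_for_status()
            except requests.exceptions.RequestException:
                continue  # dead, slow, or blocked proxy: move on to the next one
            print(response.text)
            results.append((proxy, response.text))
            if stop_at_first:
                break  # one working proxy is enough
    return results
```

With stop_at_first=True the loop stops at the first working proxy, which corresponds to adding the break after print. Passing stop_at_first=False tries every proxy in the file.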
That's it! Congratulations, you have successfully learned how to rotate proxies using Python.