Open the Octoparse application.
In the top-left menu, hover over the New button and click Advanced mode. For testing purposes, we are going to create a custom task.
- In the Website field, enter the URL of the website you would like to extract data from. For this test, we are going to use ip.smartproxy.com. Then hit the Save button.
- You should now be in the Task tab. To configure the proxies, select the Settings button.
- In the pop-up menu, scroll down to Anti-blocking settings and check the Use IP proxies option. The Settings button for this option should now be clickable; click it to open the proxy settings.
In the Proxy Settings pop-up, define the proxy you would like to use. Unfortunately, Octoparse only accepts proxies in the IP:PORT format. For that reason, you will need to use our Whitelisted IP feature, which lets you skip the usual username:password authentication when connecting through a proxy. You can whitelist your IP after logging in to our dashboard: https://dashboard.smartproxy.com/whitelisted-ips
Once you have your IP:PORT ready, set the Switch interval according to your session type: 1 for a rotating session, 600 for a sticky session. Lastly, hit the OK button.
- To verify that everything is set up correctly, check that a checkmark appears next to the Settings option under Anti-blocking settings. Then click the Save button to continue.
- To extract data from our example page, click the IP address shown at the top of the Octoparse application and select Extract text of the selected element.
- Once that is done, click Save and then Run.
- Depending on how you want to run your task, select one of the available extraction options. For testing purposes, you can select Run task on your device.
- If done correctly, after the task finishes running, you should see our proxy IP in the extracted data table.
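If you would like to check the proxy outside Octoparse first, the settings above can be mirrored in a short Python sketch. It is only an illustration under a few assumptions: the function names are made up for this example, `proxy.example.com:10000` in the usage note is a placeholder rather than a real endpoint, the `http://` scheme for the test page is assumed, and your current IP must already be whitelisted since no username:password is sent.

```python
import urllib.request

# Switch interval values taken from the steps above.
SWITCH_INTERVALS = {"rotating": 1, "sticky": 600}


def parse_proxy(entry: str) -> tuple:
    """Split an IP:PORT (or HOST:PORT) entry and validate the port."""
    host, sep, port_text = entry.strip().rpartition(":")
    if not sep or not host or not (port_text.isascii() and port_text.isdigit()):
        raise ValueError(f"expected IP:PORT, got {entry!r}")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return host, port


def switch_interval(session_type: str) -> int:
    """Return the Switch interval suggested above for a session type."""
    return SWITCH_INTERVALS[session_type.lower()]


def fetch_via_proxy(entry: str,
                    url: str = "http://ip.smartproxy.com",
                    timeout: float = 15.0) -> str:
    """Fetch `url` through the proxy and return the response body as text.

    No credentials are sent, so this only works from a whitelisted IP.
    """
    host, port = parse_proxy(entry)
    proxy_url = f"http://{host}:{port}"
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace").strip()
```

Calling `fetch_via_proxy("proxy.example.com:10000")` with your own IP:PORT in place of the placeholder should return the test page's response, which you can compare against the IP shown in the Octoparse data table. `switch_interval("rotating")` and `switch_interval("sticky")` simply return the two values to enter in the Switch interval field.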