Proxy-like integration
This integration looks and behaves like a standard proxy, with added functionality on top.
– 100% success rate plus accurate data.
– Data sources: direct (full URL provided by the client).
– Requires an open connection to send the acquired results.
– Single query – no batches.
– Supports any SERP keyword.
– Parsing: raw HTML and, in some cases, structured JSON.
Endpoint: scrape.smartproxy.com:60000
Integration examples:
curl -k -x scrape.smartproxy.com:60000 -U username:password -H "X-Smartproxy-Device-Type: desktop_firefox" -H "X-Smartproxy-Geo: California,United States" "https://www.google.com/search?q=world"
<?php
$ch = curl_init();
$username = 'username';
$password = 'password';
$options = [
    CURLOPT_URL => 'https://www.google.com/search?q=world',
    CURLOPT_PROXY => 'scrape.smartproxy.com:60000',
    CURLOPT_PROXYUSERPWD => sprintf('%s:%s', $username, $password),
    CURLOPT_RETURNTRANSFER => 1,
    CURLOPT_SSL_VERIFYPEER => false,
    CURLOPT_SSL_VERIFYHOST => false,
];
curl_setopt_array($ch, $options);
$result = curl_exec($ch);
if (curl_errno($ch)) {
    echo sprintf('Error %s', curl_error($ch));
} else {
    echo $result;
}
curl_close($ch);
?>
import requests
username = 'username'
password = 'password'
proxy = 'http://{}:{}@scrape.smartproxy.com:60000'.format(
    username, password)
headers = {
    'X-Smartproxy-Device-Type': 'desktop_chrome',
    'X-Smartproxy-Geo': 'New York,New York,United States',
    'X-Smartproxy-Parse': '1',
}
response = requests.request(
    'GET',
    'https://www.google.com/search?q=world',
    proxies={'http': proxy, 'https': proxy},
    headers=headers,
    verify=False
)
print(response.text)
Direct data source only
A direct source means that you provide the full target URL yourself.
If we are able to parse your target, structured JSON output is available.
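Because the integration expects a full URL, a keyword has to be encoded into the query string before the request is sent. Here is a minimal Python sketch, assuming the Google search URL pattern used in the integration examples. The helper name build_search_url is hypothetical and not part of the service.

```python
from urllib.parse import urlencode

# Same target as in the integration examples; other targets are not covered here.
GOOGLE_SEARCH = "https://www.google.com/search"


def build_search_url(keyword: str) -> str:
    """Return a full Google search URL for the keyword (hypothetical helper)."""
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return f"{GOOGLE_SEARCH}?{urlencode({'q': keyword})}"


print(build_search_url("world"))
```

The resulting string is what you pass as the request URL, exactly as in the cURL, PHP, and Python examples.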
How to use it
- Give us the full target Google URL and, if needed, add parameters in the headers. Authorization is straightforward: user:pass.
Parameters
Send the parameters that this integration supports as request headers. Why so? Headers are part of the HTTP communication itself, so with a proxy-style connection they are the only place to pass additional parameters, including ones you might otherwise expect to send as JSON. You can find more information about this next to the parameters.
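As a rough illustration, the optional parameters can be collected into a header dictionary before the request is made. The three header names below are the ones that appear in the integration examples. The helper build_headers and its argument names are hypothetical, and only the values shown in the examples are known to be accepted.

```python
from typing import Dict, Optional


def build_headers(device_type: Optional[str] = None,
                  geo: Optional[str] = None,
                  parse: bool = False) -> Dict[str, str]:
    """Collect optional parameters into HTTP headers (hypothetical helper)."""
    headers: Dict[str, str] = {}
    if device_type:
        headers["X-Smartproxy-Device-Type"] = device_type
    if geo:
        headers["X-Smartproxy-Geo"] = geo
    if parse:
        # '1' is the value used in the Python example to request parsed output.
        headers["X-Smartproxy-Parse"] = "1"
    return headers


print(build_headers("desktop_chrome", "California,United States", parse=True))
```

Parameters that are left out are simply not sent, so the request then behaves like a plain proxied request.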
- Ignore the certificates.
Certificates
Our certificate lets the proxy see the full target URL.
Certificate issues – your client receives our certificate instead of Google's, so it has to ignore the mismatch when making requests. In cURL, this is the -k flag.
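For clients built on the Python standard library, the counterpart of cURL's -k is an SSL context with verification switched off. This is a minimal sketch and the function name make_unverified_context is hypothetical. In the requests example, verify=False plays the same role.

```python
import ssl


def make_unverified_context() -> ssl.SSLContext:
    """Build an SSL context that skips certificate checks (hypothetical helper)."""
    ctx = ssl.create_default_context()
    # check_hostname must be disabled first; otherwise setting CERT_NONE raises ValueError.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


ctx = make_unverified_context()
print(ctx.verify_mode == ssl.CERT_NONE)
```

A context like this can be passed to urllib.request.HTTPSHandler(context=ctx). It is best used only for requests sent through this endpoint, so that verification stays on for the rest of the application.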
- These proxies scrape everything just like the standard ones, except that Scraping API has a 100% success rate.
- We need an open connection in order to return the requested data. The data should come back with the HTTP status code 200, and it should be parsed in JSON format or contain raw HTML.
Keep an open connection
If the connection is closed before the job is completed, the data is lost.
The timeout limit for open connections is 150 seconds. In rare cases of heavy load, we may not be able to deliver the data to you.
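In practice, this means the client should wait at least as long as the 150-second limit and should be ready for either a JSON or an HTML body. The Python sketch below shows one way to handle both. The 10-second connect timeout, the 10-second safety margin, and the helper names are assumptions, not values from the service.

```python
import json
from typing import Any, Tuple

SERVER_TIMEOUT_S = 150                   # limit for open connections stated above
READ_TIMEOUT_S = SERVER_TIMEOUT_S + 10   # assumed margin so the client does not hang up first
CONNECT_TIMEOUT_S = 10                   # assumed value


def client_timeout() -> Tuple[int, int]:
    """(connect, read) pair in the form accepted by the requests 'timeout' argument."""
    return (CONNECT_TIMEOUT_S, READ_TIMEOUT_S)


def decode_result(status_code: int, body: str) -> Any:
    """Return parsed JSON when the body is JSON, otherwise the raw HTML string."""
    if status_code != 200:
        raise RuntimeError(f"unexpected status code {status_code}")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body  # not JSON, so treat it as raw HTML


print(client_timeout())
```

With requests, the pair would be passed as timeout=client_timeout(), and the response handled with decode_result(response.status_code, response.text).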
Need any help with your setup? Drop us a line via chat.