Ok, so first off, apologies to everyone trying to get ahold of me. I was quite sick for a few days, then lost power and internet for several more almost immediately after. I’m also heading to the northern Michigan boonies this weekend (Tawas/Oscoda area), so I may not be around. Starting next week I will be, barring some other random catastrophe. Now, to get back on topic.
I’ve yet to meet an active blackhat who had no use for proxies (unless they control a massive number of IPs). And since I’ve not had time to check rankings and whatnot on my test sites, I otherwise wouldn’t have much for you guys. So today? We learn the ins and outs of scanning for proxies. Truly a lost art. Use these legally, because many keep logs or are honeypots. And before people ask, these will NOT connect to send email (for the most part) and I will not help with that, so yeah.
Disclaimer: This pisses off a lot of ISPs. Make sure you’re not on one that cares, and don’t ever run this on a critical server in case it locks up. I’m well aware this is not an SEO article, but thought it could be useful. And while the information is legal to give out, and there are legal uses for this, there are also illegal things people can do with proxies. So know your laws.
I know you advanced guys are cringing right now. But bear with me. Ok. So true (non-CGI/PHP) proxies come in a few varieties: HTTP (connect to port 80), HTTPS (connect to ports 80, 443, sometimes others) and SOCKS4/SOCKS5 (can connect to any port, but have unique rules).
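To make the HTTP vs. HTTPS distinction a bit more concrete, here is a rough sketch of what a client sends to each kind. I'm assuming the "HTTPS" variety means proxies that accept the CONNECT method and then just tunnel bytes. The function names are made up for illustration, and nothing here touches the network; it only builds the request bytes.

```python
def plain_http_request(host: str, path: str = "/") -> bytes:
    """Request as sent to a plain HTTP proxy.

    The request line carries the full URL, so the proxy knows
    which site to fetch on the client's behalf.
    """
    return (
        f"GET http://{host}{path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def connect_request(host: str, port: int = 443) -> bytes:
    """Request as sent to a proxy that supports CONNECT.

    The client asks for a raw tunnel to host:port. If the proxy
    answers with a 2xx status, everything after that is passed through.
    """
    return (
        f"CONNECT {host}:{port} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "\r\n"
    ).encode("ascii")


if __name__ == "__main__":
    print(plain_http_request("example.com").decode("ascii"))
    print(connect_request("example.com").decode("ascii"))
```

SOCKS4/SOCKS5 use their own binary handshake rather than HTTP text, which is part of why they "have unique rules".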
What are These For?
Hiding your arse. Anonymous/Elite proxies do not pass your IP to the sites you visit. So if you wanted to go to, say, Google, but didn’t want Google to know you visited, you would connect to one of these, it would connect to Google, and then it would send you the resulting page. So they end up being used for everything from scraping Google listings to competitive analysis to (yes, it’s true) comment spam.
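If you want to check what a proxy actually leaks, the usual trick is to request a page on a server you control that echoes back the headers it received, then look through them. Below is a minimal sketch of that last step. The transparent/anonymous/elite split is the common convention (I'm assuming "anonymous" means it hides your IP but still admits to being a proxy), and the header list and function name are my own picks, not any standard.

```python
import re

# Headers that commonly give away that a request came through a proxy.
# This list is an assumption for illustration, not an exhaustive standard.
PROXY_HEADERS = ("via", "x-forwarded-for", "forwarded", "x-real-ip", "proxy-connection")


def classify_proxy(received_headers: dict, real_ip: str) -> str:
    """Classify a proxy from the headers the destination server saw.

    received_headers: headers as echoed back by a server you control.
    real_ip: your own public IP address.
    """
    headers = {name.lower(): value for name, value in received_headers.items()}

    # Transparent: your real IP shows up somewhere in the headers.
    for value in headers.values():
        tokens = re.split(r"[,\s;=\"]+", value)
        if real_ip in tokens:
            return "transparent"

    # Anonymous: IP is hidden, but the request is visibly proxied.
    if any(name in headers for name in PROXY_HEADERS):
        return "anonymous"

    # Elite: nothing in the headers suggests a proxy at all.
    return "elite"
```

Header checks alone won't catch everything (the destination can still fingerprint you other ways), so treat the result as a first pass.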
Proxy Scanning – The Lost Art
For those of you who don’t know, proxy scanning is pretty much having software connect to every IP in a range on a given port and test each one to see whether it has Squid or another proxy installed.
- Servers – Make sure your server host isn’t going to flip out about port scanning. Also make sure your testing software doesn’t try to connect to port 25, or else you’ll be branded as an email spammer, which your host will not like.
- IP Ranges – You’re not just leeching. You’re scanning. Which means you need to decide on some IP ranges. Many IP ranges are dead in the water, and have nothing on them. Weed these out as time goes on. But keep in mind your proxy scanner should be running at minimum 500 threads/sockets at a time or you’re just going to burn a lot of time.
Anywho, to find IP ranges that are not dead, visit http://www.spamcop.net/w3m?action=map (not loading great on FF right now). It’s a map of e-mail spammers’ activity by IP range. But it’s also a great list of ISPs that have dirty netspaces, and hence more proxies to use.
To get you started, 18.104.22.168-22.214.171.124 has historically been good.
60-126.96.36.199 is largely US proxies, and active.
188.8.131.52-184.108.40.206 is a moderately active range, and can optionally extend to 220.127.116.11
Really, almost any IPs between 18.104.22.168-22.214.171.124 are at least decent. The DoD owns a few blocks there (a 20X.0.0.0 block I think) though, so make sure to avoid such nastiness.
When selecting ports to scan for proxies on, it’s a question of quality versus quantity. 8080, for example, is really common, so a lot of people scan there and the proxies are lower quality. 6588, meanwhile, is less common and generally faster. Other ports to take a look at are 80 and 3128. SOCKS proxies run on entirely different ports, but I’m not really going to talk about that.
Anyways, some ports are common in certain IP ranges but simply do not exist on other (active) IP ranges. Odd, yeah.
Scanning port 80 will give more complaints than scanning any other, and also tends to slow down scanning software due to the volume of hits.
- How Fast Should I be Scanning?
- However fast you can. ProxyHunter is an old one that can run from 500-2000 sockets at a time, but it slows down and requires a 2 second connect timeout, with a 30 second data transfer timeout. Be aware of the limitations of your software. Sometimes it will pretend to be functional, but in reality is going so fast no connects are going through.
- When I’m giving these numbers, I’m talking about my experience running off of pretty high powered servers back in the day. Do not even attempt these numbers off a cable/DSL line.
- If you’re running a Windows server, at 3700 sockets or so chances are it will throttle your connects and either lock up or not allow you any new connections. If this happens, look at changing a few of the values in SYSTEM\CurrentControlSet\Services\Tcpip\Parameters in the Windows Registry. Certain settings (like MaxUserPort) can help make Windows stop throttling your connections. I’ve seen servers scan at 16.5k concurrent sockets with a 1 second timeout without locking up. So if you’ve got the bandwidth and high quality software, go for it.
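To see why socket count and timeout matter so much, here is some back-of-the-envelope math. It assumes the worst case, where every single address is dead and each connect sits out the full timeout, and it ignores bandwidth and the data transfer timeout entirely. The function name is just something I made up for the sketch.

```python
def worst_case_seconds(addresses: int, sockets: int, connect_timeout: float) -> float:
    """Rough upper bound on scan time when every connect attempt times out.

    Each socket handles one address per timeout period, so the pool
    works through `sockets` addresses every `connect_timeout` seconds.
    """
    batches = addresses / sockets
    return batches * connect_timeout


if __name__ == "__main__":
    slash_16 = 2 ** 16  # 65,536 addresses
    slash_8 = 2 ** 24   # 16,777,216 addresses

    for label, size in (("/16", slash_16), ("/8", slash_8)):
        slow = worst_case_seconds(size, sockets=500, connect_timeout=2.0)
        fast = worst_case_seconds(size, sockets=16_500, connect_timeout=1.0)
        print(f"{label}: {slow / 3600:.2f} h at 500 sockets/2s, "
              f"{fast / 3600:.2f} h at 16.5k sockets/1s")
```

Real runs finish faster than this since live hosts answer or refuse right away, but it gives you a feel for how quickly a low socket count burns time on a big range.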
Soon as I’m caught up on work I missed while I was out, I’ll get a real SEO article out here. Thanks for the patience all.