How to block unnecessary traffic with HubSpot Webcrawler in it?

ahmiq Member
edited August 2014 in Help

Hi,
I have seen a spike in web traffic on my WordPress site, and when I check the access logs I see entries like the following:

54.226.254.1 - - [10/Aug/2014:15:04:15 +0200] "GET /wp-includes/js/jquery/jquery-migrate.min.js?ver=1.2.1 HTTP/1.0" 404 1182 "xxxxx" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36 HubSpot Webcrawler"
54.226.254.1 - - [10/Aug/2014:15:04:15 +0200] "GET /wp-content/plugins/crayon-syntax-highlighter/js/min/crayon.min.js?ver=2.6.5 HTTP/1.0" 404 1182 "xxxx" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36 HubSpot Webcrawler"

All of them carry the HubSpot Webcrawler user agent. Is there any way I can block traffic coming from HubSpot Webcrawler?

I am using VestaCP on CentOS.
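Before blocking anything, it can help to see which user agents actually account for the spike. The following is a minimal Python sketch, assuming the access log uses the combined format shown in the two lines above; `count_agents` and `SAMPLE` are names made up for this illustration, and the sample data is just those two log lines.

```python
import re
from collections import Counter

# Combined Log Format:
# ip ident user [time] "request" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def count_agents(lines):
    """Count requests per user agent; lines that do not parse are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[match.group("agent")] += 1
    return counts

# The two log lines quoted in the post (referer already masked by the poster).
SAMPLE = [
    '54.226.254.1 - - [10/Aug/2014:15:04:15 +0200] '
    '"GET /wp-includes/js/jquery/jquery-migrate.min.js?ver=1.2.1 HTTP/1.0" '
    '404 1182 "xxxxx" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) '
    'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 '
    'Safari/537.36 HubSpot Webcrawler"',
    '54.226.254.1 - - [10/Aug/2014:15:04:15 +0200] '
    '"GET /wp-content/plugins/crayon-syntax-highlighter/js/min/crayon.min.js?ver=2.6.5 HTTP/1.0" '
    '404 1182 "xxxx" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) '
    'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 '
    'Safari/537.36 HubSpot Webcrawler"',
]

if __name__ == "__main__":
    # Most frequent user agents first.
    for agent, hits in count_agents(SAMPLE).most_common():
        print(hits, agent)
```

To run it against a real log, pass an open file object instead of `SAMPLE`; the log path depends on the server setup and is not given in this thread.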

Comments

  • ahmiq Member
    edited August 2014

    Okay, if someone else is looking for the solution: I tried this with .htaccess and it seems to be working, because I don't see that bot anymore. It should also keep most other unwanted bots away.

    ##begin code 
    ##start blocking potentially unwanted bots. 
    RewriteEngine On 
    RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR] 
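    # The address below was masked by the forum's email obfuscation, and the unescaped space
    # breaks the directive, so this line is commented out. Restore the real pattern before enabling it.
    # RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:[email protected] [OR]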
    RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Custo [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^HMView [OR] 
    RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR] 
    RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^larbin [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Wget [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Widow [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR] 
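    RewriteCond %{HTTP_USER_AGENT} ^Zeus [OR]
    # HubSpot's name comes at the end of a Mozilla-style string, so match it without the ^ anchor.
    RewriteCond %{HTTP_USER_AGENT} HubSpot\ Webcrawler [NC]
    RewriteRule ^.* - [F,L]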
    ##end code. bai bots.
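
Whether a user-agent pattern needs the `^` anchor can be checked offline before touching .htaccess. This is a minimal Python sketch, assuming Python's `re` module behaves like Apache's regex engine for patterns this simple; `is_blocked` is a hypothetical helper that mimics an `[OR]` chain of `RewriteCond` lines, and the user agent is copied from the log lines in the question.

```python
import re

# User agent copied from the access log lines quoted in the question.
HUBSPOT_UA = (
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36 HubSpot Webcrawler"
)

def is_blocked(user_agent, patterns):
    """Return True if any pattern matches, like RewriteCond lines joined by [OR]."""
    return any(re.search(pattern, user_agent) for pattern in patterns)

# Anchored: only matches when the string *starts* with "HubSpot".
anchored = [r"^HubSpot"]
# Unanchored: matches the name anywhere in the string.
unanchored = [r"HubSpot Webcrawler"]

print(is_blocked(HUBSPOT_UA, anchored))    # False
print(is_blocked(HUBSPOT_UA, unanchored))  # True
```

The crawler's string begins with `Mozilla/5.0`, so a `^HubSpot` condition never sees a match, while the unanchored form does. Anchored patterns such as `^Wget` still work for tools that put their name first.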
    