HTTP Anti-Virus Proxy
http://havp.hege.li/forum/

Can you disable templates?
http://havp.hege.li/forum/viewtopic.php?f=3&t=195

Author:  JFlanders [ 21 Dec 2006 00:48 ]
Post subject:  Can you disable templates?

I have a somewhat odd question: is it possible to disable some of the templates HAVP uses? Our company has web scrapers and downloaders that pass through HAVP to help prevent nasties from getting in; before this we were using Squid with Dansguardian + ClamAV. When our ISP would go down, the scrapers wouldn't get a response from a site and would go into a wait state until the connection came back up. But because HAVP returns dns.html when it can't reach a site, the scraper considers this a valid page and keeps working, downloading hundreds of copies of dns.html until the outage is fixed. Is there a way to turn this off? I tried renaming the dns.html but got
HAVP could not open Template! Check errorlog and config!
And I'm afraid that the systems will still accept this as a valid page.
Would wrapping HAVP in Squid prevent this or would it still respond?

Author:  hege [ 28 Dec 2006 17:17 ]
Post subject: 

Does that scraper have "MSIE" in its User-Agent string? When HAVP sends an error, it has a workaround that sends a 200 code for IE. Normally it sends 403, which is the real error code.

You could try editing proxyhandler.cpp and changing this to 403:

Code:
Code = "200";
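For context, the surrounding logic is roughly like the sketch below (only an illustration of the idea with assumed names, not the exact proxyhandler.cpp source):

Code:
#include <string>

// Sketch only: HAVP picks the status code for its error template based on
// the client's User-Agent. IE gets 200, presumably so it displays the
// template instead of substituting its own "friendly" error page.
std::string ErrorStatusCode( const std::string &UserAgent )
{
    if ( UserAgent.find("MSIE") != std::string::npos )
        return "200";   // change this to "403" to disable the IE workaround

    return "403";       // real "Forbidden" status for other clients
}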


Cheers,
Henrik

Author:  JFlanders [ 03 Jan 2007 22:53 ]
Post subject: 

Thanks, our developers found another workaround: our scraper can recognize a page of fewer than 50 characters as an error and mark it properly rather than considering it good, so we changed dns.html to be much smaller. But if for some reason this doesn't work out as we hope, I'll try what you suggested.
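The idea of that size check is roughly this, as a hypothetical sketch (not our actual scraper code):

Code:
#include <string>

// Hypothetical sketch: treat any response body shorter than 50 characters
// as an error page. The shrunken dns.html now falls under this limit, so
// the scraper marks the fetch as failed instead of accepting the template.
bool LooksLikeErrorPage( const std::string &Body )
{
    return Body.size() < 50;
}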

Author:  JFlanders [ 03 Jan 2007 23:26 ]
Post subject: 

While troubleshooting another issue I tried editing proxyhandler.cpp as you suggested, changing 200 to 403 and recompiling HAVP. It made no difference; HAVP still returned dns.html when it couldn't resolve an address (which is how I simulated a down ISP). The scrapers use IE to do their work, so I used IE7 to test it, assuming this would send the MSIE string. For now the small dns.html will at least keep it from considering a site good, but it would be nice if HAVP would just let the browser return its own error so the scraper actually pauses until the internet comes back. Am I doing something wrong?

Author:  hege [ 03 Jan 2007 23:31 ]
Post subject: 

Could be your scraper doesn't recognize 403 as an error, like it should..

Author:  JFlanders [ 03 Jan 2007 23:36 ]
Post subject: 

The test was done with raw IE7, not the scraper.. maybe I'm not understanding browser errors... Is it receiving the 403 and the dns.html? Or maybe I'm not testing it properly; tonight when the proxies aren't so busy I'll unplug the NIC that connects to the ISP and see what kind of response I get, rather than just resolving a known bad address.

Author:  hege [ 04 Jan 2007 03:10 ]
Post subject: 

403 is the HTTP "Forbidden" status code. dns.html is the error body (description).

The scraper is faulty if it tries to retry a forbidden site.
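To make the difference concrete, the client roughly receives something like the sketch below when HAVP can't resolve a host (an illustration only, with an assumed helper name, not the exact HAVP output):

Code:
#include <sstream>
#include <string>

// Sketch: a 403 status line with the dns.html template as the body.
// The status code tells the client it is an error; the HTML is only
// the human-readable description.
std::string BuildDnsErrorResponse( const std::string &TemplateBody )
{
    std::ostringstream Response;
    Response << "HTTP/1.1 403 Forbidden\r\n"
             << "Content-Type: text/html\r\n"
             << "Content-Length: " << TemplateBody.size() << "\r\n"
             << "\r\n"
             << TemplateBody;   // contents of dns.html
    return Response.str();
}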

Author:  JFlanders [ 04 Jan 2007 04:02 ]
Post subject: 

I understand, and yes it does work.. I unplugged the NIC to the internet on the proxy and ran the scraper using HAVP compiled with the modified proxyhandler.cpp, and it recognized the 403 and paused like it's supposed to. Thanks :)
