HTTP Anti-Virus Proxy

Official HAVP Support Forum
Registration is disabled; I'm tired of spambots. E-mail havp@hege.li if you have questions.
The HAVP project is pretty much frozen/abandoned at this time anyway.

All times are UTC + 2 hours [ DST ]




PostPosted: 13 Mar 2007 18:30 
Offline

Joined: 03 Jun 2006 01:51
Posts: 4
Location: Massachusetts, US
HAVP does not allow the "Transfer-Encoding" header in replies from a web server when HAVP is operating in HTTP/1.0 mode (which I think is all it ever does), and all you get is a HAVP error page.

This may be proper from the point of view of strict HTTP, but it seems that some web servers don't care. In particular, the blog servers at ZDNet.com like to send back "Transfer-Encoding: chunked" headers, so using HAVP makes a lot of items of interest to software people like me unavailable.

To get around this (temporarily, at least) I have modified the "ConnectionToHTTP::AnalyseHeaderLine" method to not disallow this header, and in fact let it go through the (unmodified) "ConnectionToHTTP::PrepareHeaderForBrowser" all the way to the browser.

I don't understand all the implications of "chunked" -- do multiple chunks need multiple HTTP GETs? -- but I gather it allows a Web page to be fragmented at the HTTP layer (just like oversize data may be fragmented at the IP layer). Presumably this means that HAVP wouldn't see the entire page at once, and therefore might not spot a virus that was split over two chunks.

So, is letting "Transfer-Encoding: chunked" through likely to cause problems, or should I strip it out like "ConnectionToHTTP::PrepareHeaderForBrowser" removes the "Keep-Alive" header? (I actually worry more about protocol state-machine confusion than viruses.)

How hard would it be for HAVP to do the "right thing" as suggested by section 19.4.6 of RFC2616 (see for example www.w3.org/Protocols/rfc2616/rfc2616-se ... #sec19.4.6)? Would this introduce unbounded latency in retrieving Web pages?

_________________
Paul Kosinski


PostPosted: 13 Mar 2007 20:05 
Offline
HAVP Developer

Joined: 27 Feb 2006 18:12
Posts: 687
Location: Finland
Chunked is really just a different transfer encoding; it has nothing to do with multiple GETs. You can Google for details.

There are some fixes, in order of preference:

1) Use Squid 2.6-STABLE10, it has chunked support. Then HAVP will never see it.

2) Let HAVP strip the Accept-Encoding header from clients, but that will result in wasted bandwidth (servers won't send gzip-compressed data then). I guess it could be made into a config option. But using Squid is a must in my opinion anyhow.
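The stripping in option 2 amounts to filtering one header out of the client request before forwarding it. A toy sketch of the idea (illustrative Python only; HAVP's real code is C++, and the function name here is made up):

```python
# Sketch of the header-stripping idea: drop the client's Accept-Encoding
# header so servers reply uncompressed. Illustrative only, not HAVP code.

def strip_header(request_lines, name):
    """Return request_lines with any header matching `name` removed."""
    prefix = name.lower() + ":"  # header names are case-insensitive
    return [l for l in request_lines if not l.lower().startswith(prefix)]

req = ["GET / HTTP/1.0", "Host: example.com", "Accept-Encoding: gzip"]
filtered = strip_header(req, "Accept-Encoding")
# filtered no longer advertises gzip, so the server sends plain data
```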

Cheers,
Henrik


PostPosted: 16 Mar 2007 01:08 
Offline

Joined: 03 Jun 2006 01:51
Posts: 4
Location: Massachusetts, US
hege wrote:
Chunked is really just a different transfer encoding; it has nothing to do with multiple GETs. You can Google for details.

I read section 3.6.1 of RFC 2616 again: it appears that "chunked" simply allows an HTTP response to have multiple segments with their own length fields, making it easier to assemble separate streams into a single Web page/document. This would probably be easy to add to HAVP, but I wouldn't much like forking my own branch of the HAVP source to do so.

hege wrote:
There are some fixes, in order of preference:

1) Use Squid 2.6-STABLE10, it has chunked support. Then HAVP will never see it.

I thought of using Squid, but decided against it, as my SOHO LAN is too small to really need caching, and the fewer services to secure and support, the better.

hege wrote:
2) Let HAVP strip the Accept-Encoding header from clients, but that will result in wasted bandwidth (servers won't send gzip-compressed data then). I guess it could be made into a config option. But using Squid is a must in my opinion anyhow.

Cheers,
Henrik

I already use Privoxy between the browser/clients and HAVP, and Privoxy strips the "Accept-Encoding" headers (since it wants to filter the data, and isn't itself willing to do any gunzipping). Unfortunately, that doesn't seem to dissuade the zdnet.com blog server from using chunking.

_________________
Paul Kosinski


PostPosted: 16 Mar 2007 08:34 
Offline
HAVP Developer

Joined: 27 Feb 2006 18:12
Posts: 687
Location: Finland
pk wrote:
I read section 3.6.1 of RFC 2616 again: it appears that "chunked" simply allows an HTTP response to have multiple segments with their own length fields, making it easier to assemble separate streams into a single Web page/document. This would probably be easy to add to HAVP, but I wouldn't much like forking my own branch of the HAVP source to do so.


There are no "separate" streams. It's a single page/file sent in sized chunks. Otherwise, when Content-Length is not known, the transport would be unreliable (the client couldn't tell whether the response arrived complete).

Anyway, there's no need to make your own branch. If you code it, send a patch. At the moment I have other things to do than add a semi-complex HTTP/1.1 feature because of a very few broken sites. :)

Why not just add the site to your browser's no-proxy list, then?


PostPosted: 16 Mar 2007 19:32 
Offline

Joined: 03 Jun 2006 01:51
Posts: 4
Location: Massachusetts, US
hege wrote:
There are no "separate" streams. It's a single page/file sent in sized chunks. Otherwise, when Content-Length is not known, the transport would be unreliable (the client couldn't tell whether the response arrived complete).

What I meant by "separate streams" is concatenating two or more dynamic data sources (e.g., servlet output) into one web page. Without "chunked", you'd have to buffer the sources, strip their individual "Content-Length" headers, add the lengths together, and generate a single "Content-Length" header with the total length. That uses a lot of memory and adds latency. It gets much more complicated still if you use something like multiple simultaneous TCP connections to the data sources in order to read their "Content-Length" headers before their bodies (and that still adds some latency, since you have to wait for all the "Content-Length" headers).
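The trade-off described above can be sketched: with chunked encoding, a server can emit each dynamic piece as soon as it is ready, without knowing the total length up front (a hypothetical illustration, not servlet or HAVP code):

```python
# Sketch: why chunked encoding avoids buffering when concatenating
# dynamic sources. Each piece becomes its own chunk the moment it is
# ready; no total Content-Length is required. Hypothetical illustration.

def encode_chunked(sources):
    out = b""
    for piece in sources:  # e.g. the output of several back-end servlets
        out += b"%x\r\n" % len(piece) + piece + b"\r\n"
    out += b"0\r\n\r\n"    # zero-size chunk terminates the body
    return out

wire = encode_chunked([b"<p>part one</p>", b"<p>part two</p>"])
```

In a real server each chunk would be written to the socket as it is produced, so nothing has to be held in memory while waiting for the other sources.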

hege wrote:
Anyways, there's no need to make your own branch. If you code it, send a patch. At the moment I have other things to do than add a semi-complex HTTP/1.1 feature because of some very few broken sites. :)

Which methods should I modify to do it most cleanly -- I don't understand HAVP's class structure nearly as well as you do. (It *is* annoying that dealing with the Internet always eventually requires dealing with software that doesn't follow standards.)

hege wrote:
Why not just add the site to your browser's no-proxy list, then?

If the site weren't proxied, then it wouldn't have anti-virus/anti-phishing protection. (The only sites I have bypassing HAVP are anti-virus download sites that expose EICAR etc.)

_________________
Paul Kosinski


PostPosted: 16 Mar 2007 22:54 
Offline
HAVP Developer

Joined: 27 Feb 2006 18:12
Posts: 687
Location: Finland
Actually, now that I've looked a bit into it and hacked up some quick ugly code, it seems to work. :) The next version will include very experimental support. I'll give a test version link soon.


PostPosted: 17 Mar 2007 18:13 
Offline
HAVP Developer

Joined: 27 Feb 2006 18:12
Posts: 687
Location: Finland
Try this; the problem seems to be fixed. I didn't have many sites to test with, though.

http://havp.hege.li/download/havp-0.86pre.tar.gz

Cheers,
Henrik


PostPosted: 14 Apr 2007 15:40 
Offline

Joined: 14 Apr 2007 15:36
Posts: 3
How do you make HAVP operate in HTTP/1.1 mode? Would that work around this issue?


PostPosted: 14 Apr 2007 16:57 
Offline
HAVP Developer

Joined: 27 Feb 2006 18:12
Posts: 687
Location: Finland
hescominsoon wrote:
How do you make HAVP operate in HTTP/1.1 mode? Would that work around this issue?


The issue is already worked around. There is no benefit from full HTTP/1.1 that I can think of right now.


PostPosted: 15 Apr 2007 14:49 
Offline

Joined: 14 Apr 2007 15:36
Posts: 3
hege wrote:
hescominsoon wrote:
How do you make HAVP operate in HTTP/1.1 mode? Would that work around this issue?


The issue is already worked around. There is no benefit from full HTTP/1.1 that I can think of right now.

HTTP/1.1 is the de facto standard across the web. The pipelining abilities and other features are good to have.


PostPosted: 15 Apr 2007 15:00 
Offline
HAVP Developer

Joined: 27 Feb 2006 18:12
Posts: 687
Location: Finland
hescominsoon wrote:
hege wrote:
hescominsoon wrote:
How do you make HAVP operate in HTTP/1.1 mode? Would that work around this issue?


The issue is already worked around. There is no benefit from full HTTP/1.1 that I can think of right now.

HTTP/1.1 is the de facto standard across the web. The pipelining abilities and other features are good to have.


Its being the "de facto standard" doesn't mean anything; HAVP doesn't benefit from it. In my own unofficial opinion, HAVP is just a Squid filter, not a standalone application (though it works well enough for home users that way). Thus it gains nothing from HTTP/1.1, since Squid handles all that.


PostPosted: 15 Apr 2007 15:05 
Offline

Joined: 14 Apr 2007 15:36
Posts: 3
hege wrote:
hescominsoon wrote:
hege wrote:
hescominsoon wrote:
How do you make HAVP operate in HTTP/1.1 mode? Would that work around this issue?


The issue is already worked around. There is no benefit from full HTTP/1.1 that I can think of right now.

HTTP/1.1 is the de facto standard across the web. The pipelining abilities and other features are good to have.


Its being the "de facto standard" doesn't mean anything; HAVP doesn't benefit from it. In my own unofficial opinion, HAVP is just a Squid filter, not a standalone application (though it works well enough for home users that way). Thus it gains nothing from HTTP/1.1, since Squid handles all that.

If Squid is on 1.1 and HAVP is on 1.0, doesn't that make Squid have to drop down to 1.0 then?


PostPosted: 15 Apr 2007 15:37 
Offline
HAVP Developer

Joined: 27 Feb 2006 18:12
Posts: 687
Location: Finland
hescominsoon wrote:
If Squid is on 1.1 and HAVP is on 1.0, doesn't that make Squid have to drop down to 1.0 then?


That's probably true, until Squid is fully 1.1 compliant. Why don't you go ask them why pipelining is not fully supported? ;)

Then there is something like Polipo, which can upgrade 1.0 to 1.1. I haven't tried it, though.

edit: I don't have any objections to someone patching HAVP to 1.1, but I really don't have any time to spend on something I don't find that rewarding.

