HTTP Anti-Virus Proxy

Official HAVP Support Forum
HAVP project is pretty much frozen/abandoned at this time anyway.

All times are UTC + 2 hours [ DST ]




Posted: 25 May 2007 05:00
Hi folks!

First, great work with HAVP! Thank you for all the hard work.

I'm attempting a Squid Sandwich. I'm using Squid 2.6.STABLE6, HAVP 0.86, and ClamAV 0.90.2. The latter two are compiled from source; Squid is the package included with my distro (CentOS 5).

HAVP is on port 8080. Squid is on ports 3128 (for clients) and on port 8081 (for HAVP requests).
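
Before digging into cache behaviour, it can help to confirm that every listener in the chain is actually accepting connections. The following is only a minimal sketch: the port numbers come from the post above, while the host (127.0.0.1, everything on one machine) and the names `CHAIN`, `port_open` and `check_chain` are assumptions made for illustration.

```python
import socket

# Ports as described in the post; the labels are just for readable output.
CHAIN = [
    ("Squid (clients)", 3128),
    ("HAVP", 8080),
    ("Squid (for HAVP)", 8081),
]


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_chain(host="127.0.0.1", chain=CHAIN):
    """Map each listener's label to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in chain}
```

Calling `check_chain()` on the proxy machine returns a dictionary of label to True/False. A False entry only means the TCP connection failed; it says nothing about whether the daemon is configured correctly.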

If I connect my browser (on a separate machine) to HAVP on port 8080, things get scanned by HAVP (as per HAVP's access.log) and cached as I expect (watching Squid's access log for "HIT" messages). If I connect directly to Squid's external port 8081, I skip HAVP and my requests get cached as expected (this time showing the client's IP in Squid's access log instead of 127.0.0.1). So far so good.

The problem I'm having is when I connect to Squid at port 3128. The requested page, if already in cache, is scanned by HAVP and Squid shows the page as a cache HIT, as it should. But the page's objects are then ejected from the cache (RELEASE in Squid's store.log). When I clear my browser's cache and reload the same page, all of the page's objects show up as MISS and are pulled directly from the source.

I'm really not sure why it's doing this. Does anyone have some ideas, or could you point me to where I might look?
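
For anyone trying to reproduce this, watching HIT and MISS by eye gets tedious. Here is a rough sketch that tallies the result codes in a Squid access log. It assumes the native log format (which `access_log ... squid` in the config below selects), where the fourth whitespace-separated field looks like `TCP_MISS/200`. The function name and the sample lines are made up for illustration.

```python
from collections import Counter

# Illustrative lines in Squid's native log format (not taken from a real log).
SAMPLE_LINES = [
    "1180062000.123 45 127.0.0.1 TCP_MISS/200 5120 GET http://example.com/ - DIRECT/192.0.2.10 text/html",
    "1180062001.456 2 127.0.0.1 TCP_HIT/200 5120 GET http://example.com/ - NONE/- text/html",
    "",
    "1180062002.789 51 127.0.0.1 TCP_MISS/200 812 GET http://example.com/a.gif - DIRECT/192.0.2.10 image/gif",
]


def tally_result_codes(lines):
    """Count Squid result codes (TCP_HIT, TCP_MISS, ...) in native-format lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        # Field 4 is "<result code>/<HTTP status>"; keep only the result code.
        code = fields[3].split("/", 1)[0]
        counts[code] += 1
    return counts
```

To run it against a real log, pass an open file, for example `tally_result_codes(open("/var/log/squid/access.log"))`. Comparing the tallies before and after reloading a page shows whether objects stayed in the cache.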

Thanks in advance for the help!
-Joe Rhodes


Here are my config files:

Squid.conf:

# WELCOME TO SQUID 2.6.STABLE6
# ----------------------------

# Squid normally listens to port 3128
http_port 3128

# Squid2 for HAVP
http_port 8081


# OPTIONS WHICH AFFECT THE CACHE SIZE
# -----------------------------------------------------------------------------

cache_mem 128 MB
maximum_object_size 300 MB
maximum_object_size_in_memory 64 KB


# LOGFILE PATHNAMES AND CACHE DIRECTORIES
# -----------------------------------------------------------------------------

cache_dir diskd /var/spool/squid 10000 256 256

access_log /var/log/squid/access.log squid

#Default:
# debug_options ALL,1

# OPTIONS WHICH AFFECT THE NEIGHBOR SELECTION ALGORITHM
# -----------------------------------------------------------------------------


#HAVP on localhost port 8080
cache_peer 127.0.0.1 parent 8080 0 no-query no-digest no-netdb-exchange default

#Needed if we want to go directly to Squid2 (external) without HAVP
cache_peer 127.0.0.2 parent 8081 0 no-query no-digest no-netdb-exchange



acl QUERY urlpath_regex cgi-bin \?
cache deny QUERY

acl apache rep_header Server ^Apache
broken_vary_encoding allow apache



# ACCESS CONTROLS
# -----------------------------------------------------------------------------

#Recommended minimum configuration:
acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl to_localhost dst 127.0.0.0/8
acl SSL_ports port 443
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 # https
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl CONNECT method CONNECT


# HTTPS traffic scanning not needed
acl Proto_HTTPS proto HTTPS
cache_peer_access 127.0.0.1 allow !Proto_HTTPS
cache_peer_access 127.0.0.1 deny all
cache_peer_access 127.0.0.2 allow all


# Only allow cachemgr access from localhost
http_access allow manager localhost
http_access deny manager
# Deny requests to unknown ports
http_access deny !Safe_ports
# Deny CONNECT to other than SSL ports
http_access deny CONNECT !SSL_ports

#http_access deny to_localhost

# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS

acl local_network src 192.168.1.0/24
http_access allow local_network

acl localhosts src 127.0.0.0/24
http_access allow localhosts

# And finally deny all other access to this proxy
http_access allow localhost
http_access deny all

http_reply_access allow all

icp_access allow all


# MISCELLANEOUS
# -----------------------------------------------------------------------------

# We only want to cache requests to Squid2 (external)
acl HAVP_PORT myport 8081
no_cache deny !HAVP_PORT
#Always use Squid2 (external) or HAVP
prefer_direct off
always_direct allow HAVP_PORT
never_direct allow all

# OPTIONS FOR TUNING THE CACHE
# -----------------------------------------------------------------------------

refresh_pattern windowsupdate.com/.*\.(cab|exe) 4320 100% 43200 reload-into-ims
refresh_pattern download.microsoft.com/.*\.(cab|exe) 4320 100% 43200 reload-into-ims
refresh_pattern ^http://.*\.cnn\.com 60 50% 4320 override-lastmod
refresh_pattern ^http://news\.bbc\.co\.uk 60 50% 4320 override-lastmod
refresh_pattern microsoft 60 150% 10080 override-lastmod
refresh_pattern msn\.com 4320 150% 10080 override-lastmod
refresh_pattern ^http://.*\.doubleclick\.net 10080 300% 40320 override-lastmod
refresh_pattern \.r[0-9][0-9]$ 10080 150% 40320
refresh_pattern ^http://.*\.gif$ 1440 50% 20160
refresh_pattern ^http://.*\.asis$ 1440 50% 20160
refresh_pattern -i \.pdf$ 10080 90% 43200
refresh_pattern -i \.art$ 10080 150% 43200
refresh_pattern -i \.avi$ 10080 150% 40320
refresh_pattern -i \.mov$ 10080 150% 40320
refresh_pattern -i \.wav$ 10080 150% 40320
refresh_pattern -i \.mp3$ 10080 150% 40320
refresh_pattern -i \.qtm$ 10080 150% 40320
refresh_pattern -i \.mid$ 10080 150% 40320
refresh_pattern -i \.viv$ 10080 150% 40320
refresh_pattern -i \.mpg$ 10080 150% 40320
refresh_pattern -i \.jpg$ 10080 150% 40320 reload-into-ims override-lastmod
refresh_pattern -i \.rar$ 10080 150% 40320
refresh_pattern -i \.ram$ 10080 150% 40320
refresh_pattern -i \.gif$ 10080 300% 40320
refresh_pattern -i \.txt$ 1440 100% 20160 reload-into-ims override-lastmod
refresh_pattern -i \.zip$ 2880 200% 40320
refresh_pattern -i \.arj$ 2880 200% 40320
refresh_pattern -i \.exe$ 2880 200% 40320
refresh_pattern -i \.tgz$ 10080 200% 40320
refresh_pattern -i \.gz$ 10080 200% 40320
refresh_pattern -i \.tgz$ 10080 200% 40320
refresh_pattern -i \.tar$ 10080 200% 40320
refresh_pattern -i \.Z$ 10080 200% 40320
refresh_pattern -i \.dmg$ 10080 200% 40320


#Suggested default:
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern . 0 20% 4320


And my havp.conf file:
______________________________________________________________________

USER clamav
GROUP clamav
SERVERNUMBER 20
LOG_OKS true
SCANTEMPFILE /var/tmp/havp/havp-XXXXXX
TEMPDIR /tmp
DBRELOAD 30
PARENTPROXY localhost
PARENTPORT 8081
X_FORWARDED_FOR true
PORT 8080
STREAMUSERAGENT Player Winamp iTunes QuickTime Audio RMA/ MAD/ XMMS
ENABLECLAMLIB true
CLAMBLOCKENCRYPTED true
ENABLECLAMD false
ENABLEFPROT false
ENABLEAVG false
ENABLEAVESERVER false
ENABLESOPHIE false
ENABLETROPHIE false
ENABLENOD32 false
ENABLEAVAST false
ENABLEARCAVIR false


Posted: 01 Jun 2007 19:40
Have you tried disabling the firewall and SELinux, as suggested by shajtan?

It worked for me when I disabled the firewall and SELinux on CentOS 5 with Squid 2.6 STABLE6-3.el4.

Chaq


Posted: 01 Jun 2007 19:56
I have neither a firewall up at the moment nor SELinux enabled.

Again, the proxy is working, but it's ejecting objects from the cache on the first access.

Cheers!
-Joe


Posted: 01 Jun 2007 20:57
HAVP Developer
Actually, I'm not sure the documentation example is correct.

Can you try removing this line:

no_cache deny !HAVP_PORT

And modify this, adding "proxy-only":

cache_peer 127.0.0.1 parent 8080 0 no-query no-digest no-netdb-exchange proxy-only default

If I'm figuring it out correctly, now anything fetched from HAVP is not saved, but everything that "squid2" gets should be saved.


Posted: 04 Jun 2007 20:42
Sorry for the delay in getting back to this thread.

I've made the changes suggested, and that does indeed keep things from being ejected from the cache after the first access.

However, now I'm in a situation where loading a page a second time won't run all the elements through HAVP. That is, the elements are in the Squid cache and Squid satisfies the requests itself, without asking its parent cache. The "proxy-only" option doesn't seem to have the desired effect, though it seems like it should.

I'm comparing the Squid access log to the HAVP access log.
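
A rough way to automate part of that comparison is to look only at Squid's side and list the requests it answered without contacting any peer, since those are the ones that cannot have passed through HAVP. This sketch assumes Squid's native log format, where field 7 is the URL and field 9 is `<hierarchy code>/<peer host>`, with `NONE/-` logged when no peer or origin server was contacted. The function name and sample lines are invented for illustration. `NONE` also shows up for requests Squid denied or failed, so treat the result as a first filter rather than proof.

```python
# Illustrative native-format lines (not from a real log).
SAMPLE_LINES = [
    "1180982520.100 30 192.168.1.20 TCP_MISS/200 4096 GET http://example.com/a.gif - DEFAULT_PARENT/127.0.0.1 image/gif",
    "1180982521.200 1 192.168.1.20 TCP_HIT/200 4096 GET http://example.com/b.gif - NONE/- image/gif",
]


def answered_without_parent(lines):
    """Return URLs that Squid answered itself, without contacting any peer."""
    urls = []
    for line in lines:
        fields = line.split()
        if len(fields) < 9:
            continue  # not a complete native-format line
        url = fields[6]
        # Field 9 is "<hierarchy code>/<peer host>", e.g. "NONE/-".
        hierarchy_code = fields[8].split("/", 1)[0]
        if hierarchy_code == "NONE":
            urls.append(url)
    return urls
```

Any URL this returns for a client request on port 3128 is one that was served straight from the local cache and therefore was not rescanned.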

I'm a novice when it comes to Squid, but it seems like I'm looking for an ACL that says something like "if the request came from this subnet (my local IP range), always consult the parent, don't look at the local cache".

I suppose the other way around this would be to just have two instances of squid running, with two separate config files. Do you know how much additional overhead that would place on things?

Thanks again for the help!
-Joe


Posted: 04 Jun 2007 20:51
HAVP Developer
What happens if you add this?

acl CACHE_PORT myport 3128
cache deny CACHE_PORT

I would imagine "cache deny !HAVP_PORT" working the same, but Squid's logic can sometimes be strange.

You could run the two instances too. It really shouldn't add any overhead, especially with a very bare configuration. All the requests are processed twice in any case.


Posted: 04 Jun 2007 21:04
Yup, that seems to have the same result as "no_cache deny !HAVP_PORT".

I'm going to try it with two different instances of squid, two configuration files and let you know how that works.

Just out of curiosity, have others got this to work as I'm expecting it to? I'm not sure if I'm doing something wrong, or if this is just the way Squid is and no one's noticed before.

Cheers!
-Joe Rhodes


Posted: 04 Jun 2007 22:24
So I finally gave up and went with two separate instances of Squid, which in initial testing seems to do exactly what I'd like: cache data before HAVP, and have all data going to clients scanned each time, regardless of whether it was cached or not. Below are my two config files for Squid.

There are a couple of things:
1. This does split up the logs (or gives you the option to), so you have one log for client access, whose entries should all be MISS but have the correct internal IPs. The other will only have the loopback address, but will have accurate HIT/MISS entries. This will make using analysis tools such as Calamaris easier. (Note: I've chosen to turn off the client access log. I'm not interested in tracing what a particular machine/user is doing.)

2. You'll have to manually start the second Squid (I call it squid-outer) with the alternate config file. I use:

squid -D -f /etc/squid/squid-outer.conf

You'll also have to stop it.

squid -f /etc/squid/squid-outer.conf -k shutdown

I'm sure there's a way to integrate that with the rc script, but I'm just not that motivated.

3. You'll have to modify your logrotate scripts to cover the second instance's log files.

4. You'll want to dissect the ACLs I've listed below. They're almost certainly looser than they need to be. This was just my first stab at things.

5. Adjust caching rules in the "outer" config file. Adjust client restrictions/authentications in the "inner".
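
As a stopgap for point 2, the start and stop commands can be wrapped in a small helper so they aren't retyped by hand. This is only a sketch, not a replacement for a proper rc script. The two command lines are exactly the ones quoted above. The helper names, and the assumption that `squid` is on the PATH, are mine.

```python
import subprocess

SQUID = "squid"  # assumed to be on PATH
OUTER_CONF = "/etc/squid/squid-outer.conf"  # path used in point 2 above


def start_command(conf=OUTER_CONF):
    """Command line that starts the Squid instance using the given config."""
    return [SQUID, "-D", "-f", conf]


def stop_command(conf=OUTER_CONF):
    """Command line that asks that same instance to shut down."""
    return [SQUID, "-f", conf, "-k", "shutdown"]


def run(action, conf=OUTER_CONF):
    """Run 'start' or 'stop' for the instance and return squid's exit status."""
    commands = {"start": start_command, "stop": stop_command}
    if action not in commands:
        raise ValueError("action must be 'start' or 'stop', got %r" % (action,))
    return subprocess.call(commands[action](conf))
```

`run("start")` and `run("stop")` then mirror the two manual commands. Both need the same privileges you would need when running `squid` directly.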

Enjoy!
-Joe Rhodes


# WELCOME TO SQUID 2.6.STABLE6
# ----------------------------

# This is Squid-inner. It listens for requests from clients on port 3128, forwards
# those requests to HAVP, and doesn't cache the results. HAVP forwards requests
# to a caching instance of Squid. This keeps you from serving potentially
# infected cache files to clients. This is also where you would make any client
# ACL rules.
http_port 3128

# LOGFILE PATHNAMES AND CACHE DIRECTORIES
# -----------------------------------------------------------------------------
cache_dir diskd /var/spool/squid-inner 1000 16 16
#access_log /var/log/squid/access-inner.log squid
# Turn off client logging
access_log none

# OPTIONS WHICH AFFECT THE NEIGHBOR SELECTION ALGORITHM
# -----------------------------------------------------------------------------

#HAVP on localhost port 8080
cache_peer 127.0.0.1 parent 8080 0 no-query no-digest no-netdb-exchange proxy-only default


# ACCESS CONTROLS
# -----------------------------------------------------------------------------

#Recommended minimum configuration:
acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl to_localhost dst 127.0.0.0/8


acl SSL_ports port 443
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 # https
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl CONNECT method CONNECT


# Only allow cachemgr access from localhost
http_access allow manager localhost
http_access deny manager
# Deny requests to unknown ports
http_access deny !Safe_ports
# Deny CONNECT to other than SSL ports
http_access deny CONNECT !SSL_ports

#http_access deny to_localhost

# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS

acl local_network src 192.168.1.0/24
http_access allow local_network

# And finally deny all other access to this proxy
http_access allow localhost
http_access deny all
http_reply_access allow all



# MISCELLANEOUS
# -----------------------------------------------------------------------------

#Always use Squid2 (external) or HAVP
prefer_direct off

# Send HTTPS request out directly, don't send through HAVP
always_direct allow SSL_ports

# Send all other requests through HAVP
never_direct allow all

# Don't cache any results
cache deny all

#Suggested default:
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern . 0 20% 4320





# WELCOME TO SQUID 2.6.STABLE6
# ----------------------------

# This is Squid-outer for HAVP. It listens for requests from the HAVP daemon and
# caches the results. (Squid-inner listens for clients, uses HAVP as a parent, and
# does NOT cache results.)

http_port 8081


# OPTIONS WHICH AFFECT THE CACHE SIZE
# -----------------------------------------------------------------------------

cache_mem 128 MB
maximum_object_size 300 MB
maximum_object_size_in_memory 64 KB


# LOGFILE PATHNAMES AND CACHE DIRECTORIES
# -----------------------------------------------------------------------------
cache_dir diskd /var/spool/squid-outer 10000 256 256
access_log /var/log/squid/access-outer.log squid
#Default:
# debug_options ALL,1
pid_filename /var/run/squid-outer.pid


# OPTIONS WHICH AFFECT THE NEIGHBOR SELECTION ALGORITHM
# -----------------------------------------------------------------------------

acl QUERY urlpath_regex cgi-bin \?
cache deny QUERY

acl apache rep_header Server ^Apache
broken_vary_encoding allow apache



# ACCESS CONTROLS
# -----------------------------------------------------------------------------

#Recommended minimum configuration:
acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl to_localhost dst 127.0.0.0/8


acl SSL_ports port 443
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 # https
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl CONNECT method CONNECT


# HTTPS traffic scanning not needed
#acl Proto_HTTPS proto HTTPS
#cache_peer_access 127.0.0.1 allow !Proto_HTTPS
#cache_peer_access 127.0.0.1 deny all
#cache_peer_access 127.0.0.2 allow all


# Only allow cachemgr access from localhost
http_access allow manager localhost
http_access deny manager
# Deny requests to unknown ports
http_access deny !Safe_ports
# Deny CONNECT to other than SSL ports
http_access deny CONNECT !SSL_ports

#http_access deny to_localhost

# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS

acl local_network src 192.168.1.0/24
http_access allow local_network

acl localhosts src 127.0.0.0/24
http_access allow localhosts

# And finally deny all other access to this proxy
http_access allow localhost
http_access deny all

http_reply_access allow all

icp_access allow all


# OPTIONS FOR TUNING THE CACHE
# -----------------------------------------------------------------------------

refresh_pattern ^http://.*\.apple\.com 3600 200% 43200 reload-into-ims
refresh_pattern windowsupdate.com/.*\.(cab|exe) 4320 100% 43200 reload-into-ims
refresh_pattern download.microsoft.com/.*\.(cab|exe) 4320 100% 43200 reload-into-ims
refresh_pattern ^http://.*\.cnn\.com 60 50% 4320 override-lastmod
refresh_pattern ^http://news\.bbc\.co\.uk 60 50% 4320 override-lastmod
refresh_pattern microsoft 60 150% 10080 override-lastmod
refresh_pattern msn\.com 4320 150% 10080 override-lastmod
refresh_pattern ^http://.*\.doubleclick\.net 10080 300% 40320 override-lastmod
refresh_pattern ^http://.*FIDO 360 1000% 480
refresh_pattern \.r[0-9][0-9]$ 10080 150% 40320
refresh_pattern ^http://.*\.gif$ 1440 50% 20160
refresh_pattern ^http://.*\.asis$ 1440 50% 20160
refresh_pattern -i \.pdf$ 10080 90% 43200
refresh_pattern -i \.art$ 10080 150% 43200
refresh_pattern -i \.avi$ 10080 150% 40320
refresh_pattern -i \.mov$ 10080 150% 40320
refresh_pattern -i \.wav$ 10080 150% 40320
refresh_pattern -i \.mp3$ 10080 150% 40320
refresh_pattern -i \.qtm$ 10080 150% 40320
refresh_pattern -i \.mid$ 10080 150% 40320
refresh_pattern -i \.viv$ 10080 150% 40320
refresh_pattern -i \.mpg$ 10080 150% 40320
refresh_pattern -i \.jpg$ 10080 150% 40320 reload-into-ims override-lastmod
refresh_pattern -i \.rar$ 10080 150% 40320
refresh_pattern -i \.ram$ 10080 150% 40320
refresh_pattern -i \.gif$ 10080 300% 40320
refresh_pattern -i \.txt$ 1440 100% 20160 reload-into-ims override-lastmod
refresh_pattern -i \.zip$ 2880 200% 40320
refresh_pattern -i \.arj$ 2880 200% 40320
refresh_pattern -i \.exe$ 2880 200% 40320
refresh_pattern -i \.tgz$ 10080 200% 40320
refresh_pattern -i \.gz$ 10080 200% 40320
refresh_pattern -i \.tgz$ 10080 200% 40320
refresh_pattern -i \.tar$ 10080 200% 40320
refresh_pattern -i \.Z$ 10080 200% 40320
refresh_pattern -i \.dmg$ 10080 200% 40320


#Suggested default:
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern . 0 20% 4320

