CensorNet Professional v2.0.x - Important release notes

Published in CensorNet Professional on November 09, 2010 by Administrator

Changes to the URL classification system and policies that you need to be aware of

The pending v2.0.x release brings fundamental changes to the way CensorNet classifies web sites, and this has a direct impact on the categories available in your filter policies. The good news is that the new system is faster, more accurate and broader, with an additional 30 categories added to the database. It also includes malware protection as standard, whereas previously this was an optional extra that you had to pay for.

If you are not interested in how the new classification works, please skip to the important notes at the end.

Let's first cover how CensorNet categorised sites prior to this release. CensorNet relied on a large database, fondly known as the CSRV (Category Server), which you downloaded to the server after installation. Typically, the CSRV would take up around 12GB of space after a 3.5GB download, and browsing speed would be a bit lumpy until the download had completed. The database held approximately 70,000,000 domains and was updated once per day. When users browsed the web, the CensorNet server would look up the requested domains in the database and allow or deny them based on your policy rules. If it could not find a match in the database, it would attempt to classify the site in real time using raters (intelligent bits of code) on your local CensorNet server. This methodology has served us well for 3 years. However, with the ever-changing web and the constant wave of new threats, there are obvious flaws in this approach.

 

Firstly, the local database was only updated once per day. If, for example, a virus outbreak infected a well-known web site, it would be 24 hours before the CensorNet server was up to date, and another 24 hours before the infected site was cleared. Not ideal. You do have the option to add your own custom URL entry, but there is always the chance you might be too late. Secondly, the old system did not give you the benefit of shared knowledge. With the new system, all our customers' servers contribute to finding new web sites and having them classified. It's a constantly evolving system where newly identified URLs are passed through the cloud, analysed and then categorised. This behind-the-scenes collaboration benefits everyone. And finally, because part of the lookup is made in the cloud, classifications are updated far more frequently: as often as every 20 minutes*.

 

The CSRV has been replaced by a smaller database which is approximately 500MB in size. This database is entirely optional: you do not need it to use CensorNet effectively. However, it serves as a local cache of the most frequently accessed domains and their categories, so it is worth having to help your CensorNet server perform at its best. Any URL lookups that are not already stored in the cache (if it exists) are sent over DNS to our cloud servers, which respond immediately with the categorisation. So those are the key reasons why we are moving away from a purely local URL database, along with a bit of information about the new system.
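As a rough illustration of the cache-then-cloud lookup described above, here is a minimal Python sketch. Everything in it is hypothetical: the `CategoryResolver` class, the `cloud_lookup` callable (a stand-in for the DNS query, whose real format is not described in these notes) and the in-memory dictionary are assumptions for illustration, not CensorNet's actual implementation. The 24-hour expiry is taken from the footnote at the end of these notes.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Footnote: entries in the local cache can take up to 24 hours to expire.
CACHE_TTL_SECONDS = 24 * 60 * 60


class CategoryResolver:
    """Hypothetical resolver: try the local cache first, then ask the cloud."""

    def __init__(
        self,
        cloud_lookup: Callable[[str], Optional[str]],
        clock: Callable[[], float] = time.time,
    ) -> None:
        # cloud_lookup is an assumed stand-in for the DNS-based cloud query.
        self._cloud_lookup = cloud_lookup
        self._clock = clock
        # domain -> (category, time the entry was stored)
        self._cache: Dict[str, Tuple[str, float]] = {}

    def resolve(self, domain: str) -> Optional[str]:
        key = domain.lower()
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and now - entry[1] < CACHE_TTL_SECONDS:
            return entry[0]  # fresh cache hit, no cloud round trip
        category = self._cloud_lookup(key)
        if category is not None:
            self._cache[key] = (category, now)
        return category  # None means the domain is not yet categorised
```

Injecting `cloud_lookup` and `clock` keeps the sketch independent of any real network call and makes the expiry behaviour easy to exercise.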

 

 

Let's have a look at the new web site classification life cycle in version 2.0:

[Figure: URL filtering life cycle]

 

On average we process hundreds of millions of URLs every day. These URLs all go through the same process. First, each URL is checked for malicious content, so that an infected site is blocked before there is any risk of a user downloading the content. Secondly, the URL goes on for auto-categorisation, where the system tries to determine the content of the site using raters (those clever pieces of code that used to live locally on your CensorNet server now live in the cloud) and place the URL immediately into an appropriate category. If the raters cannot do this, or are unsure whether the result is accurate, the URL goes on for human classification.

It is not just new URLs that go on for human classification: every URL is revisited by the raters, and then by a human, after it has been in the database for a certain amount of time. This keeps the database fresh and accurate. These changes are then made available immediately in the cloud for any future lookups by your CensorNet servers.

You can also send reclassification requests to the cloud straight from the CensorNet web control panel. The Unblock Request screen now has a Reclassify button, which you can use to tell us if you think a site is in the wrong category. These requests go for human verification and are published to the cloud once the change has been actioned.
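The malware check, auto-categorisation and human-classification steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `process_url`, `Outcome`, the `is_malicious` and `rate` callables and the `min_confidence` threshold of 0.8 are all hypothetical names and values, not part of the CensorNet service.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Outcome:
    stage: str               # "blocked_malware", "auto" or "human_review"
    category: Optional[str]  # set only when the raters were confident


def process_url(
    url: str,
    is_malicious: Callable[[str], bool],
    rate: Callable[[str], Tuple[Optional[str], float]],
    min_confidence: float = 0.8,  # assumed threshold, for illustration only
) -> Outcome:
    # Step 1: malicious URLs are blocked before any categorisation happens.
    if is_malicious(url):
        return Outcome("blocked_malware", None)
    # Step 2: the raters propose a category together with a confidence score.
    category, confidence = rate(url)
    if category is not None and confidence >= min_confidence:
        return Outcome("auto", category)
    # Step 3: no result, or an uncertain one, goes on to a human.
    return Outcome("human_review", None)
```

For example, a rater that returns `("News", 0.4)` would send the URL on for human review in this sketch, because 0.4 is below the assumed threshold.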

 

1st important note about policies

As part of the development of the new service we have overhauled the old category list: we have removed old categories, merged similar categories and added new ones. When you upgrade to v2.0.x, the upgrade process will automatically map your old category selections to the new ones. It is very important that you review all your policies after upgrading to make sure that any new categories are set correctly, because the upgrade process does not set them to either allow or block.
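To make the effect of the upgrade mapping concrete, here is a minimal sketch of how old selections might carry over and which categories would then need a manual review. The function `migrate_policy`, the category names and the rule that "block" wins when merged categories disagree are all assumptions made for illustration; the actual mapping used by the upgrade is not described in these notes.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def migrate_policy(
    old_policy: Dict[str, str],
    mapping: Dict[str, Optional[str]],
    new_categories: Iterable[str],
) -> Tuple[Dict[str, str], List[str]]:
    """Map old category actions to new ones and list categories left unset."""
    migrated: Dict[str, str] = {}
    for old_category, action in old_policy.items():
        new_category = mapping.get(old_category)
        if new_category is None:
            continue  # category was removed, or has no known successor
        # Assumption: if merged categories disagree, keep the stricter "block".
        if migrated.get(new_category) == "block":
            continue
        migrated[new_category] = action
    # Anything the mapping did not set is new and needs a manual decision.
    needs_review = sorted(c for c in new_categories if c not in migrated)
    return migrated, needs_review


# Hypothetical category names, for illustration only.
policy, review = migrate_policy(
    {"Chat": "allow", "Instant Messaging": "block", "Obsolete": "block"},
    {"Chat": "Messaging", "Instant Messaging": "Messaging", "Obsolete": None},
    ["Messaging", "Social Networking"],
)
```

In this example `review` contains "Social Networking", a new category with no old equivalent, which is exactly the kind of entry you should check in each policy after upgrading.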

 

2nd important note about policies

The policy settings page has a new option called "Uncategorised sites", which is set to "Block" by default. This means that CensorNet will block access to any URL that is not classified in the database (local or cloud). It will automatically send that URL for classification, so eventually the URL will be placed in an appropriate category and then allowed or blocked according to your category selection. Initially, however, it will be blocked. This option defaults to "Block" to satisfy the requirements of BECTA accreditation, but some customers, particularly non-schools, may want to set it to "Allow" instead. You will need to do this in each of your policies.
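The behaviour of the "Uncategorised sites" option can be pictured with a small sketch. The `decide` function and its arguments are hypothetical, and the sketch assumes that every known category already has an explicit action in the policy; it is not CensorNet's actual code.

```python
from typing import Dict, Optional, Tuple


def decide(
    category: Optional[str],
    category_actions: Dict[str, str],
    uncategorised_action: str = "block",  # default described in the note above
) -> Tuple[str, bool]:
    """Return (action, send_for_classification) for one request."""
    if category is None:
        # Unknown URL: apply the policy-wide setting and queue it for
        # classification so that later requests can use its real category.
        return uncategorised_action, True
    # Assumption: the policy holds an explicit action for every category.
    return category_actions[category], False
```

With the default, `decide(None, {})` blocks the request and flags the URL for classification; passing `uncategorised_action="allow"` corresponds to changing the option in a policy.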

 

* except for sites contained in your local cache, which can take up to 24 hours to expire



Last modified on Wed, August 10, 2011