Preventing Sensitive Information from Being Published Inadvertently

You have probably heard horror stories about poorly managed servers
unintentionally exposing sensitive information on the web. Worse, many
people use Google to search for such unintended documents. See Google
Hacks at http://johnny.ihackstuff.com/.

To prevent this from happening at your site, you may want to apply the
countermeasures described here.

1) Exclude any requests to Office files

You can do this with the nimda filter, using these patterns:

*.xls (Excel files)
*.mdb (Access files)

# all filename matching is case-insensitive
proxy_nimda_enable = 1
proxy_nimda_path = *.xls
proxy_nimda_path = *.mdb

2) Filter out sensitive patterns with the regex content rewrite filter

The following are some examples I have written. They are far from
perfect, since they do not consider any context, but at least they turn
away some "searchers" who look for these patterns.

You can apply these filter rules to HTML and text files. You can also
apply them to binary or structured files such as .doc and PDF, if you
accept the risk of sending corrupted documents to users. For that
reason, you may want to restrict the URL space that these filters apply
to.
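For step 1, it can help to check which request paths the two exclusion
patterns would catch before enabling them. The following is a minimal
Python sketch, not sproxy's own matcher: it assumes shell-style globbing
in which "*" also matches "/", and the helper name is_blocked is
hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from the proxy_nimda_path lines in step 1.
BLOCKED_PATTERNS = ["*.xls", "*.mdb"]


def is_blocked(path, patterns=BLOCKED_PATTERNS):
    """Return True if path matches any pattern, ignoring case.

    Assumption: shell-style globbing where '*' also matches '/'.
    The proxy's real matcher may behave differently.
    """
    lowered = path.lower()
    return any(fnmatchcase(lowered, p.lower()) for p in patterns)


if __name__ == "__main__":
    samples = [
        "/reports/Q3.XLS",
        "/db/customers.mdb",
        "/index.html",
        "/reports/q3.xlsx",
    ]
    for path in samples:
        print(path, "->", "blocked" if is_blocked(path) else "allowed")
```

Under these assumed semantics, "*.xls" does not cover newer extensions
such as .xlsx, so additional proxy_nimda_path lines may be needed for
them.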
Masked Patterns

credit card numbers
-------------------
dddd-dddd-dddd-dddd
dddddddddddddddd (this is too broad, commented out)
dddd-dddddd-ddddd
ddddddddddddddd (this is too broad, commented out)

social security numbers
-----------------------
ddd-dd-dddd
ddddddddd (this is too broad, commented out)

phone numbers (adjust to your local convention)
-----------------------------------------------
ddd-ddd-dddd
(ddd)ddd-dddd
dddddddddd (this is too broad, commented out)

email addresses
---------------
aaa@bbb.ccc -> xxx@xxx.xxx
or, publish but protect from collectors
aaa@bbb.ccc -> aaa at nospam bbb.ccc

--- parameters to add to sproxy.conf
# all text files
proxy_filter_define = mask-info mod_filt_rwt rwtype=regex mtype="text/"
# apply to /public space (sorry, extended-url pattern cannot be used yet)
proxy_filter_assign = /public/* mask-info
--- end ---

--- Sample rewrite_regex.conf : Cut Here ---
=== credit card numbers ===
= dddd-dddd-dddd-dddd
(\D)\d{4}-\d{4}-\d{4}-\d{4}(\D)=$1dddd-dddd-dddd-dddd$2
= dddddddddddddddd
=(\D)\d{16}(\D)=$1dddddddddddddddd$2
= dddd-dddddd-ddddd
(\D)\d{4}-\d{6}-\d{5}(\D)=$1dddd-dddddd-ddddd$2
= ddddddddddddddd
=(\D)\d{15}(\D)=$1ddddddddddddddd$2
=== social security numbers ===
=ddd-dd-dddd
(\D)\d{3}-\d{2}-\d{4}(\D)=$1ddd-dd-dddd$2
=ddddddddd
=(\D)\d{9}(\D)=$1ddddddddd$2
=== phone numbers (adjust to your local convention) ===
=ddd-ddd-dddd
(\D)\d{3}-\d{3}-\d{4}(\D)=$1ddd-ddd-dddd$2
=(ddd)ddd-dddd
(\D)\(\d{3}\)\d{3}-\d{4}(\D)=$1(ddd)ddd-dddd$2
=dddddddddd
=(\D)\d{10}(\D)=$1dddddddddd$2
=== email addresses ===
= aaa@bbb.com -> aaa at nospam bbb.com
([a-zA-Z_\-]{2,32})@([a-zA-Z_\-\.]{3,64})=$1 at nospam $2
--- Cut Here ---
EOF
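To preview what the active rules in the sample rewrite_regex.conf do,
they can be tried on sample text outside the proxy. The following is a
rough Python sketch, not the filter's own engine: it assumes the filter's
regex syntax behaves like Python's re module for these patterns, writes
the $1/$2 back-references as \1/\2, and uses a hypothetical helper named
mask.

```python
import re

# (pattern, replacement) pairs transcribed from the rules that are not
# commented out. The second rule uses \d{4} for its first group so that
# it matches the dddd-dddddd-ddddd layout named in its comment.
RULES = [
    (r"(\D)\d{4}-\d{4}-\d{4}-\d{4}(\D)", r"\1dddd-dddd-dddd-dddd\2"),
    (r"(\D)\d{4}-\d{6}-\d{5}(\D)", r"\1dddd-dddddd-ddddd\2"),
    (r"(\D)\d{3}-\d{2}-\d{4}(\D)", r"\1ddd-dd-dddd\2"),
    (r"(\D)\d{3}-\d{3}-\d{4}(\D)", r"\1ddd-ddd-dddd\2"),
    (r"(\D)\(\d{3}\)\d{3}-\d{4}(\D)", r"\1(ddd)ddd-dddd\2"),
    (r"([a-zA-Z_\-]{2,32})@([a-zA-Z_\-\.]{3,64})", r"\1 at nospam \2"),
]


def mask(text, rules=RULES):
    """Apply every rule in order and return the rewritten text."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


if __name__ == "__main__":
    print(mask("Call 555-123-4567 or mail joe@example.com, SSN 123-45-6789."))
    # The (\D) groups need a character on each side, so a number at the
    # very start (or end) of the text is left untouched.
    print(mask("123-45-6789 is at the start"))
```

In this sketch a number at the very beginning or end of the text is not
masked, because each (\D) group has to match a real character. If that
matters for your documents, pad the text or adjust the patterns
accordingly.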