Quick start: ICAP server

Module start page

A proxy server is a service that allows clients to make indirect requests to other network services. The client first connects to the proxy server and requests a web resource located on another server. The proxy server then either connects to the specified server and obtains the resource from it, or returns the resource from its own cache (if another client has already accessed this resource). In some cases, the proxy server may modify the client's request or the server's response for certain purposes.

The proxy server also makes it possible to analyze client HTTP requests passing through it and to filter and log traffic by URL and MIME type. In addition, the proxy server implements a mechanism for accessing the Internet with a login/password.

The proxy server caches objects received by users from the Internet and thereby reduces traffic consumption and increases page loading speed.

When entering the module, the status of services, the “Disable” button (or “Enable” if the module is disabled) and the latest messages in the log are displayed.

Settings

Usually, to work through a proxy server, you need to specify its address and port in the browser settings. However, if user authorization by login/password is not used, then you can use the transparent proxy function.

In this case, all HTTP requests from the local network are automatically routed through the proxy server. This makes it possible to filter and record traffic by URL, regardless of the settings of client computers.

The default proxy server port is 3128; in the module settings you can change it to any free port.
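
For illustration, on a Squid-style proxy a transparent setup amounts to a dedicated intercepting port plus a firewall redirect; the port numbers and interface name below are illustrative, not ICS's actual internal configuration:

    # squid.conf: forward proxy port plus a dedicated interception port
    http_port 3128
    http_port 3129 intercept

    # firewall: redirect outbound HTTP from the LAN to the interception port
    iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 3129

When the transparent proxy option is enabled, ICS applies the equivalent configuration itself; the snippet only shows the mechanism.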

Authorization types

The ICS proxy server supports two authorization methods: by the user's IP address and by login/password.

Authorization by IP address is suitable when a user always works at the same computer. The proxy determines which user a given portion of traffic belongs to from the IP address of the computer. This method is not suitable for terminal servers, where several users work from the same IP address, or for organizations in which users frequently move between workstations. In addition, a user can change the IP address of his computer, and if IP-to-MAC binding is not configured, the ICS will mistake him for someone else.

Authorization by login/password solves the problem of tying users to a particular computer. In this case, the first time the user accesses any Internet resource, the browser prompts for a login/password for Internet access. If users on your network are authorized in a domain, you can set the authorization type to “Via Domain”. In this case, if the ICS is connected to a domain controller and users were imported from the domain, authorization is performed transparently, without requesting a login/password.

Keep in mind that proxy authorization applies only to user HTTP traffic. Internet access for programs using protocols other than HTTP is regulated by the firewall, which has only one authorization method: by IP address. In other words, a user with login/password authorization alone will not be able to use mail, a Jabber client, a torrent client or other programs that cannot work through an HTTP proxy.

Web login

To authorize users by login and password on machines where the proxy server is not configured in the browser, you can use web authorization (captive portal) by enabling the corresponding checkbox. Web authorization also allows you, for example, to integrate the authorization page into a corporate portal and use it as the login page. The default web authorization port is 82; you can change it to any free port.

To avoid configuring the proxy server manually on each client machine, you can use the auto-configurator. The “Automatic proxy configuration” option must be enabled in the client's browser; all other settings will be supplied by the ICS.

It is enabled by checking the box in the corresponding tab. You can select one or more protocols from the available ones (HTTP, HTTPS, FTP).

The option to publish the auto-configuration script determines whether it will be accessible by the server's IP address or through a virtual host with a domain name. When you select a virtual host, it is created in the system automatically. The “Create an entry on the DNS server” checkbox automatically adds a zone with the necessary records for this virtual host.

Publish auto-configuration script via DHCP: this parameter transmits the proxy settings to all DHCP clients of the server.
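
The script such an auto-configurator serves is a standard PAC (proxy auto-config) file, a small piece of JavaScript; a minimal sketch with hypothetical addresses:

    // proxy.pac: send web traffic through the proxy, local addresses go direct
    function FindProxyForURL(url, host) {
        if (isInNet(host, "192.168.0.0", "255.255.0.0"))
            return "DIRECT";                  // LAN hosts bypass the proxy
        return "PROXY 192.168.1.1:3128";      // everything else via the proxy
    }

Browsers fetch this file from the URL announced by the ICS (with DHCP publication, conventionally via DHCP option 252 pointing at the same URL) and call FindProxyForURL for every request.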

Parent proxy

If your organization has several proxy servers located hierarchically, then the proxy server superior to the ICS will be its parent proxy. In addition, any network node can act as a parent proxy.

In order for the ICS to redirect requests coming to its proxy server to the parent proxy, specify its IP address and destination port in the “Parent Proxy” tab.

Proxy servers can exchange cache data using the ICP protocol. If the network operates through several proxies, this can significantly speed things up. If the parent proxy supports the protocol, check the corresponding box and specify the service port (3130 by default).
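
In Squid-compatible configurations this pairing is expressed with a single directive; the address and ports below are illustrative:

    # parent proxy at 10.0.0.1: HTTP port 8080, ICP port 3130
    cache_peer 10.0.0.1 parent 8080 3130 default

With ICP enabled, the downstream proxy first asks the parent over the ICP port whether it already holds the requested object in its cache, and only then forwards the full HTTP request.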

Issued IP addresses

This tab contains a list of IP addresses and users who have logged into the proxy server using web authorization.

Cache Contents

Log

The “Log” tab contains a summary of all system messages from the proxy server. The log is divided into pages; the “forward” and “back” buttons let you move between pages, or you can enter a page number in the field and jump to it directly.

Log entries are highlighted in color depending on the type of message. Regular system messages are marked in white, system status messages (on/off, cache processing) are in green, errors are in red.

In the upper right corner of the module there is a search bar. You can use it to find the log entries you need.

The log always displays events for the current date. To view events on a different day, select the desired date using the calendar in the upper left corner of the module.

ICRA (Internet Content Rating Association) is a new initiative developed by an independent non-profit organization with the same name. The main goal of this initiative is to protect children from accessing prohibited content. This organization has agreements with many companies (major telecommunications and software companies) to provide more reliable protection.

ICRA provides software that allows you to check the special label returned by a site and decide whether to access that data. The software runs only on the Microsoft Windows platform, but thanks to the open specification, it is possible to create filtering software implementations for other platforms.

The goals and objectives of the organization, as well as all related documents, can be found on the ICRA website: http://www.icra.org/.

The advantages of this approach include the fact that only special software is needed to process the data and there is no need to update the address and/or category databases, since all information is transmitted by the site itself. But the disadvantage is that the site may indicate the wrong category, which will lead to incorrect provision or denial of access to data. However, this problem can be solved (and is already being solved) through the use of data verification tools, such as digital signatures, etc.

Traffic filtering in the world of Web 2.0

The massive introduction of so-called Web 2.0 technologies has greatly complicated content filtering of web traffic. Because in many cases the data is transferred separately from the page layout, there is a risk of unwanted information leaking to or from the user. When working with sites that use such technologies, a comprehensive analysis of the transmitted data must be performed, detecting the transfer of additional information and taking into account data collected at previous stages.

Currently, none of the companies producing tools for content filtering of web traffic allows for comprehensive analysis of data transmitted using AJAX technologies.

Integration with external systems

In many cases, the issue of integrating content analysis systems with other systems becomes quite pressing. In this case, content analysis systems can act as both clients and servers or in both roles at once. For these purposes, several standard protocols have been developed - Internet Content Adaptation Protocol (ICAP), Open Pluggable Edge Services (OPES). In addition, some manufacturers have created their own protocols to allow specific products to communicate with each other or with third-party software. These include Cisco Web Cache Coordination Protocol (WCCP), Check Point Content Vectoring Protocol (CVP) and others.

Some of these protocols (ICAP and OPES) are designed so that they can be used to implement not only content filtering services but also other services: translators, advertising placement, data delivery depending on its distribution policy, and so on.

ICAP protocol

Currently, the ICAP protocol is popular among authors of content filtering software and creators of software for identifying malicious content (viruses, spyware/malware). However, it is worth noting that ICAP was primarily designed to work with HTTP, which imposes many restrictions on its use with other protocols.

ICAP has been adopted by the Internet Engineering Task Force (IETF) as a standard. The protocol is defined in RFC 3507, with some additions outlined in the ICAP Extensions draft. These documents and additional information are available from the ICAP Forum server: http://www.i-cap.org.

The system architecture when using the ICAP protocol is shown in Figure 1. The ICAP client is the system through which traffic passes. The system that analyzes and processes the data is called the ICAP server. ICAP servers can in turn act as clients of other ICAP servers, which makes it possible to chain several services to process the same data.

Figure 1. Scheme of interaction between ICAP servers and clients

For interaction between the client and the server, a protocol similar to HTTP version 1.1 is used, with the same methods of encoding information. According to the ICAP standard, both outgoing (REQMOD, Request Modification) and incoming (RESPMOD, Response Modification) traffic can be processed. The decision about which of the transmitted data will be processed is made by the ICAP client; in some cases this makes full analysis of the data impossible. Client settings are entirely implementation-dependent and in many cases cannot be changed.

After receiving data from the client, the ICAP server processes it and, if necessary, modifies it. The data is then returned to the ICAP client, which passes it on to the server or the client, depending on the direction of transfer.
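
As an illustration, here is roughly what a REQMOD exchange looks like on the wire, in the format defined by RFC 3507; the host names are hypothetical and the Encapsulated offsets depend on the exact header bytes:

    REQMOD icap://icap.example.net/reqmod ICAP/1.0
    Host: icap.example.net
    Encapsulated: req-hdr=0, null-body=170

    GET /index.html HTTP/1.1
    Host: www.example.com
    User-Agent: ExampleBrowser/1.0

The server answers with an ICAP status line and, if it changed anything, the modified HTTP request; otherwise it can answer 204 (no modifications needed) and the client forwards the original request unchanged.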

ICAP is most widely used in anti-malware products because it allows these checks to be used across multiple products and is independent of the platform on which the ICAP client is running.

The disadvantages of using ICAP include the following:

  • additional network interactions between the client and the server somewhat slow down the speed of data transfer between external systems and information consumers;
  • there are checks that need to be performed not on the client but on the ICAP server, such as determining the data type. This is relevant because in many cases ICAP clients rely on the file extension or on the data type reported by the external server, which can lead to security policy violations;
  • Difficult integration with systems using protocols other than HTTP prevents the use of ICAP for deep data analysis.

OPES protocol

Unlike ICAP, the OPES protocol was developed with the characteristics of specific protocols in mind. In addition, its development took into account the shortcomings of the ICAP protocol, such as the lack of client and server authentication, the lack of authorization, and so on.

Like ICAP, OPES has been adopted as a standard by the Internet Engineering Task Force. The structure of service interaction, interaction protocol, service requirements and solutions for ensuring service security are set out in documents RFC 3752, 3835, 3836, 3837 and others. The list is regularly updated with new documents describing the application of OPES not only to the processing of Internet traffic, but also to the processing of mail traffic, and in the future, possibly, other types of protocols.

The structure of interaction between OPES servers and clients (OPES processors) is shown in Figure 2. In general terms, it is similar to the scheme of interaction between ICAP servers and clients, but there are significant differences:

  • there are requirements for the implementation of OPES clients, which makes it possible to more conveniently manage them - setting filtering policies, etc.;
  • the data consumer (user or information system) can influence the processing of the data. For example, when using automatic translators, the received data can be automatically translated into the language used by the user;
  • systems providing data can also influence the results of processing;
  • processing servers can use, in their analysis, data specific to the protocol over which the data was transmitted to the OPES client;
  • Some processing servers may obtain more sensitive data if they have a trusted relationship with the OPES client, consumers, and/or information providers.

Figure 2. Scheme of interaction between clients and OPES servers

All of the above capabilities depend solely on the configuration used when implementing the system. Due to these capabilities, using OPES is more promising and convenient than using the ICAP protocol.

Products that support OPES along with the ICAP protocol are expected to appear in the near future. A pioneer in the development and use of OPES is Secure Computing with its Webwasher product line.

Since there are currently no full-fledged implementations using OPES, it is impossible to draw final conclusions about the shortcomings of this approach, although theoretically there remains only one drawback - the increase in processing time due to the interaction between clients and OPES servers.

HTTPS and other types of encrypted traffic

Some analysts estimate that up to 50% of Internet traffic is transmitted in encrypted form. The problem of controlling encrypted traffic is now relevant for many organizations, since users can use encryption to create information leakage channels. In addition, encrypted channels can also be used by malicious code to penetrate computer systems.

There are several tasks associated with processing encrypted traffic:

  • analysis of data transmitted over encrypted channels;
  • checking certificates that are used by servers to organize encrypted channels.

The relevance of these tasks is increasing every day.

Encrypted data transmission control

Controlling the transmission of data sent over encrypted channels is probably the most important task for organizations whose employees have access to Internet resources. To implement this control, there is an approach called "Man-in-the-Middle", which can also be used by attackers to intercept data. The data processing scheme for this method is shown in Figure 3.

Figure 3. Processing of encrypted data

The data processing process is as follows:

  • a specially issued certificate is installed in the user's browser and used to establish the connection with the proxy server;
  • when the connection with the proxy server is established, the proxy uses the known certificate to decrypt the transmitted data;
  • the decrypted data is analyzed in the same way as regular HTTP traffic;
  • the proxy server establishes a connection with the server the data must be transferred to and uses that server's certificate to encrypt the channel;
  • the data returned from the server is decrypted, analyzed and transmitted to the user encrypted with the proxy server's certificate.

When using this scheme for processing encrypted data, problems may arise with confirming the authenticity of the user. In addition, work is required to install the certificate into the browsers of all users.
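
As a sketch of the preparatory step, the certificate installed into user browsers is an ordinary CA certificate; with OpenSSL it can be generated roughly as follows (the subject name and validity period are arbitrary):

    # generate a private key and a self-signed CA certificate for the proxy
    openssl req -new -newkey rsa:2048 -days 3650 -nodes -x509 \
        -subj "/CN=Corporate Proxy CA" \
        -keyout proxyCA.key -out proxyCA.crt

The proxy then uses proxyCA.key to sign, on the fly, a certificate for each site a user visits; browsers accept these certificates because proxyCA.crt is in their trust store.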

The following products for controlling the transmission of encrypted data are currently on the market: Webwasher SSL Scanner from Secure Computing, Breach View SSL, and WebCleaner.

Certificate authentication

The second task that arises when using encrypted data transmission channels is verifying the authenticity of certificates provided by the servers with which users work.

Attackers can attack information systems by creating a false DNS entry that redirects user requests not to the site they need, but to one created by the attackers themselves. With the help of such fake sites, important user data such as credit card numbers, passwords, etc. can be stolen, and malicious code can be downloaded under the guise of software updates.

To prevent such cases, there is specialized software that checks the compliance of the certificates provided by the server with the data they report.

If there is a discrepancy, the system may block access to such sites or provide access after explicit confirmation by the user. In this case, data processing is performed in almost the same way as when analyzing data transmitted over encrypted channels, only in this case it is not the data that is analyzed, but the certificate provided by the server.

Mail traffic filtering

When using email, organizations need to secure both incoming and outgoing traffic, and the tasks for the two directions are quite different. Incoming traffic must be checked for malware, phishing and spam, while outgoing mail must be checked for content that could lead to the leakage of important information, the distribution of compromising materials, and the like.

Most products on the market only provide control of incoming traffic. This is done through integration with anti-virus systems and the implementation of various protection mechanisms against spam and phishing. Many of these functions are already built into email clients, but they cannot completely solve the problem.

Protection against phishing is most often carried out by comparing received email messages with an existing database of website addresses and messages. Such databases are provided by software suppliers.

There are currently several ways to protect users from spam:

  • comparison of received messages against an existing message database. Various techniques can be used in the comparison, including genetic algorithms, which make it possible to recognize keywords even when they are distorted;
  • dynamic categorization of messages by content, which very effectively detects unwanted correspondence. To counter this method, spam distributors use messages in the form of an image with text inside and/or sets of dictionary words that create noise interfering with these systems; however, anti-spam systems capable of recognizing text inside images may appear in the future;
  • grey, white and black access lists, which describe the policy for accepting email messages from known or unknown sites. Greylisting in many cases prevents the delivery of unwanted messages because of the specific behavior of the software that sends spam (a minimal sketch of the mechanism follows this list). Blacklists can be maintained both as local databases managed by the administrator and as global databases replenished by user reports from all over the world; with global databases, however, there is a risk that entire networks, including those containing “good” mail servers, may end up in them.
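
A minimal sketch of the greylisting logic in Python; the delay threshold and the in-memory storage are illustrative choices:

    import time

    # greylist state: (sender, recipient, client_ip) -> time of first attempt
    greylist: dict[tuple[str, str, str], float] = {}

    RETRY_DELAY = 300  # seconds a sender must wait before retrying

    def check_greylist(sender: str, recipient: str, client_ip: str) -> bool:
        """Return True to accept the message, False to reject temporarily."""
        key = (sender, recipient, client_ip)
        now = time.time()
        first_seen = greylist.get(key)
        if first_seen is None:
            # unknown triplet: remember it and issue a temporary reject (SMTP 450)
            greylist[key] = now
            return False
        # legitimate MTAs retry after a delay; most spam software never does
        return now - first_seen >= RETRY_DELAY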

To combat information leaks, a variety of methods are used, based on the interception and deep analysis of messages in accordance with a complex filtering policy. This requires correctly determining file types, languages and text encodings, and performing a semantic analysis of the transmitted messages.

Another use of mail filtering systems is to create encrypted mail streams, where the system automatically signs or encrypts the message, and the data is automatically decrypted at the other end of the connection. This functionality is very convenient if you want to process all outgoing mail, but it must reach the recipient in encrypted form.

Filtering instant messages

Instant messaging tools are gradually becoming an actively used tool in many companies: they provide quick interaction with employees and/or clients. So it is natural that these tools, which among other things can become a channel for information leakage, gave rise to tools for monitoring the transmitted information.

Currently, the most commonly used instant messaging protocols are MSN (Microsoft Network), AIM (AOL Instant Messaging), Yahoo! Chat and Jabber, and their corporate counterparts: Microsoft Live Communications Server (LCS), IBM SameTime and Yahoo Corporate Messaging Server. The ICQ system, now owned by AOL and using almost the same protocol as AIM, is widespread in the CIS. All of these systems do much the same thing: they transmit messages (either through a server or directly) and files.

Now almost all systems have the ability to make calls from computer to computer and/or to regular phones, which creates certain difficulties for control systems and requires VoIP support to implement full-fledged proxy servers.

Typically, IM traffic control products are implemented as an application gateway that parses the transmitted data and blocks the transmission of prohibited data. However, there are also implementations in the form of specialized IM servers that perform the necessary checks at the server level.

The most popular functions of products for monitoring IM traffic:

  • access control using individual protocols;
  • control of clients used, etc.;
  • access control for individual users:
    • allowing the user to communicate only within the company;
    • allowing the user to communicate only with certain users outside the company;
  • control of transmitted texts;
  • file transfer control. The objects of control are:
    • file size;
    • file type and/or extension;
  • data transfer direction;
  • monitoring the presence of malicious content;
  • SPIM detection;
  • saving transmitted data for subsequent analysis.

Currently, the following products allow you to control the transmission of instant messages:

  • CipherTrust IronIM from Secure Computing. This product supports the AIM, MSN, Yahoo! Chat, Microsoft LCS and IBM SameTime protocols and is currently one of the most complete solutions;
  • IM Manager from Symantec (developed by IMLogic, which was acquired by Symantec). This product supports the following protocols - Microsoft LCS, AIM, MSN, IBM SameTime, ICQ and Yahoo! Chat;
  • Microsoft's Antigen for Instant Messaging also allows you to work with almost all popular instant messaging protocols;
  • Webwasher Instant Message Filter, from Secure Computing.

Products from other companies (ScanSafe, ContentKeeper) have fewer capabilities than those listed above. It is worth noting that two Russian companies - Grand Prix (SL-ICQ product) and Mera.ru (Sormovich product) - provide products for monitoring the transmission of messages using the ICQ protocol.

VoIP filtering

The growing popularity of means for transmitting audio information between computers (also called Voice over IP (VoIP)) makes it necessary to take measures to control the transfer of such information. There are different implementations for calling from computer to computer and/or to regular phones.

There are standardized protocols for exchanging such information, including the Session Initiation Protocol (SIP), adopted by the IETF, and H.323, developed by the ITU. These protocols are open, which makes it possible to process them.

In addition, there are protocols developed by specific companies that have no open documentation, which makes working with them very difficult. One of the most popular implementations is Skype, which has gained widespread popularity around the world. This system allows calls between computers, calls to landline and mobile phones, and receiving calls from landline and mobile numbers. The latest versions also support the exchange of video.

Most of the currently available products can be divided into two categories:

  • products that allow you to identify and block VoIP traffic;
  • products that can identify, capture and analyze VoIP traffic.

The first category includes:

  • Dolphian products, which identify and allow or deny VoIP traffic (SIP and Skype) encapsulated in standard HTTP traffic;
  • Verso Technologies products;
  • various types of firewalls that have this capability.

The second category includes:

  • the product of the Russian company Sormovich, which supports the capture, analysis and storage of voice information transmitted via the H.323 and SIP protocols;
  • the open-source library Oreka, which can isolate the signaling component of audio traffic and capture the transmitted data for later analysis by other means;
  • a product by ERA IT Solutions AG which, as recently became known, can intercept VoIP traffic transmitted via Skype; such control, however, requires a specialized client installed on the computer running Skype.

Peer-to-peer filtering

The use of various peer-to-peer (p2p) networks by employees poses the following threats to organizations:

  • distribution of malicious code;
  • information leak;
  • distribution of copyrighted data, which may result in legal action;
  • decreased labor productivity.

There are a large number of networks operating in the peer-to-peer format. Some have central servers used to coordinate users; others are completely decentralized. The latter are especially difficult to control using standard tools such as firewalls.

To solve this problem, many companies are creating products that allow them to detect and process p2p traffic. The following solutions exist for processing p2p traffic:

  • SurfControl Instant Messaging Filter, which handles p2p as well as instant messaging;
  • the Websense Enterprise package also provides users with tools to control p2p traffic;
  • Webwasher Instant Message Filter allows you to control access to various p2p networks.

The use of these or other products not listed here dramatically reduces the risks associated with user access to p2p networks.

Unified Threat Management

Solutions that comply with the Unified Threat Management concept are offered by many security vendors. As a rule, they are built on the basis of firewalls, which, in addition to the main functions, also perform content filtering functions. Typically, these features focus on preventing intrusions, malicious code, and unwanted messages.

Many of these products are implemented in the form of hardware and software solutions that cannot completely replace solutions for filtering email and Internet traffic, since they work only with a limited number of capabilities provided by specific protocols. They are typically used to avoid duplication of functionality across different products and to ensure that all application protocols are processed according to the same known threat database.

The most popular solutions of the Unified Threat Management concept are the following products:

  • SonicWall Gateway Anti-Virus, Anti-Spyware and Intrusion Prevention Service provides anti-virus and other protection for data transmitted via SMTP, POP3, IMAP, HTTP, FTP and NetBIOS, instant messaging protocols, and many streaming protocols used to transmit audio and video;
  • the ISS Proventia Network Multi-Function Security series of devices, designed as software and hardware systems, blocks malicious code, unwanted messages and intrusions. A large number of checks (including VoIP checks) are included in the delivery and can be extended by the user;
  • Secure Computing's Network Gateway Security hardware platform, in addition to protection against malicious code and unwanted messages, also has VPN support. Almost all Secure Computing solutions are combined on this platform.

There are other products, but the ones listed above are widely available.

Data interception

Lawful interception has almost always been used by intelligence agencies to collect and analyze transmitted information. However, recently the issue of data interception (not only Internet traffic, but also telephony and other types) has become very relevant in the light of the fight against terrorism. Even those states that have always been against such systems began to use them to control the transfer of information.

Since various types of data are intercepted, often transmitted over high-speed channels, the implementation of such systems requires specialized software for capturing and parsing data and separate software for analyzing the collected data. As such, software for content filtering of a particular protocol can be used.

Perhaps the most famous of these systems is the Anglo-American Echelon system, which has long been used to intercept data in the interests of various departments in the United States and England.

Among Russian products, we can mention the solutions of the Sormovich company, which allow the capture and analysis of mail, audio and Internet traffic.

Products of the company "Jet Infosystems"

Jet Infosystems has been operating in the content filtering market for more than six years. It started with a system for filtering email traffic, then developments moved to other areas of application of content filtering, and the company is not going to stop there.

Over the past six months there have been several events related to the company's content filtering products: the release of the fourth version of the Dozor-Jet mail message monitoring and archiving system (SMAP) and the start of development of the second version of the Dozor-Jet web traffic control system (SKVT). Both products have many differences and innovations compared to their previous versions.

In addition to the above products, the company has developed other software also related to the problems of content filtering - data type definition libraries and a data type definition and unpacking module for Lotus/Cerberus.

SMAP "Dozor-Jet"

In the fourth version of the Dozor-Jet SMAP, new features have been implemented that provide a higher level of mail flow filtering. The changes can be divided into several sections:

  • general changes;
  • changes in the filtration subsystem;
  • changes in the control subsystem;
  • changes in expansion modules.

Some changes radically distinguish this product from those offered by other companies. This will be discussed in the relevant sections.

The transition to a new version is carried out without losing the established filtering policies - data migration tools and documentation on their use have been developed for this purpose.

Additional information about the Dozor-Jet SMAP can be found on the Jet Infosystems product website (http://www.jetsoft.ru/) or in other issues of the Jet Info bulletin.

General changes

General changes include those that affect all parts of the system, for example:

  • Unicode support: all parts of the system use Unicode as the internal data encoding. This change allows the system to be used in multilingual environments and to support different languages for both mail processing and the user interface, with instant switching of the control subsystem language. Currently, Russian, English and Japanese are supported;
  • the database schema was redesigned, which made it possible to increase the speed of adding messages to the archive and speed up searches in the archive;
  • sending messages is separated into a separate subsystem, this has increased the reliability of message processing and simplified integration with external mail systems;
  • The system now comes with standard policies that contain the most frequently used conditions, flags and other filtering policy objects;
  • Oracle 10g and PostgreSQL 8.x are used as databases, which made it possible to increase storage volumes without significantly changing the requirements for database servers. In addition, work is currently underway on a module for interacting with Microsoft SQL Server for enterprises that do not use the Oracle DBMS.

These changes significantly influenced the system architecture and eliminated most of the limitations and shortcomings that existed in previous versions.

Changes in the filtering subsystem

Changes to the filtration subsystem had a significant impact on system performance. These include:

  • The new letter parser implements mechanisms for "lazy" unpacking of letters and objects. This mechanism significantly increases performance, since a message and its component objects are unpacked only when a filtering condition actually requires it (checking the file type, the presence of text in the file, the presence of unpacking errors); a minimal sketch of the idea follows this list.
  • The new type detection system allows you to very accurately determine the types of transmitted data and select the appropriate handlers.
  • The new system for determining languages ​​and encodings correctly determines the encoding and language of text objects and converts them into the internal representation of the filtering subsystem - Unicode.
  • In the new version, marks are tied not to the message as a whole, as before, but to individual message objects, which makes it possible to create more complex filtering policies: for example, checking whether all files are encrypted, or determining which file caused an unpacking error.
  • New filtering conditions have appeared in the filtration subsystem:
    • condition for checking the time of day - makes it possible to implement delayed delivery of messages, for some of their types, for example, containing files with audio and video information;
    • a condition to check the day of the week can be used to detect unusual activity on non-working days.
  • New actions were also implemented:
    • delayed delivery of letters - usually used in conjunction with other conditions, such as the size of the letter or the time of sending, and ensures that letters are sent after a specified time;
    • Priority mail delivery ensures expedited delivery of certain types of mail.
  • New unpackers and converters:
    • added support for 7zip, deb, rpm and cpio archives;
    • a text file analyzer has been added, which allows you to extract binary data encoded using base64, uuencode and quoted-printable from text. This utility correctly handles incorrectly forwarded letters and those cases when users try to encode forwarded data to deceive the system;
    • rar archives attached to files of other types (MS Word, tiff, jpeg and others) are processed correctly;
    • text comments are extracted from all supported archive types and from mp3 files.
  • Anti-virus support is now implemented only via the ICAP protocol, which allows the use of the following antiviruses: Symantec, Trend Micro, DrWeb, ClamAV (via the c-icap program), Kaspersky ICAP Server and others with ICAP support.
  • It has become possible to connect message pre-processing modules, and now third-party anti-spam systems can be used to process messages.
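
A minimal sketch of the lazy-unpacking idea mentioned at the top of this list, in Python; the class and the example condition are hypothetical:

    class LazyPart:
        """A message part whose content is unpacked only on first access."""

        def __init__(self, raw_bytes: bytes, unpack_fn):
            self.size = len(raw_bytes)    # cheap metadata, no unpacking needed
            self._raw = raw_bytes
            self._unpack_fn = unpack_fn   # e.g. an archive extractor
            self._cache = None

        def content(self) -> bytes:
            # unpacking happens here, once, when a filtering condition
            # (file type check, text search, ...) first needs the content
            if self._cache is None:
                self._cache = self._unpack_fn(self._raw)
            return self._cache

    # a policy that only checks the size never pays the unpacking cost
    part = LazyPart(b"...archive bytes...", unpack_fn=lambda raw: raw)
    if part.size > 10_000_000:
        pass  # e.g. delay delivery of oversized attachments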

Changes in the control subsystem

The system management interface has undergone dramatic changes. It received a new design, implemented using Ajax technology, which made the system noticeably more responsive to user actions. A general view of the new user interface is shown in Figure 4.

Figure 4. General view of the system user interface

  • The mail message archive segmentation module has been completely redesigned to provide better segment handling and enable automated segment management.
  • The reconstruction module has been rewritten to use the new features: parts of letters can now be deleted based not only on the type of the part, but also on the marks set on it.
  • The full-text archive search module is now included in the basic system delivery.
  • The digital signature support module provides verification of a letter's digital signature, signing of a letter, and encryption/decryption of a letter. Various encryption algorithms are supported, including GOST.
  • SKVT "Dozor-Jet"

    The web traffic control system (SKVT) "Dozor-Jet" is a relatively new product of the company, but it has already proven itself well among users. A year and a half has passed since its release, and active development of the second version of the Dozor-Jet SKVT is currently underway. The following changes are planned:

    • The user interface has been radically redesigned:
      • Ajax technology is used to speed up the work of users, the interface becomes more similar to the interface of the 4th version of the Dozor-Jet SMAP;
      • the user interface supports working with different languages (currently Russian, English and Japanese);
      • management of auxiliary utilities (downloading data from the database, backup, etc.) is carried out through a web interface with the ability to configure work according to a schedule.
    • In the filtration subsystem:
      • added filtering by any of the request or response headers;
      • added filtering by HTTP protocol commands;
      • a user can be authenticated using several criteria - name/password, IP address, MAC address;
      • To determine a site's category, both external databases of site categories and semantic analysis of site content are used. The system supports site category databases from NetStar and ISS/Cobion;
      • support for the ICAP protocol has been implemented for interaction with anti-virus software;
      • External converters can be used to analyze text in outgoing documents;
      • Implemented notification to the administrator when specified conditions are triggered;
      • POST requests can be saved to a database and can be analyzed later.
    • The reporting subsystem has been greatly redesigned:
      • new standard reports have been added - Top-N users by traffic, Top-N most visited sites, Top-N most actively used file formats and others;
      • implemented the ability to automatically generate reports on a schedule;
      • output results in various formats - HTML with images, PDF, CSV.

    The release of the second version of the Dozor-Jet SKVT on the Russian market is planned for the first quarter of 2007.

    Firewall Z-2

    The Z-2 firewall can be classified as a tool that implements the UTM concept. The system includes basic means of content filtering of transmitted data, including anti-virus scanning for all protocols for which protocol analyzers exist.

    The new version implements anti-virus scanning of transmitted data via ICAP in HTTP, FTP, SMTP and POP3 gateways, which allows easy integration with a number of popular AV solutions.

    The SMTP protocol gateway supports the SPF protocol and the graylisting mechanism. In combination with other SMTP stream processing capabilities, this can significantly reduce the number of unwanted messages before they are processed by content filtering tools, reducing the load on them.

    Additional features include limiting bandwidth based on content type in the HTTP gateway.

    Type definition system

    Determining data types plays an important role in the development of content filtering products. Many companies use the widely known file utility, familiar to users of UNIX-like operating systems, for this purpose. However, that utility has numerous shortcomings that lead to frequent type detection errors, so the company has developed its own type detection system, which performs this operation much more accurately.

    The new data type definition system has the following capabilities:

    • a specialized language for describing data type checks makes it possible to implement very complex checks. It is a full-fledged programming language with the following features:
      • supported data types: numbers (big- and little-endian), strings, characters, lists;
      • support for many operations - comparisons, arithmetic, bitwise, logical;
      • direct and indirect addressing of the data being checked is supported, which allows you to analyze information based on information read at previous stages of analysis;
      • support for conditional statements allows you to make conditions more flexible;
      • formatted output allows you to control the output of results;
      • the ability to expand the checking language allows you to analyze even very complex structures.
    • It is possible to connect additional analysis modules. At the moment, the following additional analysis modules exist:
      • module for determining types for OLE files - Microsoft Visio, Project, Word, Excel, PowerPoint;
      • module for determining text and methods for encoding it;
      • module for detecting MS-DOS executable files (.com files).
    • Explicit mapping of signatures to mime types avoids duplication of information (which is present in the standard file utility).
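
    For illustration, the core signature pass of such a system can be sketched in Python; the table below is a tiny illustrative subset, though the magic numbers shown are the real ones for these formats:

        # (offset, signature bytes, mime type)
        SIGNATURES = [
            (0, b"%PDF",              "application/pdf"),
            (0, b"PK\x03\x04",        "application/zip"),
            (0, b"\x89PNG\r\n\x1a\n", "image/png"),
            (0, b"\xff\xd8\xff",      "image/jpeg"),
            (0, b"\xd0\xcf\x11\xe0",  "application/x-ole-storage"),  # OLE container
        ]

        def detect_mime(data: bytes) -> str:
            """Return a mime type based on magic-byte signatures."""
            for offset, magic, mime in SIGNATURES:
                if data[offset:offset + len(magic)] == magic:
                    return mime
            return "application/octet-stream"  # unknown type

        print(detect_mime(b"%PDF-1.4 ..."))  # application/pdf

    The additional analysis modules described above (OLE subtypes, text encodings, MS-DOS executables) are layered on top of exactly this kind of signature pass.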

    Data type detection and unpacking module for Lotus/Cerberus

    This module is an addition to the Cerberus software for Lotus Domino, which analyzes messages transmitted within the Lotus Domino system. Cerberus itself has little ability to determine the types of transmitted data, and this module was developed to solve that problem.

    The module allows you to perform the following tasks:

    • maintain a list of permitted and prohibited data types;
    • for allowed data types, you can specify an additional command for processing files of that type, for example, unpacking an archive or extracting text from Microsoft Word files.

    The module operates under the Microsoft Windows operating system and is currently being successfully used in one of the largest Russian banks.

    Conclusion

    The development of information systems leads to the emergence of more and more new threats. Therefore, the development of content filtering products not only keeps up, but sometimes even anticipates the emergence of new threats, reducing risks for the protected information systems.

    The Internet Content Adaptation Protocol (RFC 3507, subject to errata) specifies how an HTTP proxy (an ICAP client) can outsource content adaptation to an external ICAP server. Most popular proxies, including Squid, support ICAP. If your adaptation algorithm resides in an ICAP server, it will be able to work in a variety of environments and will not depend on a single proxy project or vendor. No proxy code modifications are necessary for most content adaptations using ICAP.

      Pros: Proxy-independent, adaptation-focused API, no Squid modifications, supports remote adaptation servers, scalable. Cons: Communication delays, protocol functionality limitations, needs a stand-alone ICAP server process or box.

    One proxy may access many ICAP servers, and one ICAP server may be accessed by many proxies. An ICAP server may reside on the same physical machine as Squid or run on a remote host. Depending on configuration and context, some ICAP failures can be bypassed, making them invisible to proxy end-users.

    ICAP Servers

    While writing yet another ICAP server from scratch is always a possibility, the following ICAP servers can be modified to support the adaptations you need. Some ICAP servers even accept custom adaptation modules or plugins.

      Traffic Spicer (C++)

      POESIA (Java)

      (Java and Javascript)

      original reference implementation by Network Appliance.

    The above list is not comprehensive and is not meant as an endorsement. Any ICAP server will have a unique set of pros and cons in the context of your adaptation project.

    More information about ICAP is available on the ICAP Forum. While the Forum site has not been actively maintained, its members-only newsgroup is still a good place to discuss ICAP issues.

    Squid Details

    Squid-3.0 and later come with integrated ICAP support. Pre-cache REQMOD and RESPMOD vectoring points are supported, including request satisfaction. Squid-2 has limited ICAP support via a set of poorly maintained and very buggy patches. It is worth noting that the Squid developers no longer officially support the Squid-2 ICAP work.

    Squid supports receiving 204 (no modification) responses from ICAP servers. A server typically responds with 204 when it does not want to modify the HTTP message; this saves bandwidth, since the server does not have to send the HTTP message back to Squid exactly as it was received. There are, however, two situations where Squid will not accept a 204 response:

    • The size of the payload is greater than 64kb.
    • The size of the payload cannot be (easily) assured.

    The reason for this is simple: if the server is to be able to respond with a 204, Squid needs to keep a copy of the original HTTP message in memory. The two requirements above are a basic optimization that limits Squid's memory usage in supporting 204s.
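
    For illustration, a 204 exchange looks roughly like this on the wire (host names hypothetical, Encapsulated offsets dependent on the actual header bytes):

        RESPMOD icap://icap.example.net/response ICAP/1.0
        Host: icap.example.net
        Allow: 204
        Encapsulated: res-hdr=0, res-body=137

        [encapsulated HTTP response headers and body follow]

        ICAP/1.0 204 No modifications needed

    The Allow: 204 header is the client's promise that it has kept the original message, which is exactly why the size limits above exist.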

    Squid Configuration

    Squid 3.1

    The following example instructs Squid-3.1 to talk to two ICAP services, one for request and one for response adaptation:

    icap_enable on

    icap_service service_req reqmod_precache bypass=1 icap://127.0.0.1:1344/request
    adaptation_access service_req allow all

    icap_service service_resp respmod_precache bypass=0 icap://127.0.0.1:1344/response
    adaptation_access service_resp allow all

      adaptation_access

      adaptation_service_set

      icap_client_username_encode

      icap_client_username_header

      icap_connect_timeout

      icap_default_options_ttl

      icap_enable

      icap_io_timeout

      icap_persistent_connections

      icap_preview_enable

      icap_preview_size

      icap_send_client_ip

      icap_send_client_username

      icap_service

      icap_service_failure_limit

      icap_service_revival_delay

    Squid 3.0

    The following example instructs Squid-3.0 to talk to two ICAP services, one for request and one for response adaptation:

    icap_enable on

    icap_service service_req reqmod_precache 1 icap://127.0.0.1:1344/request
    icap_class class_req service_req
    icap_access class_req allow all

    icap_service service_resp respmod_precache 0 icap://127.0.0.1:1344/response
    icap_class class_resp service_resp
    icap_access class_resp allow all

    There are other options that control various aspects of ICAP behavior; see the icap_* and adaptation_* directives listed above.

    For correct integration of the system, the organization's proxy server must also be configured. The general requirement is to point the proxy server at the IP address of the SecureTower ICAP server. In addition, the proxy server's ICAP module must be configured so that requests sent to the ICAP server include the X-Client-IP header field containing the user's IP address. Requests without this IP address will be accepted but not serviced by the ICAP server.

    Among others, SecureTower supports integration with the most popular proxy servers SQUID and MS Forefront.

    SQUID

    The SecureTower system supports SQUID versions 3.0 and later. When installing/compiling the proxy server, you must enable ICAP support and specify the following options in the ICAP settings:

    • icap_enable on
    • icap_send_client_ip on (passes the client's IP address to the ICAP server)
    • icap_service service_req reqmod_precache 0 icap://192.168.45.1:1344/reqmod, where 192.168.45.1 is the IP address of the SecureTower ICAP server
    • adaptation_access service_req allow all

    MS Forefront

    To work in networks built on the TMG Forefront proxy server, you must additionally install an ICAP plugin, because ICAP is not supported by this proxy server by default. The plugin is available at http://www.collectivesoftware.com/solutions/content-filtering/icapclient.

    In the ICAP plugin settings, specify the address of the SecureTower ICAP server. As a result, all data transferred over the HTTP(S) protocol through the MS Forefront proxy server will be saved by the SecureTower ICAP server.

    Minimum system requirements for ICAP server

    • Processor: 2 GHz or higher, 2 cores or more
    • Network adapter: 100 Mbit/1 Gbit
    • RAM: at least 6 GB
    • Hard disk: a 100 GB partition for the operating system and SecureTower files; a second partition for storing intercepted data, at the rate of 1.5 GB per monitored user per month, plus 3% of the intercepted data volume for search index files
    • .NET Framework: 4.7 or higher
    • Operating system: Microsoft Windows Server 2008R2/2012/2016 x64

    Currently, content filtering cannot be singled out as a separate area of computer security, so tightly is it intertwined with the other areas. In ensuring computer security, content filtering is very important because it makes it possible to identify potentially dangerous content and process it correctly. Approaches that emerged from the development of content filtering products are used in products for intrusion detection and prevention (IDS/IPS), for countering the spread of malicious code, and against other negative activities.

    Based on new technologies and products in the field of content filtering, additional services are created for users, the quality of protection is improved and the ability is provided not only to process existing threats, but also to prevent entire classes of new threats.

    New trends in content filtering

    One of the general trends in the development of information security products is the desire to implement various functions in a single device or software solution. As a rule, developers try to implement solutions that, in addition to content filtering, also perform the functions of an antivirus, a firewall and/or an intrusion detection and prevention system. On the one hand, this allows companies to reduce the costs of purchasing and maintaining security systems, but on the other hand, the functionality of such systems is often limited. For example, in many products web traffic filtering is limited to checking site addresses against some database of site categories.

    This area also includes the development of products in accordance with the Unified Threat Management (UTM) concept, which provides a unified approach to threat prevention regardless of the protocol or data being processed.

    This approach allows you to avoid duplication of protection functions, as well as ensure that data describing threats is up-to-date for all monitored resources.

    In the areas of content filtering that have existed for quite some time, the control of mail and Internet traffic, changes are also taking place and new technologies are emerging.

    In products for monitoring email traffic, the anti-phishing feature has begun to come to the fore. And in products for monitoring Internet traffic, there is a shift from using pre-prepared address databases to categorization by content, which is a very important task when working with a variety of portal solutions.

    In addition to the two areas mentioned above, new areas of application of content filtering are emerging - some time ago, products began to appear to monitor the transfer of instant messages (instant messaging) and peer-to-peer (p2p) connections. Currently, products for monitoring VoIP traffic are also being actively developed.

    Many countries have actively begun to develop means for intercepting and analyzing many types of information used in various kinds of investigations (lawful interception). This work is conducted at the state level and is most often associated with the investigation of terrorist threats. Such systems intercept and analyze not only data transmitted via the Internet, but also data carried by other types of communication: telephone lines, radio channels and so on. The best-known interception system is Echelon, a system used by American intelligence to collect information. In Russia, there are also various implementations of the system of operational investigative measures (SORM), which are used to capture and analyze information in the interests of the intelligence services.

    One of the trends in the market for content filtering products is the massive consolidation of the companies producing such solutions. Although this trend largely reflects the organizational side of the process, it can lead to new products and directions for companies that previously did not have them, or for which they occupied only a small part of the market. The following mergers/acquisitions illustrate the point:

    • Secure Computing, which last year bought Cyberguard, which has a good range of Internet traffic filtering tools, merged in the summer with another company, CipherTrust, which has extensive experience in developing tools for filtering email traffic;
    • MailFrontier, which produced tools for protecting email traffic, was absorbed by SonicWall, which previously had no solutions of this quality of its own;
    • at the end of July 2006, SurfControl, known for its solutions in the field of content filtering, bought BlackSpider, which provided advanced computer security services;
    • at the end of August 2006, the most ambitious takeover took place: Internet Security Systems (ISS) signed a merger agreement with IBM. This merger is an example of the strong interest in information security among large software companies;
    • in January 2007, Cisco acquired IronPort, which has a strong line of email security products;
    • over the past few years, Microsoft has acquired several information security companies. The largest of these acquisitions was Sybari, with its line of products for protection against viruses and other malicious code, as well as content filtering tools for email and instant messages. The acquisition of Sybari and other companies allows Microsoft to compete successfully in the computer security market, which is new to it.

    It is also worth noting that in recent years open source products for content filtering have begun to appear. In most cases they do not reach the functionality of commercial applications, but there are specific solutions and applications where they can pose a real threat.

    Modern threats

    Modern IT infrastructure is subject to many attacks, targeting both ordinary users and companies, regardless of their size. The most relevant types of threats are:

    • Phishing: recently widespread methods of intercepting important user data (passwords, credit card numbers, etc.) using social engineering techniques, in which a false letter or message purporting to come from a particular organization tries to force the user to enter certain data on a site controlled by the attacker;
    • Spyware and malware: various tools that allow data to be intercepted or control over a computer to be seized. There are many types of such tools, varying in the degree of danger to the computer: from simply displaying advertising messages to intercepting user-entered data and seizing control of computer operations;
    • viruses and other malicious code: viruses, worms and Trojans are a long-known threat to IT infrastructure. But every year new modifications of malicious code appear, often exploiting vulnerabilities in existing software that let them spread automatically;
    • SPAM/SPIM: unsolicited messages sent via email (SPAM) or instant messaging (SPIM) force users to waste time processing unsolicited correspondence. Currently, SPAM accounts for more than 70% of all transmitted email messages;
    • attacks on infrastructure: a company's IT infrastructure is very important, and attacks aimed at disabling it are extremely dangerous. They can involve entire networks of computers infected with a virus used to seize control. For example, some time ago a virus circulated that contained code designed to launch a distributed attack on Microsoft websites at a certain time in order to disable them. Several million computers were infected, and only an error in the virus code prevented the planned attack;
    • business information leaks: preventing such leaks is one of the main tasks of content filtering products. A leak of important information can cause a company irreparable damage, sometimes comparable to the loss of fixed assets. Therefore, many products include tools for detecting covert data transmission channels, such as steganography;
    • threat of prosecution: this type of threat is extremely relevant for companies whose employees can use file-sharing networks to download and/or distribute music, films and other copyrighted content. Prosecution is also possible for the dissemination of libelous and/or defamatory information concerning third parties.

    The first five types of threats affect both home computers and computers on corporate networks; the last two are especially relevant for companies of all types.

    Web traffic filtering

    The field of Internet traffic filtering has been changing noticeably in recent years, driven both by new filtering technologies and by changes in the technologies used to build websites.

    One of the most important trends in the development of content filtering products in terms of monitoring Internet traffic is the transition from using databases of site categories to determining the category of a site by its content. This has become especially relevant with the development of various portals, which may contain content of different categories, changing over time and/or adjusted to the client’s settings.

    Technologies and tools for building Internet sites that have recently become popular, such as Ajax, Macromedia Flash and others, require changes in Internet traffic filtering technologies.

    The use of encrypted channels for interaction with Internet sites protects data from interception by third parties, but at the same time these channels can be used to leak important information or to deliver malicious code into computer systems.

    The problem of integrating security tools with systems that ensure the functioning of the IT infrastructure, such as proxy servers, web servers, mail servers, directory servers, etc. remains relevant. Various companies and non-profit organizations are developing protocols for interaction between different systems.

    The current state of affairs in this area is discussed below.

    Approaches to categorizing sites and data

    There are several approaches to determining the category of sites and data:

    • using predefined databases of site categories, with regularly updated lists of sites and their categories;
    • categorizing data on the fly by analyzing page content;
    • using category information reported by the site itself.

    Each of these methods has its own advantages and disadvantages.

    Predefined databases of site categories

    Using pre-prepared databases of website addresses and associated categories is a long-used and well-proven method. Currently, such databases are provided by many companies, such as Websense, SurfControl, ISS/Cobion, Secure Computing, Astaro AG, NetStar and others. Some companies use these databases only in their own products, while others allow them to be connected to third-party products. The databases provided by Websense, Secure Computing, SurfControl and ISS/Cobion are considered the most complete; they contain information about millions of sites in different languages and in different countries, which is especially important in the era of globalization.

    Data categorization and the formation of category databases are usually carried out semi-automatically: first, content analysis and category determination are performed using specially developed tools, which may even include systems for recognizing text in images; at the second stage, the results are often checked by people, who decide which category a particular site should be assigned to.

    Many companies automatically replenish the category database based on the results of work with clients if a site is discovered that has not yet been assigned to any of the categories.

    Currently, two methods are used to connect predefined databases of site categories:

    • using a local category database with regular updates. This method is very convenient for large organizations with dedicated filtering servers that serve a large number of requests;
    • using a category database hosted on a remote server. This method is often used in various devices - small firewalls, ADSL modems, etc. Using a remote category database slightly increases the load on the channels but ensures that an up-to-date database is always used. A minimal lookup sketch combining both options follows this list.
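
    Below is a minimal, hypothetical sketch of such a lookup in Python: a local dictionary stands in for the regularly updated database, and a remote service is queried when the host is unknown. The service URL and its JSON response format are illustrative assumptions, not a real vendor API.

        import json
        import urllib.request
        from urllib.parse import urlparse

        LOCAL_CATEGORIES = {                  # regularly updated local database
            "example-casino.com": "gambling",
            "example-news.com": "news",
        }

        REMOTE_SERVICE = "https://categorizer.example.net/lookup?host="  # assumed endpoint

        def categorize(url: str) -> str:
            host = urlparse(url).hostname or ""
            # Fast path: the local database, as on dedicated filtering servers.
            if host in LOCAL_CATEGORIES:
                return LOCAL_CATEGORIES[host]
            # Slow path: ask the remote database, as small devices typically do.
            try:
                with urllib.request.urlopen(REMOTE_SERVICE + host, timeout=5) as resp:
                    return json.load(resp).get("category", "unknown")
            except OSError:
                return "unknown"              # fail open or closed, per policy

        print(categorize("http://example-casino.com/promo"))   # -> gambling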

    The advantage of using predefined category databases is that access is granted or denied at the stage when the client issues the request, which can significantly reduce the load on data transmission channels. The main disadvantage of this approach is the delay in updating the category databases, since analysis takes time. In addition, some sites change their content quite often, so the category information stored in the address database becomes stale. Some sites may also serve different information depending on the username, geographic region, time of day, etc.

    Categorizing data on the fly

    One of the simplest ways to implement such a solution is to use Bayesian algorithms, which have proven themselves well in the fight against spam. However, this option has its drawbacks: the classifier must be periodically retrained and its dictionaries adjusted to the data being transmitted. Therefore, some companies supplement simple methods with more complex algorithms for determining a site's category from its content. For example, ContentWatch provides a special library that analyzes data using linguistic information about a particular language and, on that basis, can determine the category of the data.
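
    To make the Bayesian approach concrete, here is a toy Python sketch of a naive Bayes page categorizer. The per-category word counts are invented stand-ins for the dictionaries that a real product would train on large corpora and retrain periodically, as noted above.

        import math
        import re
        from collections import Counter

        # Toy "dictionaries" standing in for trained per-category word counts.
        TRAINING = {
            "gambling": Counter({"casino": 50, "poker": 40, "bet": 30, "win": 20}),
            "news":     Counter({"report": 40, "election": 30, "market": 25, "win": 5}),
        }

        def category_of(text: str) -> str:
            words = re.findall(r"[a-z]+", text.lower())
            best, best_score = "unknown", float("-inf")
            for cat, counts in TRAINING.items():
                total, vocab = sum(counts.values()), len(counts)
                # Uniform prior over categories; Laplace smoothing for unseen words.
                score = sum(math.log((counts[w] + 1) / (total + vocab)) for w in words)
                if score > best_score:
                    best, best_score = cat, score
            return best

        print(category_of("Play poker and win at our casino"))  # -> gambling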

    Categorizing data on the fly makes it possible to react quickly to the appearance of new sites, since the category information does not depend on a site's address, only on its content. But this approach also has disadvantages: all transmitted data must be analyzed, which slightly reduces system performance, and up-to-date category dictionaries must be maintained for different languages. Some products therefore take this approach while simultaneously using site category databases; examples include the Virtual Control Agent in SurfControl products and the category-detection mechanisms in the Dozor-Jet SKVT.

    Category data provided by sites

    In addition to address databases and on-the-fly content categorization, there is another approach to determining the category of sites - the site itself reports which category it belongs to.

    This approach is primarily intended for use by home users, where, for example, parents or teachers can set filtering policies and/or monitor which sites are visited.

    There are several ways to implement this approach to resource categorization:

    • PICS (Platform for Internet Content Selection) is a specification developed by the W3 consortium about ten years ago; it has various extensions aimed at ensuring the reliability of the rating system. Enforcement can be performed by specially developed software, available for download from the project page. More detailed information about PICS can be found on the W3.org consortium website (http://www.w3.org/PICS/).
    • ICRA (Internet Content Rating Association) is a newer initiative developed by an independent non-profit organization of the same name. The main goal of this initiative is to protect children from accessing prohibited content. The organization has agreements with many companies (major telecommunications and software companies) to provide more reliable protection.
      ICRA provides software that checks the special label returned by a site and decides whether to grant access to the data. The software runs only on the Microsoft Windows platform, but thanks to the open specification it is possible to create filtering implementations for other platforms. The goals and objectives of this organization, as well as all the necessary documents, can be found on the ICRA website (http://www.icra.org/).

    The advantage of this approach is that only special software is needed to process the data, and no address or category databases need to be updated, since all the information is provided by the site itself. The disadvantage is that a site may declare the wrong category, which leads to access being wrongly granted or denied. This problem can, however, be solved (and is already being solved) by using data verification tools, such as digital signatures, etc.
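
    As an illustration, here is a small Python sketch that reads a self-declared label from a page's HTML. PICS labels are commonly embedded in a <meta http-equiv="PICS-Label"> tag; the rating vocabulary in the example is illustrative only.

        from html.parser import HTMLParser

        class PICSLabelParser(HTMLParser):
            def __init__(self):
                super().__init__()
                self.label = None

            def handle_starttag(self, tag, attrs):
                a = dict(attrs)
                if tag == "meta" and a.get("http-equiv", "").lower() == "pics-label":
                    self.label = a.get("content")

        page = '''<html><head>
        <meta http-equiv="PICS-Label" content='(PICS-1.1
         "http://www.icra.org/ratingsv02.html" l r (nz 1 vz 1))'>
        </head><body>...</body></html>'''

        parser = PICSLabelParser()
        parser.feed(page)
        print(parser.label)   # raw label; a filter would parse it and apply policy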

    Traffic filtering in the world of Web 2.0

    The massive introduction of so-called Web 2.0 technologies has greatly complicated the content filtering of web traffic. Since data is in many cases transferred separately from the page layout, there is a risk of unwanted information flowing to or from the user. When working with sites that use such technologies, all transmitted data must be analyzed comprehensively, detecting the transfer of additional information and taking into account the data collected at previous stages.

    Currently, none of the companies producing tools for content filtering of web traffic allows for comprehensive analysis of data transmitted using AJAX technologies.

    Integration with external systems

    In many cases, the issue of integrating content analysis systems with other systems becomes quite pressing. In this case, content analysis systems can act as both clients and servers or in both roles at once. For these purposes, several standard protocols have been developed - Internet Content Adaptation Protocol (ICAP), Open Pluggable Edge Services (OPES). In addition, some manufacturers have created their own protocols to allow specific products to communicate with each other or with third-party software. These include Cisco Web Cache Coordination Protocol (WCCP), Check Point Content Vectoring Protocol (CVP) and others.

    Some of these protocols - ICAP and OPES - are designed so that they can be used to implement not only content filtering services but also other services: translators, advertising placement, data delivery governed by distribution policies, and so on.

    ICAP protocol

    Currently, the ICAP protocol is popular among authors of content filtering software and creators of software for identifying malicious content (viruses, spyware/malware). However, it is worth noting that ICAP was primarily designed to work with HTTP, which imposes many restrictions on its use with other protocols.

    ICAP has been adopted by the Internet Engineering Task Force (IETF) as a standard. The protocol itself is defined by RFC 3507 with some additions outlined in the ICAP Extensions draft. These documents and additional information are available from the ICAP Forum server - http://www.i-cap.org.

    The system architecture when using the ICAP protocol is shown in the figure above. The ICAP client is the system through which traffic is transmitted. The system that performs data analysis and processing is called an ICAP server. ICAP servers can act as clients for other servers, which makes it possible to connect several services to collectively process the same data.

    For interaction between the client and the server, a protocol similar to HTTP version 1.1 is used, and the same methods of encoding information are used. According to the ICAP standard, it can process both outgoing (REQMOD - Request Modification) and incoming (RESPMOD - Response Modification) traffic.
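
    For illustration, here is a minimal ICAP client sketch in Python that sends an OPTIONS request to an ICAP service (ICAP uses TCP port 1344 by default, per RFC 3507). The host name and the service name are assumptions made for the example.

        import socket

        ICAP_HOST, ICAP_PORT = "icap.example.net", 1344
        SERVICE = "reqmod"                      # hypothetical service name

        request = (
            f"OPTIONS icap://{ICAP_HOST}/{SERVICE} ICAP/1.0\r\n"
            f"Host: {ICAP_HOST}\r\n"
            "Encapsulated: null-body=0\r\n"
            "\r\n"
        ).encode("ascii")

        with socket.create_connection((ICAP_HOST, ICAP_PORT), timeout=5) as s:
            s.sendall(request)
            reply = s.recv(4096).decode("ascii", "replace")

        # The server answers with an HTTP-like status line such as
        # "ICAP/1.0 200 OK", plus headers like Methods: and Preview:.
        print(reply.splitlines()[0])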

    The decision about which of the transmitted data will be processed is made by the ICAP client; in some cases this makes full analysis of the data impossible. Client settings are entirely implementation-dependent, and in many cases cannot be changed.

    After receiving data from the client, the ICAP server processes it and, if necessary, modifies it. The data is then returned to the ICAP client, which passes it on to the server or the client, depending on the direction in which it was being transferred.

    ICAP is most widely used in anti-malware products because it allows these checks to be used across multiple products and is independent of the platform on which the ICAP client is running.

    The disadvantages of using ICAP include the following:

    • additional network interactions between the client and the server somewhat reduce the speed of data transfer between external systems and information consumers;
    • some checks need to be performed not on the client but on the ICAP server, such as determining the actual data type. This matters because ICAP clients often rely on the file extension or the data type reported by the external server, which can lead to security policy violations;
    • difficult integration with systems using protocols other than HTTP prevents the use of ICAP for deep data analysis.

    OPES protocol

    Unlike ICAP, the OPES protocol was developed taking into account the characteristics of specific protocols. In addition, its development took into account the shortcomings of the ICAP protocol, such as the lack of authentication of clients and servers.

    Like ICAP, OPES has been adopted as a standard by the Internet Engineering Task Force. The structure of service interaction, interaction protocol, service requirements and solutions for ensuring service security are set out in documents RFC 3752, 3835, 3836, 3837 and others. The list is regularly updated with new documents describing the application of OPES not only to the processing of Internet traffic, but also to the processing of mail traffic, and in the future, possibly, other types of protocols.

    The structure of interaction between OPES servers and clients (OPES Processor) is shown in the figure. In general terms, it is similar to the scheme of interaction between ICAP servers and clients, but there are significant differences:

    • requirements are imposed on the implementation of OPES clients, which makes them more convenient to manage - setting filtering policies, etc.;
    • the data consumer (a user or an information system) can influence the processing of the data. For example, when automatic translators are used, the received data can be automatically translated into the language used by the user;
    • systems providing data can also influence the results of processing;
    • processing servers can use, in their analysis, data specific to the protocol through which the data was transmitted to the OPES client;
    • some processing servers may obtain more sensitive data if they have a trusted relationship with the OPES client, consumers and/or information providers.

    All of the above capabilities depend solely on the configuration used when implementing the system. Due to these capabilities, using OPES is more promising and convenient than using the ICAP protocol.

    Products that support OPES along with the ICAP protocol are expected to appear in the near future. But since there are currently no full-fledged implementations using OPES, it is impossible to draw final conclusions about the shortcomings of this approach, although theoretically there remains only one drawback - the increase in processing time due to interaction between clients and OPES servers.

    HTTPS and other types of encrypted traffic

    Some analysts estimate that up to 50% of Internet traffic is transmitted in encrypted form. The problem of controlling encrypted traffic is now relevant for many organizations, since users can use encryption to create information leakage channels. In addition, encrypted channels can also be used by malicious code to penetrate computer systems.

    There are several tasks associated with processing encrypted traffic:

    • analysis of data transmitted over encrypted channels;
    • checking certificates that are used by servers to organize encrypted channels.

    The relevance of these tasks is increasing every day.

    Encrypted data transmission control

    Controlling the transmission of data sent over encrypted channels is probably the most important task for organizations whose employees have access to Internet resources. To implement this control, there is an approach called "Man-in-the-Middle", which can also be used by attackers to intercept data. The data processing scheme for this method is shown in the figure:

    The data processing process is as follows:

    • a specially issued root certificate is installed in the user's browser and is used by the proxy server to sign the generated certificates (without such a certificate, the user's browser displays a message stating that the signing certificate was issued by an untrusted organization);
    • when a connection is established through the proxy server, a specially generated certificate that copies the data of the destination server but is signed with a key known to the proxy is sent to the browser, which allows the proxy server to decrypt the transmitted traffic;
    • the decrypted data is analyzed in the same way as regular HTTP traffic;
    • the proxy server establishes a connection with the server to which the data must be transferred and uses the server's certificate to encrypt the channel;
    • the data returned from the server is decrypted, analyzed and transmitted to the user, encrypted with the proxy server's certificate.

    When using this scheme for processing encrypted data, problems may arise with confirming the authenticity of the user. In addition, the certificate must be installed in the browsers of all users; if it is not installed, the user will receive a message stating that the certificate is signed by an unknown company, which reveals that data transfers are being monitored.
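
    The key step is minting a certificate for the destination host, signed by the root CA that was installed into users' browsers. Below is a sketch of that step in Python using the third-party "cryptography" package; the names and the validity period are illustrative, and a real proxy would also cache the generated certificates.

        import datetime
        from cryptography import x509
        from cryptography.hazmat.primitives import hashes
        from cryptography.hazmat.primitives.asymmetric import rsa
        from cryptography.x509.oid import NameOID

        def forge_cert(hostname, ca_cert, ca_key):
            key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
            subject = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, hostname)])
            now = datetime.datetime.utcnow()
            cert = (
                x509.CertificateBuilder()
                .subject_name(subject)                 # looks like the real site
                .issuer_name(ca_cert.subject)          # but issued by the proxy CA
                .public_key(key.public_key())
                .serial_number(x509.random_serial_number())
                .not_valid_before(now)
                .not_valid_after(now + datetime.timedelta(days=7))
                .add_extension(
                    x509.SubjectAlternativeName([x509.DNSName(hostname)]),
                    critical=False,
                )
                .sign(ca_key, hashes.SHA256())
            )
            return cert, key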

    The following products are currently offered on the market for monitoring the transmission of encrypted data: Webwasher SSL Scanner from Secure Computing, Breach View SSL, WebCleaner.

    Certificate authentication

    The second challenge that arises when using encrypted data transmission channels is verifying the authenticity of certificates provided by the servers with which users work.

    Attackers can attack information systems by creating a false DNS entry that redirects user requests not to the site they need, but to one created by the attackers themselves. With the help of such fake sites, important user data such as credit card numbers, passwords, etc. can be stolen, and malicious code can be downloaded under the guise of software updates.

    To prevent such cases, there is specialized software that checks whether the certificate presented by a server actually matches the site it claims to represent.

    If there is a discrepancy, the system may block access to such sites or grant access only after explicit confirmation by the user. Data processing is performed in almost the same way as when analyzing data transmitted over encrypted channels, except that it is not the data that is analyzed but the certificate provided by the server.
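
    In Python, for example, this kind of check can be sketched with the standard ssl module, which verifies the certificate chain against trusted root CAs and checks that the name in the certificate matches the host actually requested:

        import socket
        import ssl

        def check_site(hostname: str) -> None:
            ctx = ssl.create_default_context()      # loads the trusted root CAs
            ctx.check_hostname = True               # reject name mismatches
            ctx.verify_mode = ssl.CERT_REQUIRED
            with socket.create_connection((hostname, 443), timeout=5) as sock:
                with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
                    cert = tls.getpeercert()
                    print(hostname, "->", dict(x[0] for x in cert["subject"]))

        # Raises ssl.SSLCertVerificationError if verification fails.
        check_site("www.example.com")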

    Mail traffic filtering

    When using email, organizations are faced with the need to provide security for both incoming and outgoing traffic. But the tasks solved for each direction are quite different. Inbound traffic needs to be controlled for malware, phishing, and spam, while outbound mail is controlled for content that could leak sensitive information, disseminate incriminating materials, and the like.

    Most products on the market only provide control of incoming traffic. This is done through integration with anti-virus systems and the implementation of various protection mechanisms against spam and phishing. Many of these functions are already built into email clients, but they cannot completely solve the problem.

    There are currently several ways to protect users from spam:

    • comparison of received messages with the existing message database. When making comparisons, various techniques can be used, including the use of genetic algorithms, which make it possible to isolate keywords even if they are distorted;
    • dynamic categorization of messages by their content, which detects unwanted correspondence very effectively. To counter this method, spam distributors use messages in the form of an image with text inside and/or sets of dictionary words that create noise interfering with these systems; however, methods such as wavelet analysis and/or text recognition in images are already being used against such spam;
    • grey, white and black access lists make it possible to describe a policy for accepting email messages from known or unknown sites. The use of greylists in many cases prevents the transmission of unwanted messages, owing to the specific behavior of the software that sends spam (a toy greylisting check is sketched after this list). Blacklists can be maintained both as local databases managed by the administrator and as global databases replenished by user reports from all over the world; however, using global databases carries the risk that entire networks, including those containing "good" mail servers, may end up in them.
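
    The greylisting idea mentioned above relies on legitimate mail servers retrying a temporarily rejected delivery, while typical spam software does not. A toy Python sketch (identifiers and delay are illustrative):

        import time

        GREYLIST = {}                 # (host, sender, recipient) -> first attempt time
        RETRY_DELAY = 300             # seconds before a retry is accepted

        def smtp_decision(host: str, sender: str, recipient: str) -> str:
            key = (host, sender, recipient)
            first_seen = GREYLIST.get(key)
            if first_seen is None:
                GREYLIST[key] = time.time()
                return "451 temporary failure, try again later"
            if time.time() - first_seen < RETRY_DELAY:
                return "451 temporary failure, try again later"
            return "250 accepted"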

    To combat information leaks, a variety of methods are used, based on the interception and deep analysis of messages in accordance with a complex filtering policy. This requires correctly determining file types, languages and text encodings, and conducting semantic analysis of the transmitted messages.
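
    Determining the real type of a file from its content rather than its declared name or extension is a typical building block here. A minimal Python sketch using a few well-known magic-byte signatures:

        MAGIC = [
            (b"%PDF-", "application/pdf"),
            (b"PK\x03\x04", "zip archive (also docx/xlsx/odt)"),
            (b"\x89PNG\r\n\x1a\n", "image/png"),
            (b"MZ", "Windows executable"),
        ]

        def sniff(data: bytes) -> str:
            for signature, kind in MAGIC:
                if data.startswith(signature):
                    return kind
            return "unknown"

        print(sniff(b"%PDF-1.4 ..."))      # -> application/pdf
        print(sniff(b"MZ\x90\x00..."))     # -> Windows executable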

    Another use of mail filtering systems is to create encrypted mail streams, where the system automatically signs or encrypts the message, and the data is automatically decrypted at the other end of the connection. This functionality is very convenient if you want to process all outgoing mail, but it must reach the recipient in encrypted form.

    Filtering instant messages

    Instant messaging tools are gradually becoming an actively used tool in many companies. They provide quick interaction with employees and/or clients of organizations. It is therefore natural that these tools, which among other things can become a channel for information leakage, have prompted the emergence of tools for monitoring the transmitted information.

    Currently, the most commonly used instant messaging protocols are MSN (Microsoft Network), AIM (AOL Instant Messaging), Yahoo! Chat and Jabber, and their corporate counterparts - Microsoft Live Communication Server (LCS), IBM SameTime and Yahoo Corporate Messaging Server. The ICQ system, which is now owned by AOL and uses almost the same protocol as AIM, is widespread in the CIS. All of these systems do almost the same thing: they transmit messages (either through a server or directly) and files.

    Now almost all systems have the ability to make calls from computer to computer and/or to regular phones, which creates certain difficulties for control systems and requires VoIP support to implement full-fledged proxy servers.

    Typically, IM traffic control products are implemented as an application gateway that parses the transmitted data and blocks the transmission of prohibited data. However, there are also implementations in the form of specialized IM servers that perform the necessary checks at the server level.

    The most popular functions of products for monitoring IM traffic:

    • access control by individual protocols;
    • control of the clients used, etc.;
    • access control for individual users:
      • allowing the user to communicate only within the company;
      • allowing the user to communicate only with certain users outside the company;
    • control of transmitted texts;
    • file transfer control. The objects of control are:
      • file size;
      • file type and/or extension;
    • data transfer direction;
    • monitoring for the presence of malicious content;
    • SPIM detection;
    • saving transmitted data for subsequent analysis.
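
    Several of these checks amount to evaluating a per-user policy table at the gateway. A hypothetical Python sketch (user names, limits and fields are illustrative):

        POLICY = {
            "alice": {"external": False, "max_file_mb": 0},
            "bob":   {"external": True, "allowed_peers": {"partner@example.org"},
                      "max_file_mb": 10, "blocked_types": {".exe", ".scr"}},
        }

        def allow_message(user: str, peer: str, internal: bool) -> bool:
            p = POLICY.get(user, {})
            if not internal and not p.get("external", False):
                return False                  # company-only communication
            allowed = p.get("allowed_peers")
            return internal or allowed is None or peer in allowed

        def allow_file(user: str, filename: str, size_mb: float) -> bool:
            p = POLICY.get(user, {})
            ext = "." + filename.rsplit(".", 1)[-1].lower()
            return (size_mb <= p.get("max_file_mb", 0)
                    and ext not in p.get("blocked_types", set()))

        print(allow_message("alice", "partner@example.org", internal=False))  # False
        print(allow_file("bob", "report.exe", 1))                             # False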

    Currently, the following products allow you to control the transmission of instant messages:

    • CipherTrust IronIM by Secure Computing. This product supports the AIM, MSN, Yahoo! Chat, Microsoft LCS and IBM SameTime protocols and is now one of the most complete solutions;
    • IM Manager from Symantec (developed by IMLogic, which was acquired by Symantec). This product supports the following protocols - Microsoft LCS, AIM, MSN, IBM SameTime, ICQ and Yahoo! Chat;
    • Microsoft's Antigen for Instant Messaging also supports virtually all popular instant messaging protocols.

    Products from other companies (ScanSafe, ContentKeeper) have fewer capabilities than those listed above.

    It is worth noting that two Russian companies - Grand Prix (SL-ICQ product) and Mera.ru (Sormovich product) - provide products for monitoring the transmission of messages using the ICQ protocol.

    VoIP filtering

    The growing popularity of tools for transmitting audio information between computers (Voice over IP, VoIP) makes it necessary to take measures to control the transfer of such information. There are various implementations for calling from computer to computer and/or to regular phones.

    There are standardized protocols for exchanging such information, including the Session Initiation Protocol (SIP), adopted by the IETF, and H.323, developed by the ITU. These protocols are open, which makes it possible to process them.
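
    SIP, in particular, is a text-based protocol, so its signaling is straightforward to inspect. A small Python sketch of parsing the request line of a SIP message, as a filtering gateway might:

        def parse_sip_request(datagram: bytes):
            line = datagram.split(b"\r\n", 1)[0].decode("ascii", "replace")
            parts = line.split()
            if len(parts) == 3 and parts[2].startswith("SIP/"):
                method, uri, _version = parts
                return method, uri        # e.g. ("INVITE", "sip:bob@example.com")
            return None

        pkt = b"INVITE sip:bob@example.com SIP/2.0\r\nVia: SIP/2.0/UDP host\r\n\r\n"
        print(parse_sip_request(pkt))     # a filter could block or log INVITEs here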

    In addition, there are protocols developed by specific companies that do not have open documentation, which makes working with them very difficult. One of the most popular implementations is Skype, which has gained widespread popularity around the world. This system allows you to make calls between computers, make calls to landlines and mobile phones, and receive calls from landlines and mobile phones. The latest versions support the ability to exchange video information.

    Most of the currently available products can be divided into two categories:

    • products that can identify and block VoIP traffic;
    • products that can identify, capture and analyze VoIP traffic.

    The first category includes:

    • Dolphian products, which can identify and allow or deny VoIP traffic (SIP and Skype) encapsulated in standard HTTP traffic;
    • Verso Technologies products;
    • various types of firewalls that have this capability.

    The second category includes:

    • the product of the Russian company Sormovich, which supports the capture, analysis and storage of voice information transmitted via the H.323 and SIP protocols;
    • the open-source library Oreka, which can identify the signaling component of audio traffic and capture the transmitted data, which can then be analyzed by other means.

    Recently it became known that a product developed by ERA IT Solutions AG makes it possible to intercept VoIP traffic transmitted via Skype. However, such control requires installing a specialized client on the computer running Skype.

    Peer-to-peer filtering

    The use of various peer-to-peer (p2p) networks by employees poses the following threats to organizations:

    • distribution of malicious code;
    • information leaks;
    • distribution of copyrighted data, which may result in legal action;
    • decreased labor productivity.

    A large number of networks operate in the peer-to-peer format. Some have central servers used to coordinate users, while others are completely decentralized; the latter are especially difficult to control using standard tools such as firewalls, and often have to be recognized by payload signatures, as the sketch below shows.
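
    For example, the BitTorrent peer protocol handshake begins with the byte 19 followed by the string "BitTorrent protocol", which a traffic analyzer can match regardless of the port used. A minimal Python check:

        BT_HANDSHAKE = b"\x13BitTorrent protocol"

        def is_bittorrent(payload: bytes) -> bool:
            return payload.startswith(BT_HANDSHAKE)

        print(is_bittorrent(b"\x13BitTorrent protocol" + b"\x00" * 8))  # True
        print(is_bittorrent(b"GET / HTTP/1.1\r\n"))                     # False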

    To solve this problem, many companies are creating products that allow them to detect and process p2p traffic. The following solutions exist for processing p2p traffic:

    • SurfControl Instant Messaging Filter, which handles p2p as well as instant messaging;
    • the Websense Enterprise package also provides users with tools to control p2p traffic;
    • Webwasher Instant Message Filter allows you to control access to various p2p networks.

    The use of these or other products not listed here dramatically reduces the risks associated with user access to p2p networks.

    Unified Threat Management

    Solutions that comply with the Unified Threat Management concept are offered by many security vendors. As a rule, they are built on the basis of firewalls, which, in addition to the main functions, also perform content filtering functions. Typically, these features focus on preventing intrusions, malicious code, and unwanted messages.

    Many of these products are implemented in the form of hardware and software solutions that cannot completely replace solutions for filtering email and Internet traffic, since they work only with a limited number of capabilities provided by specific protocols. They are typically used to avoid duplication of functionality across different products and to ensure that all application protocols are processed according to the same known threat database.

    The most popular solutions of the Unified Threat Management concept are the following products:

    • SonicWall Gateway Anti-Virus, Anti-Spyware and Intrusion Prevention Service provides anti-virus and other protection for data transmitted via the SMTP, POP3, IMAP, HTTP, FTP and NetBIOS protocols, instant messaging protocols, and many streaming protocols used to transmit audio and video information;
    • a series of ISS Proventia Network Multi-Function Security devices, designed as software and hardware systems, block malicious code, unwanted messages and intrusions. The delivery includes a large number of checks (including for VoIP), which can be expanded by the user;
    • Secure Computing's Network Gateway Security hardware platform, in addition to protecting against malicious code and unwanted messages, also has VPN support. This platform combines almost all Secure Computing solutions.

    There are other products, but the ones listed above are widely available.

    Data interception

    Lawful interception has almost always been used by intelligence agencies to collect and analyze transmitted information. However, recently the issue of data interception (not only Internet traffic, but also telephony and other types) has become very relevant in the light of the fight against terrorism. Even those states that have always been against such systems began to use them to control the transfer of information.

    Since various types of data are intercepted, often transmitted over high-speed channels, the implementation of such systems requires specialized software for capturing and parsing the data, and separate software for analyzing the collected data. Content filtering software for a particular protocol can serve as the latter.

    Perhaps the most famous of these systems is the Anglo-American Echelon system, which has long been used to intercept data in the interests of various departments in the United States and England. In addition, the US National Security Agency uses the Narus system, which allows monitoring and analysis of Internet traffic in real time.

    Among Russian products, we can mention solutions from the Sormovich company, which allow you to capture and analyze email, audio, and various types of Internet traffic (HTTP and others).

    Conclusion

    The development of information systems leads to the constant emergence of new threats. The development of content filtering products therefore not only keeps pace with these threats but sometimes even anticipates them, reducing the risks for the protected information systems.