Unsupervised Machine Learning Inspired by the Human Immune System

Ammune™ is a Revolutionary "Unsupervised Learning" Technology

Ammune™ identifies and prioritizes “patterns” from “unlabeled” data. The observations carry no classification or categorization, so each pattern is analyzed only for its possible impact on system health. The process requires no pre-training rules. Because Ammune™ does not require any sample data, the accuracy of the generated “patterns” is not evaluated externally, unlike “supervised learning” and “reinforcement learning”, which need examples in order to work. Interestingly, the process of correlating “patterns” with damage is reminiscent of other natural processes, such as the co-“firing” of neurons seen in neural networks.

How Ammune™ Works

Ammune™ is inspired by the “Danger Model”, a model of how the natural immune system in our body detects and mitigates foreign threats such as harmful viruses and bacteria. Ammune™ uses the following sequence of activities, in an “always on” model. (1) It collects samples of traffic data at pre-set time intervals. (2) The traffic data is used to continuously check the health status of the web system, i.e. its responsiveness. (3) When a health problem is detected, a root-cause analysis starts in order to determine whether it is caused by a real attack and, if so, to identify the related patterns. (4) When an attack and its possible patterns are identified, the system uses this information to generate optimal traffic “signatures”. The “signatures” are used to identify and stop malicious requests as well as to locate their IP sources (the “bots”).

Ammune™ Operation

1. Collecting Traffic Samples

Ammune™ continuously collects samples of web traffic at layers 3/4 and 7. The sampled traffic can be encrypted or open/decrypted and is used to detect attacks coming through:

  • Network layer (3/4) attacks – DDoS attacks detected through encrypted or open/decrypted network traffic.
  • Application layer (7) attacks – This class of attacks is usually carried out by bots. The attacks may disrupt system operation through application-level DDoS, steal data by scraping web pages, or cause other damage. They are detected by analyzing open/decrypted network traffic.

2. Monitoring System Health

Ammune™ monitors the health of the web system by identifying changes in its responsiveness-related features, which are calculated from the sampled traffic over a given time frame. When the system's responsiveness appears to be damaged, Ammune™ moves to its next stage and investigates the causes of the damage.
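The health check described above can be illustrated with a minimal sketch. It is not Ammune™'s actual method: it assumes that the responsiveness feature is the mean response time per time frame, and that "damage" means the latest value rises well above the recent baseline. The function name, the factor k and the minimum history length are all hypothetical.

```python
from statistics import mean, stdev


def is_degraded(history, current, k=3.0, min_windows=5):
    """Return True if `current` is unusually high compared to `history`.

    history: mean response times (seconds) of earlier time frames (assumed feature).
    current: mean response time of the latest time frame.
    k:       how many standard deviations above the baseline counts as damage.
    """
    if len(history) < min_windows:
        return False  # not enough data yet to judge
    baseline = mean(history)
    spread = stdev(history)
    # Guard against a zero spread when the history is perfectly flat.
    return current > baseline + k * max(spread, 1e-9)


history = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11]
print(is_degraded(history, 0.11))  # False: within the normal range
print(is_degraded(history, 0.90))  # True: far above the baseline
```

A real deployment would likely track several responsiveness features at once; a single threshold on one feature is used here only to keep the idea visible.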

3. Analyzing Root-Cause of Attacks

When responsiveness damage is detected, patterns are extracted from the sampled input data (header, URL, parameters) that was previously used for the health monitoring. Relevant patterns are distinguished from normal traffic by their correlation with the health damage at hand. When one or more appropriate patterns are identified, Ammune™ continues to its next stage of generating optimized signatures. Otherwise, the system may alert on other types of possible causes.
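As a rough illustration of this stage, the sketch below ranks patterns by how much their share of traffic rises between healthy and damaged time frames. This is a crude stand-in for the correlation step, not the product's algorithm. The request fields, the "field=value" pattern form and the min_increase cut-off are assumptions made for the example.

```python
from collections import Counter


def extract_patterns(request):
    """Turn one sampled request (a dict) into a set of 'field=value' patterns."""
    return {f"{field}={value}" for field, value in request.items()}


def pattern_shares(requests):
    """Fraction of requests in which each pattern appears."""
    counts = Counter()
    for request in requests:
        counts.update(extract_patterns(request))
    total = len(requests) or 1
    return {pattern: n / total for pattern, n in counts.items()}


def suspicious_patterns(healthy, degraded, min_increase=0.3):
    """Patterns whose traffic share rose by at least `min_increase`
    between healthy and degraded time frames, largest rise first."""
    before = pattern_shares(healthy)
    after = pattern_shares(degraded)
    rises = {p: share - before.get(p, 0.0) for p, share in after.items()}
    ranked = sorted(rises.items(), key=lambda item: item[1], reverse=True)
    return [(p, rise) for p, rise in ranked if rise >= min_increase]


healthy = [
    {"url": "/home", "agent": "browser"},
    {"url": "/search", "agent": "browser"},
    {"url": "/home", "agent": "browser"},
    {"url": "/cart", "agent": "browser"},
]
degraded = [
    {"url": "/search", "agent": "bot-x"},
    {"url": "/search", "agent": "bot-x"},
    {"url": "/search", "agent": "bot-x"},
    {"url": "/home", "agent": "browser"},
]
for pattern, rise in suspicious_patterns(healthy, degraded):
    print(pattern, rise)
```

If no pattern clears the cut-off, the returned list is empty, which corresponds to the case where the damage may have some other cause.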

4. Generating Optimized Signatures

Once patterns related to a given attack have been generated, Ammune™ performs an optimization process in which patterns may serve as parts of signatures. The major consideration during this process is that signatures must be able to distinguish normal from malicious traffic while still being relevant to a significant portion of the attacking traffic.
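The trade-off described above can be sketched as a small search over pattern combinations. This is an illustrative assumption, not the documented optimization process: a signature is taken to be a set of patterns that must all be present in a request, and the score coverage × (1 − false-positive share) is a hypothetical choice.

```python
from itertools import combinations


def matches(signature, request_patterns):
    """A signature (a set of patterns) matches when all its patterns are present."""
    return signature <= request_patterns


def rate(signature, traffic):
    """Fraction of requests in `traffic` matched by `signature`."""
    if not traffic:
        return 0.0
    return sum(matches(signature, r) for r in traffic) / len(traffic)


def best_signature(patterns, attack, normal, max_size=2):
    """Try every combination of up to `max_size` patterns and keep the one
    with the best trade-off between attack coverage and false positives."""
    best, best_score = None, -1.0
    for size in range(1, max_size + 1):
        for combo in combinations(sorted(patterns), size):
            signature = frozenset(combo)
            coverage = rate(signature, attack)   # share of attack traffic hit
            false_pos = rate(signature, normal)  # share of normal traffic hit
            score = coverage * (1.0 - false_pos)
            if score > best_score:
                best, best_score = signature, score
    return best, best_score


attack = [
    {"url=/search", "agent=bot-x"},
    {"url=/search", "agent=bot-x"},
    {"url=/search", "agent=bot-x"},
    {"url=/search", "agent=browser"},
]
normal = [
    {"url=/home", "agent=browser"},
    {"url=/search", "agent=browser"},
    {"url=/search", "agent=browser"},
    {"url=/cart", "agent=browser"},
]
signature, score = best_signature({"url=/search", "agent=bot-x"}, attack, normal)
print(sorted(signature), score)
```

In this toy data "url=/search" covers all of the attack traffic but also hits half of the normal traffic, so the narrower "agent=bot-x" signature scores higher.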

The Way Forward

Within the cybersecurity realm, attackers are moving quickly, accurately and dynamically. The change is similar to moving from ice to water: ice does not move, so a hammer is enough to deal with it, whereas water calls for a device that can analyze it and detect a poisonous ingredient that moves very fast and in disguise. This is why you need to move from supervised to unsupervised learning.

Since we are not able to cover the entire cybersecurity space with examples, why not give the system tools to explore the space efficiently instead? Let’s give the system a hook, and let that hook become the base from which the system figures out the rest for itself.

An effective security posture requires a continuously adaptable model that can follow traffic characteristics and look for specific problems. It requires a system that is active, not passive, and that applies its results immediately with no human intervention.

Unsupervised learning does not take over from a human but works alongside security teams, hardening the gates, together acting as the cyber hunter. 

The L7 Defense solution - Controllable False Positives

The L7 Defense solution has several algorithms working alongside a baseline algorithm, which is updated every few seconds based on current environment variables.

If one of the algorithms identifies an attack, the system stops itself. Once all algorithms are in consensus, the baseline is triggered to determine whether or not there is an anomaly.

The L7 Defense solution has an internal mechanism to rectify the false positive issue. It can quantify and control the rate of false positives, bringing it to a controllable number. 

The safety belt of internal mechanisms controls the rate of false positives the system will meet while setting signatures.  

Signatures adapt to a fixed false positive number that has already been set. Every generated signature must fit within this threshold and must not go above it.
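One way to picture such a threshold is as a hard budget applied when a signature is chosen. The sketch below is only an assumption about how a cap could be enforced, not the L7 Defense mechanism itself. The candidate tuples, their measured rates and the function name are hypothetical.

```python
def pick_signature(candidates, max_false_positive_rate):
    """Return the highest-coverage candidate within the false positive budget.

    candidates: list of (name, attack_coverage, false_positive_rate) tuples,
                with both rates already measured on sampled traffic.
    Returns None when no candidate fits the budget.
    """
    allowed = [c for c in candidates if c[2] <= max_false_positive_rate]
    if not allowed:
        return None
    return max(allowed, key=lambda c: c[1])


candidates = [
    ("url=/search", 1.00, 0.25),
    ("agent=bot-x", 0.75, 0.00),
    ("url=/search AND agent=bot-x", 0.70, 0.00),
]
print(pick_signature(candidates, 0.01))  # ('agent=bot-x', 0.75, 0.0)
print(pick_signature(candidates, 0.30))  # ('url=/search', 1.0, 0.25)
```

With a strict budget the broad signature is rejected even though it covers more of the attack, which is the behaviour the threshold is meant to guarantee.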

What is Unsupervised Learning

With unsupervised learning, there is no teacher and no narrow curriculum. The curriculum is guided but left to its own accord, dynamically adapting, learning and developing itself based on the students' changing needs.

What is Supervised Learning

Supervised learning is similar to having a teacher with an existing curriculum along with a specific set of questions and answers. The curriculum does not instantly change and the teacher will not teach additional topics without sufficient training.  

The Problematic Landscape

Traffic is unpredictable and can change in an instant, whether in amplitude or through more sophisticated changes in the header or message body.

In an environment of hundreds of thousands of applications each having a unique traffic set, it becomes a constant challenge to characterize every parameter and carry out time series analysis.  

Uncontrolled and unpredictable traffic patterns lay the foundations for a perfect breeding ground in which cybercriminals can camouflage an attack.

Unsupervised Learning in CyberSecurity

Unsupervised learning is not simply a model, like its supervised counterpart; it is a continuously adaptable model that changes based on traffic patterns and environmental variables.

Administrators do not provide examples or give rules to the system, and the feature list does not cover everything. The system learns the rules of the game by itself, acting as an answer-providing learning system.

A guide is provided, but not the entire distribution over the space of possibilities. It’s not a closed problem.

Unsupervised learning has the ability to change as the problem itself changes. The ability to dynamically adapt to any situation without human intervention and actively respond to unexpected environmental conditions makes it the leading solution for defending against advanced AI-based machine attacks.

Supervised Learning in CyberSecurity

Uncontrolled and unpredictable traffic patterns offer a breeding ground for cybercriminals. Traffic is not stable, and this is a major problem for supervised learning.

Supervised learning will work only for a small range of familiar features. However, security attacks change by the second, so the entire paradigm of supervised learning is useless here, because the problem rapidly shifts into unexpected areas.

It learns by example. The examples the model acts on define its distribution over the space of possibilities, and it is unable to generate anything outside of that space. After enough examples, it’s a closed problem.

Supervised learning offers some clues in a static situation, but not in the dynamic situation in which cybercriminals operate today.