Protecting the conversational systems from cyber attacks, IT News, ET CIO
By Bibhuti Kar

Automation is changing the way customers interact with services and systems. This can be attributed to the rapid evolution of technologies in line with the changing needs of users.

Chatbots are fast automating customer service. As of July 2020, according to an Invespcro study, 67% of customers had used a chatbot. About 40% of customers do not care whether they are served by a bot or a human being as long as they get their information, and millennials are more comfortable texting than calling traditional call centers.

More importantly, the automated interactive systems deployed by businesses are becoming conversational, shedding the menu-driven, rule-based genesis of the IVR era. For example, popular chat apps (conversational) are doubling up as payment apps (rules-driven).

These conversational systems, backed by state-of-the-art machine learning algorithms, will not only deliver secure, private, and critical information to the user; they will also be trained to do the work of a traditional insurance agent, car sales agent, or tax consultant. They are becoming contextually aware, understand the demography of the user, and draw on massive datasets to guide new buyers through complex choices. A preliminary diagnosis of a health condition, or even a mental health consultation, can happen entirely with a machine trained to do so.

A secured, private, yet completely human-like interaction would be the holy grail of user-business interaction in the not-so-distant future.

Are conversational systems immune to cyber-attacks?

The answer is a resounding ‘NO’. The basic reason is that threat actors have access to the same technology and, in many cases, the same datasets used by businesses. As per MITRE, in the last three years, giants like Google, Amazon, Microsoft, and Tesla have had their ML systems tricked, evaded, or misled.

This trend is only poised to rise: according to a Gartner report, 30% of cyberattacks by 2022 will involve data poisoning, model theft, or adversarial examples. The growing number of GitHub repositories offering tools and models for adversarial attacks is as fascinating as it is ominous.

A sign of times to come!

Let’s explore some of the potential threats and protection measures to these conversational systems.

Distributed Denial of Service

It is the easiest attack for threat actors to launch, yet it can cause a severe dent in the reputation of a business. Thousands of fake customers or would-be customers can engage the system in conversations, draining resources away from the productive work they are meant to do. A few lines of software openly available on the internet, combined with powerful infrastructure (such as Emotet’s, which is now available for rent), allow relatively novice threat actors to mount this debilitating attack. A swarm of fake customers (scripts/bots) can swamp the services with queries and make them unusable.

Existing rule-based DDoS protection technologies will need to understand the conversation layer better to protect these systems. It is not just computing resources: even BPO human staff can be sent on a wild goose chase responding to genuine-looking queries (such as creating a quote for an insurance instrument) or doing background verification and credit-rating data collection on fake leads. Fake leads are extremely expensive for businesses.

NLP-based correlation engines, content filtering, and context awareness need to front-end and embed with the existing WAF functionalities. While the NLP/AI layer detects fraudulent intent or geo-dispersed attack coordination and develops the threat context, the traditional WAF layer focuses on rate limiting, blocking script tags and SQL-injection payloads in the conversation, and so on. Conversation signatures and context-based Turing tests can challenge the source and distinguish genuine users from bots in real time, while remaining humanely polite.
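The layered guard described above can be sketched in a few lines. This is a hypothetical illustration, not a production WAF: the thresholds, pattern list, and the `allow_message` function are all assumptions, showing only how a sliding-window rate limit and WAF-style content checks might sit in front of the conversation layer.

```python
import re
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60            # assumed sliding window
MAX_MESSAGES_PER_WINDOW = 20   # assumed per-client budget

# Patterns a traditional WAF layer would flag inside chat messages.
SUSPICIOUS_PATTERNS = [
    re.compile(r"<\s*script", re.IGNORECASE),                         # script tags
    re.compile(r"\b(union\s+select|drop\s+table)\b", re.IGNORECASE),  # SQL injection
]

_history = defaultdict(deque)  # client_id -> timestamps of recent messages

def allow_message(client_id, text, now=None):
    """Return True if the message passes rate limiting and content checks."""
    now = time.time() if now is None else now
    window = _history[client_id]
    # Drop timestamps that have aged out of the sliding window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    if len(window) >= MAX_MESSAGES_PER_WINDOW:
        return False  # rate-limited: likely a scripted bot swarm
    if any(p.search(text) for p in SUSPICIOUS_PATTERNS):
        return False  # WAF-style content filter
    window.append(now)
    return True
```

In a real deployment, a rejected message would trigger the polite Turing-test challenge the article describes rather than a silent drop, and the NLP intent layer would feed additional signals into the decision.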

Model Theft

Most conversational systems start by training on OSINT (open-source intelligence) and then add custom elements over time. Threat actors have access to the same data. They can shadow the APIs to reveal the ML model family, ontology, and even weights. Alternatively, threat actors can brute-force, starting from publicly available trained weights, and mimic an established service. This is equivalent to stealing the entire intellectual property at the core of today’s businesses.

Secure the training data and models for your conversational system with the highest-grade encryption, both at rest and in transport. Update the models often through a well-oiled continuous delivery system, and distribute different models to different user groups and geographies based on each group’s data analytics. For example, a sales lead generation model may have different weights in the US vs. India; having two models doubles the model thief’s reverse-engineering work. Also, a lot of conversation is needed to mimic your organization’s model, which is possible only with scripted bots. A good bot detection layer, combined with polite Turing-test challenges, will prevent such mass-learning attacks from outside.
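The bot detection layer mentioned above can exploit a telltale signature of model extraction: harvesting scripts fire thousands of near-unique probes, whereas real customers repeat and rephrase themselves. The sketch below is a hypothetical heuristic; the thresholds and the `record_query` function are assumptions for illustration.

```python
import hashlib
from collections import defaultdict

VOLUME_THRESHOLD = 500   # assumed: queries before we start judging a client
MIN_REPEAT_RATIO = 0.05  # assumed: humans repeat; harvesting scripts rarely do

_seen = defaultdict(set)   # client_id -> fingerprints of distinct queries
_count = defaultdict(int)  # client_id -> total queries observed

def record_query(client_id, text):
    """Record a query; return True if the client looks like an extraction bot."""
    fingerprint = hashlib.sha256(text.strip().lower().encode()).hexdigest()
    _count[client_id] += 1
    _seen[client_id].add(fingerprint)
    total = _count[client_id]
    if total < VOLUME_THRESHOLD:
        return False  # not enough data to judge yet
    repeat_ratio = 1 - len(_seen[client_id]) / total
    # Nearly all-unique, high-volume traffic suggests systematic probing.
    return repeat_ratio < MIN_REPEAT_RATIO
```

A flagged client would then be handed the polite Turing-test challenge rather than blocked outright, keeping the interaction humane for any false positives.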

Data Poisoning and Intent manipulation

ML algorithms train continuously on more and more data as it comes in. Consider a car manufacturer that sizes its inventory based on the number of queries to its automated ‘lead generation’ service. Malicious automated bots posing as enquiring customers can orchestrate a set of queries from different places over a short period of time and skew the business analytics entirely.

A simple protection against data and intent poisoning is to keep interaction-layer data (e.g., a customer’s chat transcript or email complaint) away from the training data. Do not let real-time interactions flow directly into your learning set; they must go through validation, filtering, and summarization before being appended to the training data.
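That quarantine step between live transcripts and the training corpus might look like the following. This is a minimal sketch under assumed rules: the length bounds, the injection pattern, and the `validate`/`build_training_batch` helpers are all illustrative, not a prescribed pipeline.

```python
import re

# Assumed filter: markup and SQL fragments have no place in a training corpus.
INJECTION_PATTERN = re.compile(r"<\s*script|union\s+select", re.IGNORECASE)

def validate(utterance):
    """Basic sanity checks before an utterance may enter the training set."""
    text = utterance.strip()
    if not (3 <= len(text) <= 500):
        return False  # drop empty or oversized messages
    if INJECTION_PATTERN.search(text):
        return False  # drop injected markup / SQL payloads
    return True

def build_training_batch(transcripts):
    """Filter and dedupe raw transcripts into a candidate training batch."""
    seen = set()
    batch = []
    for text in transcripts:
        if not validate(text):
            continue
        key = text.strip().lower()
        if key in seen:  # coordinated bot swarms repeat themselves verbatim
            continue
        seen.add(key)
        batch.append(text.strip())
    return batch
```

Deduplication alone blunts the car-manufacturer scenario above: a thousand scripted copies of the same enquiry collapse into one training example instead of a thousand fake signals.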

Data Exfiltration

Once a model is mimicked, simple model-inversion techniques can be used to exfiltrate the training dataset. The protection schemes described for DDoS and data poisoning also prevent this ultimate evildoing: the best defense against data exfiltration is not to let your AI dataset be attacked or tampered with in the first place, as described in the previous sections.

AI vs. AI

The time for human-like automated service is here and now. The AI/ML technology to build it is advanced and already in production; protecting such systems from the many possibilities of malice is lagging. The start-up scene, however, is abuzz with innovations to plug that gap. It is clearly the next frontier in cybersecurity, and a worthy one to watch closely.

(The author is Head of R&D, Quick Heal Technologies)

(Excerpt) Read more Here | 2021-03-12 03:20:34