Posted by Aible ● Feb 28, 2021 2:40:38 PM

The Key to AI that Delivers Impact: Make IT Core to the AI Process


Organizations typically have three distinct kinds of data systems: systems of operation, which are managed by IT; systems of analytics; and systems of integration. Artificial intelligence, with its roots in data science, is trained on the systems of analytics. The problem is that AI only creates value for the business when it is deployed in the systems of operation. That is a fundamental disconnect – one that can best be bridged by making IT a core part of the AI process. IT has a front-row seat to all of the data needed for AI.

The goal of AI is straightforward – to give the right actionable information to someone at the moment of making a decision. You want the AI to tell a salesperson how likely a deal is to close, or to tell a marketer which campaigns are the right ones and where to spend the money. That means the business decisions made on the basis of the AI happen in the transactional systems, the systems of operation. But AI is traditionally trained by and in the systems of analytics, which are very different from the systems of operation, and that creates several problems.

First, the data in different systems doesn’t look the same.

As anyone in IT or data engineering will tell you, the data dictionaries for the systems of operation and the systems of analytics are very dissimilar. The variables are different and aren’t aggregated at the same level. Even the same variable may carry a different name or a different definition in the analytical system than it does in the transactional system, because someone renamed it or changed its definition along the way.
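To make the mismatch concrete, here is a minimal sketch of how two data dictionaries might be compared. The field names and descriptions are made up for illustration, and compare_dictionaries is a hypothetical helper, not part of any product mentioned in this post.

```python
# Hypothetical data dictionaries (field name -> definition). Illustrative only.
operational_fields = {
    "opp_id": "opportunity identifier",
    "amt": "deal amount in local currency",
    "stage_cd": "pipeline stage code",
    "last_touch_ts": "timestamp of last activity",
}
analytical_fields = {
    "opp_id": "opportunity identifier",
    "amount_usd": "deal amount converted to USD",
    "stage_cd": "pipeline stage, collapsed to open/won/lost",
}


def compare_dictionaries(operational, analytical):
    """Report fields that exist on only one side or are defined differently."""
    shared = operational.keys() & analytical.keys()
    return {
        "only_operational": sorted(operational.keys() - analytical.keys()),
        "only_analytical": sorted(analytical.keys() - operational.keys()),
        "redefined": sorted(f for f in shared if operational[f] != analytical[f]),
    }


report = compare_dictionaries(operational_fields, analytical_fields)
for category, fields in report.items():
    print(category, fields)
```

In this toy case the deal amount has been renamed and converted, the stage code keeps its name but changes meaning, and the last-activity timestamp never reaches the analytical side at all.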

Thus, when the AI trained in the analytics system is deployed in the operational system, it is like learning French and then having to speak German. Yes, most of the alphabet and some of the words are similar, but you would be much better off learning German if you were going to speak German.

Second, systems of analytics throw away data available in systems of operations.

Systems of operation hold an overwhelming amount of data. For example, a system may record every individual interaction a user has with a website. That is far too much detail when you want to analyze the data, so analytical systems typically transform the raw data into “useful” data such as the number of pageviews. As a result, a significant amount of information is lost whenever you go to a system of analytics as opposed to a system of operation.

Consider this example. Let’s say I’m creating an AI to predict whether a customer will buy. To do that, I’ll typically take customer records from the analytical system, give them to the AI, and tell it to predict based on that information. But the transactional system actually has much more information to offer than the analytical system. The transactional system has information such as how many times the opportunity was updated, whether there was a certain level of activity by a salesperson or not, how much time passed between each activity, and much more.

But in the systems of analytics you would typically store only what the opportunity looked like in the end, in other words, what it looked like right before it was closed, which would be very different from what it looked like a month before the close. At best, you take several such data snapshots and calculate metrics like the number of times the opportunity was updated.
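The difference between the final snapshot and the full activity history can be sketched in a few lines. The event log below is invented, and final_snapshot and activity_features are hypothetical helper names used only for this illustration.

```python
from datetime import datetime

# Hypothetical update history for a single sales opportunity.
events = [
    {"ts": datetime(2021, 1, 4), "stage": "prospect", "amount": 10000},
    {"ts": datetime(2021, 1, 18), "stage": "proposal", "amount": 12000},
    {"ts": datetime(2021, 2, 1), "stage": "negotiation", "amount": 12000},
    {"ts": datetime(2021, 2, 22), "stage": "closed_won", "amount": 11000},
]


def final_snapshot(events):
    """What an analytical system would typically keep: the end state only."""
    last = max(events, key=lambda e: e["ts"])
    return {"stage": last["stage"], "amount": last["amount"]}


def activity_features(events):
    """Features that only exist if the individual updates are still available."""
    ordered = sorted(events, key=lambda e: e["ts"])
    gaps = [(b["ts"] - a["ts"]).days for a, b in zip(ordered, ordered[1:])]
    return {
        "update_count": len(ordered),
        "mean_days_between_updates": sum(gaps) / len(gaps) if gaps else None,
        "max_days_between_updates": max(gaps) if gaps else None,
    }


snapshot = final_snapshot(events)
features = activity_features(events)
print(snapshot)
print(features)
```

The snapshot carries two fields; the update count and the gaps between updates can only be computed while the underlying events are still around.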

A system of operation will only keep the state as of the last interaction. It’s like seeing only the most recent frame of a movie. A system of analytics will take periodic snapshots of the movie – a snapshot on Monday and another snapshot on Wednesday, for example. The entire movie is not seen by either system. 

But a system of integration like Boomi actually sees the entire movie. It can join transactions between two different systems. And when you add Aible on top of a system of integration like Boomi, Aible gets insights from the entire movie, not just from static snapshots. You can later go back and train your AI based on what a sales opportunity looked like a month before the close, or at the beginning of the quarter. Because Aible analyzes the entire movie, you can look at it in different ways, and the AI benefits from seeing the movie in its entirety.
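As a rough illustration of why the whole movie matters, the sketch below rebuilds what an opportunity looked like at an arbitrary earlier date from a stream of recorded updates. The data and the state_as_of helper are hypothetical; this is not how Boomi or Aible implement it, only the general idea of replaying events up to a cutoff.

```python
from datetime import datetime, timedelta

# Hypothetical stream of updates for one opportunity, as an integration
# layer might have observed them over time.
events = [
    {"ts": datetime(2021, 1, 4), "stage": "prospect", "amount": 10000},
    {"ts": datetime(2021, 1, 18), "stage": "proposal", "amount": 12000},
    {"ts": datetime(2021, 2, 1), "stage": "negotiation", "amount": 12000},
    {"ts": datetime(2021, 2, 22), "stage": "closed_won", "amount": 11000},
]


def state_as_of(events, cutoff):
    """Return the most recent recorded state at or before the cutoff, or None."""
    seen = [e for e in events if e["ts"] <= cutoff]
    if not seen:
        return None
    latest = max(seen, key=lambda e: e["ts"])
    return {"stage": latest["stage"], "amount": latest["amount"]}


close_date = max(e["ts"] for e in events)
month_before_close = state_as_of(events, close_date - timedelta(days=30))
before_any_activity = state_as_of(events, datetime(2021, 1, 1))
print(month_before_close)
```

A system that keeps only the last frame can answer this question for today alone, and periodic snapshots can answer it only for the dates on which a snapshot happened to be taken.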

Third, the data in the system of analytics is too clean.

The quality of data in the analytical system is typically much “cleaner,” but this is actually not a good thing. That’s because when the AI is trained from the system of analytics, the AI is based on artificially clean data. When the AI goes into the real world and tries to make a prediction, the transactional data looks very different from the artificially clean analytical data.

All the time and money businesses typically spend on getting data as clean as possible is actually wasted, because the clean data is unrealistic. The clean data doesn’t at all resemble the transactional data, so the AI struggles. The true goal isn’t to make the training data cleaner or to help the model ‘fit’ better; it is to help the AI do a good job when it makes predictions based on ‘unclean’ transactional data. Thus, our focus needs to be on ensuring the ‘training data’ is representative of the data the AI will be used to predict on.
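One simple way to check whether training data is representative is to compare how often each field is missing in the training set versus the live transactional records. The sketch below does that with invented rows; the helper names and the 10% tolerance are assumptions chosen for illustration, not a recommended threshold.

```python
def missing_rate(rows, field):
    """Fraction of rows where the field is absent, None, or an empty string."""
    if not rows:
        return 0.0
    missing = sum(1 for r in rows if r.get(field) in (None, ""))
    return missing / len(rows)


def representativeness_gaps(training_rows, operational_rows, fields, tolerance=0.1):
    """Return fields whose missing rate differs by more than the tolerance."""
    gaps = {}
    for field in fields:
        train = missing_rate(training_rows, field)
        live = missing_rate(operational_rows, field)
        if abs(train - live) > tolerance:
            gaps[field] = (train, live)
    return gaps


# Hypothetical rows: the analytical extract has been cleaned, the live data has not.
training_rows = [
    {"amount": 12000, "industry": "retail"},
    {"amount": 8000, "industry": "finance"},
    {"amount": 15000, "industry": "retail"},
    {"amount": 5000, "industry": "health"},
]
operational_rows = [
    {"amount": 12000, "industry": ""},
    {"amount": 8000, "industry": "finance"},
    {"amount": 15000},
    {"amount": 5000, "industry": "health"},
]

gaps = representativeness_gaps(training_rows, operational_rows, ["amount", "industry"])
print(gaps)
```

Here the industry field is always filled in for training but missing in half of the live records, which is exactly the kind of gap that makes a model trained on clean data stumble in production.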

The number one frustration Data Scientists have is that very few of their models are ever actually operationalized. Even after they have spent months creating a fantastic model, IT has to translate that model back to the systems of operation. Whenever you create a model in an analytical system and then back-port it to an operational system, you run the very real risk that critical information is lost. The documentation of the process is never perfect, and human error can always creep into that manual mapping back and forth.

This complicates model deployment, and until the model is operationalized, Data Science can’t deliver business impact from the model it created. Moreover, if the deployed model was trained on unrealistically clean data, it can perform poorly when it sees real-world data, so even after deployment it fails to produce results. The easiest way to cut this Gordian Knot is to train AI directly on dirty, or rather realistic, operational data and deploy it right back into the operational systems it was trained on.

But, you can’t really train AI on transactional systems, can you?

One thing we know about AI training data is that you often need to join datasets from different systems and calculate ‘features’ that help the AI make better predictions. Transactional systems can’t do that. Also, transactional systems typically can’t run the kinds of queries you need in order to train AI.

Dell Boomi and Aible solve these issues at scale. IT already knows how to use enterprise integration solutions like Boomi to tie together transactions from different systems. Aible listens in on these transactions and creates an AI training dataset in the customer’s own cloud account without ever gaining access to the data itself. It automatically cleans the data and creates data features, but it also creates corresponding code for data cleansing and feature creation that is deployed with the final trained AI.
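The value of shipping the cleansing and feature code together with the model can be shown with a small sketch. Everything here is hypothetical: DeployableModel, the cleaning rules, and the hand-written scoring rule standing in for a trained model are illustrations of the general pattern, not Aible's generated code or API.

```python
class DeployableModel:
    """Bundles cleaning, feature creation, and scoring into one deployable unit."""

    def __init__(self, clean, featurize, score):
        self.clean = clean
        self.featurize = featurize
        self.score = score

    def predict(self, raw_record):
        # The same cleaning and feature steps run at prediction time as in training.
        return self.score(self.featurize(self.clean(raw_record)))


def clean(record):
    """Tolerate the gaps and inconsistent formatting of raw transactional data."""
    return {
        "amount": float(record.get("amount") or 0.0),
        "stage": (record.get("stage") or "unknown").strip().lower(),
    }


def featurize(cleaned):
    return {
        "is_large": cleaned["amount"] >= 10000,
        "in_negotiation": cleaned["stage"] == "negotiation",
    }


def score(features):
    # Toy rule standing in for a trained model; the weights are made up.
    return 0.2 + 0.3 * features["is_large"] + 0.4 * features["in_negotiation"]


model = DeployableModel(clean, featurize, score)
messy = model.predict({"amount": "12000", "stage": " Negotiation "})
tidy = model.predict({"amount": 12000.0, "stage": "negotiation"})
sparse = model.predict({"amount": None})
print(messy, tidy, sparse)
```

Because the raw operational record goes through the bundled steps, nobody has to re-implement the data preparation by hand when the model moves into the operational system.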

Aible trains AI on that data and deploys the AI back into the customer’s own cloud account. It ties the predictions back via Boomi so future transactions of the same type have predictions and recommendations associated with them. It even adds explanations for these predictions into the Boomi integration. Any operational system with access to the Boomi integration can immediately leverage the predictions, recommendations and explanations. Aible even monitors the AI to confirm how well it is doing and suggests when it needs to be adjusted to maximize business impact. Of course, Data Scientists and business users can review and adjust every step of the process, from data to data cleaning and feature creation, to model training, to deployment, to monitoring.

IT and Data Science need to work together to deliver business impact from AI.

Because data science came out of analytics organizations, we’ve ended up with a world in which analytics teams control most of the AI process. But for AI to create value, its recommendations have to be embedded in transactional systems people use every day to make them actionable. That means IT needs to be core to the AI process – much more so than it is now. Now is a great opportunity for IT to step up and be a key player in the AI process in order to achieve the goal of every AI project – delivering business impact.
