Supply Chain
Technology Podcast

EPISODE 31 | AI-Mastered Data Quality For Decision-Making

Shailesh Mangal

Vice President Of Engineering, Roambee

We discuss AI’s potential in enhancing data quality in supply chain for optimal decision-making. And also how AI and human expertise can work together to deal with the increasing complexities of supply chain. We round off the episode with some practical steps when implementing AI solutions for data quality enhancement.

We’re currently working to get the key takeaways for this episode. Stay tuned to Roambee’s Supply Chain Tech Podcast for all the latest episodes to build a more resilient and sustainable supply chain.


Author 
Scott Mears
Senior Marketing Manager   

SUMMARY KEYWORDS

AI, data quality, supply chain, decision making, machine learning, data collection, data integration, data validation, predictive analytics, data sharing, data privacy, human expertise, data processing, actionable insights, data sources.

SPEAKERS

Scott Mears, Shailesh Mangal

 

Scott Mears  00:00

AI will always need a human to make sure it is doing its job correctly.

 

Shailesh Mangal  00:04

Thumbs down.

 

Scott Mears  00:05

AI is able to consume data faster than a human.

 

Shailesh Mangal  00:08

This will become true.

 

Scott Mears  00:09

AI will improve data quality in all supply chain and business?

 

Shailesh Mangal  00:13

100%

 

Scott Mears  00:20

Welcome to the Supply Chain Tech Podcast with Roambee. Scott Mears here, Global Field Marketing Manager at Roambee and one of the hosts of the Supply Chain Tech Podcast. We thank you for joining us today. In this episode, we speak with Shailesh Mangal. Shailesh is the Vice President of Engineering at Roambee. We discuss AI’s potential in enhancing data quality in supply chain for optimal decision making, and also how AI and human expertise can work together to deal with the increasing complexities of supply chain. We round off the episode with some practical steps when implementing AI solutions for data quality enhancement. Welcome, Shailesh, to the podcast. It’s great to have you on the podcast today.

 

Shailesh Mangal  01:02

Thanks, Scott. It’s a pleasure to be here.

 

Scott Mears  01:05

Fantastic. Now, I’ve been looking forward to speaking to you about this subject for quite a while, and as you heard in the intro, what we’re diving into is how we can harness AI’s potential to enhance data quality in supply chain for optimal decision making. So I’m going to get straight into it and ask you our first question today, Shailesh: how can AI technologies be leveraged to improve data quality in supply chains, and what specific AI-driven tools or algorithms are making a significant impact on this aspect?

 

Shailesh Mangal  01:50

Certainly. I think any journey in AI has to begin with the data. AI certainly provides a myriad of tools today for us to make sense of this data, but the journey has to start at the data, and that, very interestingly, brings us to our very first foray: how do we collect this data? You might have heard the term “garbage in, garbage out,” and it is very true. So this has to start from finding the relevant data, finding the right data, and finding the right mechanisms to collect this data. Then come the different techniques of data science and machine learning algorithms that help us clean this data. Cleaning can mean removing the noise inside the data, finding the relevant pieces, categorizing and tagging this data, and then boiling it down to a form in which this data can be consumed and observed. Such techniques span from very old, classic ETL technologies, where data can be crunched, down to more modern techniques where we can start looking at certain patterns as they emerge and use these patterns to train certain models in machine learning and AI to then drive towards a specific goal. A very important aspect of this is the observability of this data as it arrives in a continuous stream, and then combining and aggregating it to drive towards a certain goal, a certain result. In the context of supply chain, this can span multiple dimensions. For a classic sort of transportation business, it’s managing your lanes, managing transporters, managing your starting positions, and the different delays that happen, either something that can be controlled or something that is beyond your control. Just knowing that something downstream has the potential to fail, and then looking at all those items as risks, are some of the examples that can be used to justify the need for this data.
Coming to algorithms, I think we have a lot that has been invented today. I think we’re really, really fortunate that so much innovation has happened in this area. The algorithms largely depend on what use case you look at and how you apply them. But some of the basic ones that come to mind are simple clustering techniques. Take this above the basic premise of, let’s say, linear regression, going into maybe SVM-type algorithms. When you look at clustering, KNN can give some decent results. And if you go a little bit further, you can use DBSCAN or HDBSCAN types of grouping or clustering algorithms, and then a whole slew of other learning mechanisms. For example, Naive Bayes, I think we found it very useful. If you go further, random forest or ensemble learning can be good decision-making algorithms that can be applied to this data.
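To make the clustering idea above concrete, here is a minimal sketch of DBSCAN-style density clustering used to separate normal trips from noise. It is a small pure-Python illustration, not Roambee’s implementation; the feature choice (transit hours, average temperature), the sample values, and the `eps` and `min_pts` settings are all assumptions made up for the example.

```python
from math import dist

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point; -1 marks noise."""
    labels = [None] * len(points)          # None = not visited yet
    cluster = -1

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1                 # provisional noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster        # noise reachable from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is itself a core point, keep expanding
                queue.extend(reachable)
    return labels

# Hypothetical trips: (transit hours, average temperature in C).
trips = [(48, 5.0), (49, 5.2), (47, 4.9), (50, 5.1),
         (72, 5.0), (73, 5.3), (71, 4.8),
         (120, 15.0)]
labels = dbscan(trips, eps=3.0, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, -1]
```

The two dense groups come out as clusters 0 and 1, and the isolated trip is labeled as noise, which is the kind of record a cleaning step would set aside for review. In practice the features would need scaling so that no single unit dominates the distance.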

 

Scott Mears  05:27

I really appreciate that answer, and I appreciate you giving the detail of the different types of algorithms out there that you can use in supply chain. And it’s really true what you say: garbage in, garbage out. I think that’s so true, and something so important to remember when starting to leverage AI in our supply chains today. Moving on to our next question: data is collected from various sources in supply chain operations, but how does AI help in automating data collection, integration and validation to ensure the accuracy and completeness of the data?

 

Shailesh Mangal  06:08

Yeah, I think one thing that is very important when it comes to different sources of data is the ultimate result, which is trust. How do we take all this data that comes from different sources and convert it into intelligence that can be trusted? It is an extremely difficult and challenging problem to bring AI to a point where people can trust it, and the journey for this actually starts from knowing the different data sources, their granularity, the different levels of fidelity of this data, how this data plays a role, how this data is collected, what degree of error may get introduced just by the collection mechanism, and the immediacy of this information. Is it being collected as it is happening, or is it coming to the systems after the fact? And how long after the fact does it come in? All data has value, either historical or current, but knowing this level of fidelity plays a big role in creating decision-making systems. The second aspect is to take it from here and appropriately tag this data to play a role in your decision making. A very simple, layman’s example I will take here is, let’s say you collect data. One signal comes directly from a sensor, and you’re collecting it in real time. The other data stream is coming to you from, let’s say, a human scan, and it gets collected in some kind of online system before it gets distributed in a batch fashion. So there’s a time lag between the two. Each of these data streams has a different level of fidelity, and your models, as you build them, should give appropriate weightage to each one of them.
Now, you can start with an assumption, but I think the best system would be to play this in a feedback loop and come up with the right weightage and the parameter tuning that needs to happen to ensure that the level of accuracy and completeness is maintained throughout this exercise.
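The weighting-plus-feedback idea described here can be sketched as inverse-variance weighting, where each source’s weight is tuned from its observed error against a later-confirmed value. This is one common way to do it and only an assumed illustration; the class name, source names, and numbers below are hypothetical.

```python
class SourceWeights:
    """Tracks a running error variance per data source and fuses their readings."""

    def __init__(self, initial_variance):
        # Starting assumption per source, treated as one pseudo-observation.
        self.variance = dict(initial_variance)
        self.count = {name: 1 for name in initial_variance}

    def fuse(self, readings):
        """Inverse-variance weighted average of readings given as {source: value}."""
        weights = {s: 1.0 / self.variance[s] for s in readings}
        total = sum(weights.values())
        return sum(weights[s] * v for s, v in readings.items()) / total

    def feedback(self, readings, truth):
        """Fold each source's squared error against the confirmed value into its variance."""
        for s, v in readings.items():
            n = self.count[s]
            self.variance[s] = (self.variance[s] * n + (v - truth) ** 2) / (n + 1)
            self.count[s] = n + 1

# Hypothetical arrival-time estimates (hours) from a live sensor and a batched manual scan.
w = SourceWeights({"sensor": 1.0, "manual_scan": 1.0})   # start by trusting both equally
readings = {"sensor": 10.0, "manual_scan": 14.0}
first = w.fuse(readings)             # 12.0 while the weights are equal
w.feedback(readings, truth=10.0)     # the confirmed arrival matched the sensor
second = w.fuse(readings)            # about 10.22, now leaning on the sensor
print(first, round(second, 2))
```

Starting from equal trust and letting confirmed outcomes adjust the weights mirrors the “start with an assumption, then tune in a feedback loop” approach from the answer above.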

 

Scott Mears  08:27

That answer is interesting, because as people look to AI as this new, very attractive tool, not all of us have a full understanding, and we’re still trying to get a grasp on what it can offer. What you’re telling us there is that you really need to use it correctly. It’s not just a magic tool that can be implemented with no guidance. You need to know what you’re looking to use it for and understand how to integrate it correctly. So I think that’s really interesting, what you’re saying there, and I’d love to hear if you could share some examples of how AI-driven data analytics can detect anomalies and inconsistencies in supply chain data, and how businesses can use this information to make more informed decisions.

 

Shailesh Mangal  09:16

Right. So I think it’s a very interesting question you bring up, Scott, and I’ll probably take a step back before I get to the AI. I think the need for making such inferences and driving certain decisions from the data has been of paramount interest. In the traditional manner, people have tried to resolve it by putting in certain kinds of guardrails; we call them threshold values. For example, if it exceeds a certain level, tell me about it. If it goes beyond a certain point, tell me about it. With AI, I think this level of complexity, or call it dependency on humans to put in these guardrails, goes away. AI can be trained in a very unsupervised manner to learn from the past, and as humans, we can teach or tell AI what is acceptable and what is not acceptable. So we take the problem from being a step function, so to speak, to a very humanistic one, where crossing a single value doesn’t mean anything. In this unsupervised learning manner, we can take the premise of intelligence to the next level, because our values are not hard coded. They are not fixed, and essentially they’re not set by any human. They are automatically deciphered through training. So an example that I would give of the previous world, the current world and the future is that we used to draw geofences. Geofences are a concept that Google Maps made very popular in the last decade, I would say, or the decade before, and since then people have very fondly drawn these very interesting polygons around certain areas of interest and observed around them. What we have noticed through use of the technology is that geofences are quite notorious for a variety of reasons. One is that there is an inherent inaccuracy built into the system of GPS, where we depend on how many satellites you can view, and your GPS signal carries a certain level of inaccuracy. The second is, let’s say the signal is very accurate; even then, there is a human aspect to it.
An example would be, let’s say we are monitoring a warehouse, and the warehouse is used to collect all the packages, pallets and containers. We have found in practical usage that people start using space outside of the warehouse. Sometimes the container size is pretty huge, and it’s perfectly all right to leave it by the wall, outside of the geofence. In such cases, a traditional form of data evaluation will mark the container as not being inside the geofence, and hence you will stop recognizing that the container is even there. This is addressed by the use of elastic geofences. An elastic geofence essentially defies the notion of a fixed geofence. It draws the boundary based on the presence of signal over a period of time, and it keeps its boundaries quite elastic. You can sort of imagine a fuzzy circle or a polygon that you draw around the site which is not a fixed size. It’s a flexible boundary that automatically gets adjusted based on the usage of the land, not only around the building you have, but beyond it. And this notion can be extended to multiple different areas. It can be applied to a fixed range of temperature settings. It can be applied to shock levels. It can be applied to all kinds of condition monitoring, where we go beyond the fixed boundaries, look at the trends in the past data, and do not limit our imagination to what we fix and observe. Instead, we rely on the AI to find the patterns, and when the data departs from those patterns, that gets recognized.
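The elastic geofence described above can be sketched as a boundary whose radius follows where assets have actually been seen dwelling, rather than a hand-drawn fixed shape. The version below is a deliberately simple circular illustration under stated assumptions: planar coordinates in meters, a radius set from a high percentile of recent dwell distances, and hypothetical names and parameter values throughout. It is not the method used by any particular product.

```python
from collections import deque
from math import hypot

class ElasticGeofence:
    """Circular fence whose radius adapts to recently observed dwell locations."""

    def __init__(self, center, base_radius_m, window=200, quantile=0.95, margin=1.1):
        self.center = center
        self.base_radius_m = base_radius_m       # hand-drawn starting radius
        self.distances = deque(maxlen=window)    # recent dwell distances from the center
        self.quantile = quantile
        self.margin = margin                     # small allowance for GPS inaccuracy

    def _distance(self, point):
        return hypot(point[0] - self.center[0], point[1] - self.center[1])

    def observe_dwell(self, point):
        """Record a location where an asset was seen stationary at this site."""
        self.distances.append(self._distance(point))

    @property
    def radius_m(self):
        if not self.distances:
            return self.base_radius_m
        ordered = sorted(self.distances)
        idx = min(len(ordered) - 1, int(self.quantile * len(ordered)))
        # Never shrink below the drawn fence; stretch to cover the observed usage.
        return max(self.base_radius_m, ordered[idx] * self.margin)

    def contains(self, point):
        return self._distance(point) <= self.radius_m

fence = ElasticGeofence(center=(0.0, 0.0), base_radius_m=50.0)
by_the_wall = (60.0, 0.0)                  # container parked just outside the drawn fence
before = fence.contains(by_the_wall)       # False with the fixed boundary
for _ in range(20):
    fence.observe_dwell(by_the_wall)       # repeated dwell reports from that spot
after = fence.contains(by_the_wall)        # True once the boundary has stretched
far_away = fence.contains((500.0, 0.0))    # still False
print(before, after, far_away)
```

Deciding which dwell reports belong to the site at all is left to the caller here; a real system would need that step so that unrelated stops do not stretch the boundary.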

 

Scott Mears  13:25

Yeah, I really appreciate your answer to that question. There’s a lot of depth there, and it brings a lot more clarity to this subject as a whole. I appreciate the great examples you’re bringing out from supply chain visibility, and I do encourage you, if we’ve got any examples from other companies, those would be great to learn more about as well as we go on. With the complexity of global supply chains, data can be very overwhelming. How does AI assist in data processing and presenting actionable insights to supply chain managers and decision makers?

 

Shailesh Mangal  14:10

You touch upon a very interesting and very contemporary subject. I think data consumption is probably one of the largest challenges of today’s world. There is so much data that we generate; I think it’s orders of magnitude more data. A classic sort of comparison: there’s probably more data in one iPhone today than the entire world had in the 1950s and 60s. So you can imagine the speed at which data is growing. The challenge it presents, essentially, is how do you consume this data? And I think AI is quite interesting in the sense that it can humanize the consumption of this data. That’s probably one of the biggest changes that we are seeing this year compared to previous years, where the invention of transformer learning and LLMs means the machine learns about the data and humans only get their questions answered. So today, we are not enamored by this massive explosion of data, and going forward, the systems that we’re building will not have standard dashboards. Nobody’s throwing data in your face for you to consume. People are very much capable now of getting answers to their questions, rather than having to consume reams and reams of CSVs or Excel reports or emails, for that matter. So one of the big areas where innovation is happening, and where the reduction in data you have to consume is speeding up, is that with the help of AI, we can ask a specific question. We can turn those questions into dashboards, or we can have very nebulous questions. An example: you could today conceptualize a question like, “I’ve got 70 different lanes running over 500 shipments. Which are the most important shipments that I should be worried about?”
And you can totally ask this question today and expect a very accurate answer: of the 500, there are 13 shipments that are running a high risk of delay, or are running in high-risk areas, or have gone into some kind of an excursion that you should be looking at. So you can use a very communicative and conversational manner to consume this data, and all of this is possible through our journey in AI and being able to distill it down to what is most meaningful. Then you can take your conversation and turn it into a repeatable, auto-generated response that can be delivered in the way you want to consume this information, down to the level of a few lines of a tweet or, shall I say, Threads.
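Behind a question like “which shipments should I worry about?” there has to be some ranking of shipments by risk. The sketch below shows only that distillation step, with a hand-written score standing in for whatever model a real system would use; the fields, weights, and threshold are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass

@dataclass
class Shipment:
    shipment_id: str
    predicted_delay_hours: float
    in_high_risk_area: bool
    has_excursion: bool          # e.g. a temperature or shock excursion

def risk_score(s):
    """Illustrative score in [0, 1]; the weights are assumptions, not learned values."""
    score = min(s.predicted_delay_hours / 24.0, 1.0) * 0.5
    score += 0.3 if s.has_excursion else 0.0
    score += 0.2 if s.in_high_risk_area else 0.0
    return score

def shipments_to_watch(shipments, threshold=0.5):
    """Return only the shipments above the threshold, highest risk first."""
    flagged = [s for s in shipments if risk_score(s) >= threshold]
    return sorted(flagged, key=risk_score, reverse=True)

fleet = [
    Shipment("A", 0.0, False, False),
    Shipment("B", 24.0, False, True),
    Shipment("C", 12.0, True, False),
    Shipment("D", 6.0, True, True),
]
watch = shipments_to_watch(fleet)
print([s.shipment_id for s in watch])  # ['B', 'D']
```

A conversational layer would sit on top of a step like this, turning the short list into a plain-language answer or a recurring summary.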

 

Scott Mears  17:15

I think that’s really the piece that excites me so much: the sheer speed at which AI can give us these answers, allowing us to make these informed decisions as business owners running our supply chains. And I guess we’ve seen that on the fun side with ChatGPT, where everyone’s asking fun questions in their lives, and we’re seeing the reality of what that could look like. Knowing that you can have a version of that in supply chain is, I think, super exciting for the supply chain managers of the world. What I want to know now is this: predictive analytics is really gaining traction, and it’s a big piece that lets us predict more and more within supply chain. I want to understand how businesses can utilize this technology to anticipate potential data quality issues in advance and take preventative measures to deal with them.

 

Shailesh Mangal  18:14

Right. So I think the foundation of the future can be built on the learnings of the past. When we look at the future, it’s very important to first understand what the past has been and what the driving criteria for the past were. From there, I think, it’s knowing the business, and I’m saying that in a very quantitative manner. Do today’s business owners know their business in terms of numbers? Can you boil it down to a few scores that you can track? Knowing your business metrics, what really moves the needle for you, making that the central focal point of your attention, and driving your entire department to meet your criteria in those business metrics is the starting point of building a successful predictive model. Because once you have the things that really matter, that drive your business metrics, you can start looking at what the future will look like. If I know my lane velocity, I can then look at how I improve my lane velocity. If I know my transporter metrics and transporter effectiveness score, I can then work on finding the right transporters and push my transporters to deliver on time, deliver in full, and deliver with high quality. So it starts from knowing what the business is and where we are today, to then plan for the future. Once we have this foundation laid out, the future becomes quite simple. Once we have these KPIs, the machine learning models of today are capable of taking all this data in, identifying the different levels of causality, identifying what is impacting these metrics, and then predicting how they will behave going forward. Yes, it requires a bit more work in terms of taking care of seasonal changes, weather patterns that can impact the supply chain, and specifically demand patterns as they change from holiday times to non-holiday times.
And then once you have, let’s say, two years’ worth of data, all of this can be used to train models that can help you predict what the future will look like. And not just that. You can actually run some fictitious models and make changes in the past data. We call it synthetic data. You produce synthetic data to see the impact of this on the future. So not only can you know the future, you can actually have the power of changing the future. It sounds very sort of surreal, but it’s definitely doable from our perspective. And the last point is, I think, align yourself with the industry-standard KPIs that are available. We very strongly propose the SCOR model, which stands for Supply Chain Operations Reference, that the world basically aligns to. If you have these kinds of comparison points, not only can you know your business, you can compare it with similar industries and similar businesses, and how effectively and efficiently those are run. And if you’re interested in more detail on this, it is an area of very deep interest for us, because that’s what we do at Roambee. We would be happy to discuss and go into more detail in this area.
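The two ideas in this answer, knowing the business as a number and then altering past data to see the effect, can be sketched with an on-time-in-full (OTIF) rate and a naive what-if over a synthetic copy of the history. This is a simplified stand-in for a trained predictive model: it only recomputes the KPI on altered records, and the record layout, lane names, and figures are hypothetical.

```python
def otif_rate(deliveries):
    """Share of deliveries that were both on time and in full."""
    hits = sum(1 for d in deliveries
               if d["transit_h"] <= d["promised_h"] and d["in_full"])
    return hits / len(deliveries)

def what_if_faster_lane(deliveries, lane, hours_saved):
    """Synthetic copy of the history with one lane sped up by hours_saved."""
    return [
        {**d, "transit_h": d["transit_h"] - hours_saved} if d["lane"] == lane else dict(d)
        for d in deliveries
    ]

# Hypothetical delivery history.
history = [
    {"lane": "A", "transit_h": 50, "promised_h": 48, "in_full": True},
    {"lane": "A", "transit_h": 47, "promised_h": 48, "in_full": True},
    {"lane": "A", "transit_h": 52, "promised_h": 48, "in_full": True},
    {"lane": "B", "transit_h": 70, "promised_h": 72, "in_full": False},
]

baseline = otif_rate(history)                                # 0.25
scenario = otif_rate(what_if_faster_lane(history, "A", 4))   # 0.75
print(baseline, scenario)
```

The original history is left untouched, so several scenarios can be compared against the same baseline. A real model would also account for seasonality and other effects mentioned above.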

 

Scott Mears  21:40

It’s so fascinating. The piece that I really found fascinating there was that you’re using the past data to predict the future, and you can actually change some of the data points to see how that would affect the future. I find that a really interesting insight into the level of depth AI can bring us. Coming to data sharing: data sharing among supply chain partners is essential for collaboration. We see more and more partnerships in supply chain. We do it at Roambee, with more collaborations to bring a more rounded and fuller picture for our customers and make it easier for them. How does AI address data privacy concerns while enabling seamless data sharing to improve overall data quality?

 

Shailesh Mangal  22:45

Yeah, I think that’s another very interesting point. Specifically, platforms like Roambee and others who have this layer of access to data from different players make it very interesting and challenging at the same time. I believe privacy concerns are very important. We take them very seriously, and they need to be taken seriously. However, if you look at the problem statements that different industries face, or that different organizations within the same industry type face, they are not very different. If there is congestion at an outbound port in LA, it will impact everybody who is shipping from that port. So as we learn, we anonymize many of these models, and the data can be shared across without having to reveal any information that is sensitive to that business. We don’t really need to know what you are transporting, for example what the container is carrying, which is very, very sensitive. But the challenges that you’re subjected to are very shareable. So our learnings as we make them, be it at the lane level, be it at the transporter level, be it at the weather level, are pretty much specific to geos and to industries, and those challenges can be shared. AI essentially makes it very interesting. We just need to categorize this data by industry type or lane type, and it makes sharing that knowledge, that information, super easy. We can apply similar types of challenges to the same types of things and really separate them out. So, for example, the challenges of shipping with a vessel are the same or similar for anybody who’s using that vessel. Whether you come from a pharma industry, a supply chain industry, or an auto industry, the challenges will not be very different as long as the mode of transportation is the same and the lane is the same.
So a lot of these commonalities can be extracted using the different models that we have, and the common aspects can be applied according to how the entire journey fits together. We even have the intelligence today to split up the different parts of the supply chain and pick out the pieces, so that only the common pieces are clubbed together to drive a better analysis overall, and then to create a completely new journey, or new intelligence on top, where the origin of each part is actually irrelevant. It doesn’t matter where it came from, as long as we use these pieces to collect data from multiple lanes, multiple customers, multiple industries, and so on. So AI gives us this level of granular control over driving towards a decision that can benefit everyone.
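One way to picture sharing the challenge without sharing the cargo is to aggregate delays by lane and transport mode, drop customer and cargo fields from the output, and suppress any group with too few contributing customers. The sketch below assumes that approach for illustration only; the field names, the lane labels, and the `min_customers` rule are hypothetical and not a description of any platform’s actual privacy controls.

```python
from collections import defaultdict
from statistics import mean

def shareable_lane_stats(records, min_customers=2):
    """Aggregate delay by (lane, mode); output carries no customer or cargo fields."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["lane"], r["mode"])].append(r)

    stats = {}
    for key, rows in groups.items():
        if len({r["customer"] for r in rows}) < min_customers:
            continue  # too few contributors: the figure could point back to one business
        stats[key] = {
            "shipments": len(rows),
            "avg_delay_h": mean(r["delay_h"] for r in rows),
        }
    return stats

# Hypothetical records from two customers.
records = [
    {"customer": "c1", "cargo": "vaccines", "lane": "LA-outbound", "mode": "ocean", "delay_h": 10},
    {"customer": "c2", "cargo": "auto parts", "lane": "LA-outbound", "mode": "ocean", "delay_h": 20},
    {"customer": "c1", "cargo": "vaccines", "lane": "inland-1", "mode": "road", "delay_h": 5},
]
shared = shareable_lane_stats(records)
print(shared)
```

Only the lane shared by both customers appears in the result, with a shipment count and an average delay; the single-customer lane is held back.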

 

Scott Mears  25:41

That’s fantastic to hear. And I know the listeners hearing this now are writing down lots of notes with pen and paper at the moment. When it comes to the human element with AI, it’s always an interesting conversation. How long will humans and AI operate together? How much does AI need the human touch? So, how can AI and human expertise work together to create a robust data quality framework in supply chains?

 

Shailesh Mangal  26:15

I think the human is one of the most important parts. As we see, AI has a lot of promise, but this promise comes from the fact that a human is in the loop. They are consuming this information, and this creates a very interesting and very important feedback loop, so to speak. The ultimate purpose of this information is to make informed decisions to make the future better. So this drives towards relevance of the data, and humans will help AI to learn what is relevant and what is not relevant, how this data is presented, and what the accuracy of this data is. By providing this feedback, AI can learn from this closed-loop interaction and can continue to keep making itself better. Then there is applicability. We looked at the relevance. Now that the data is relevant, how applicable is it to making a decision? How timely are we in providing this data, so that action can be taken right away, or action can be taken in the form of planning for next quarter, next year and so on? This closed-loop control and the involvement of humans is what makes AI a lot more non-artificial, I would say.

 

Scott Mears  27:33

Interesting. My final question to you, before we have our little fun segment at the end, is: what are some practical steps that companies can take to implement AI solutions for data quality enhancement in their supply chains, and what potential challenges should they be prepared to address?

 

Shailesh Mangal  27:57

Well, I think it starts with making data your centerpiece. So adopt practices that drive towards collecting data. What you can’t measure, you can’t control, right? Although it’s not 100% true, there is some truth to that: you need to at least make it measurable. So look anywhere you have an opportunity to digitize, to collect any form of data. Then from there, you drive towards the parts of this data that really make a difference. What is it that you really care about? Is it on-time delivery? Is it on-time receiving? Is it completion and fullness? Is it the quality of your product? Is it the recalls that you are having? Is it the transportation cost? There are many, many such ideas out there, and some of them you already have. Everybody probably has some ways they’ve figured it out, but it’s not the complete picture. So get that practice going and get started from there. And it’s hard. There will be certain data sources that are not available, or are not accurate, or are quite delayed. So then you find newer ways. If you don’t have a signal that you can get data from, then create that signal. Sensor cost, we have seen, has significantly reduced in the last decade; what used to cost hundreds of dollars is available for pennies today. So if you did this exercise five years ago, I think it’s time to relook at that. There is no harm in generating your own data. That’s one. Second is, you don’t have to digitize everything. As long as enough samples can be taken, this can be a lot more cost effective than you think. So your first challenge is to start collecting this data. The next is to employ some type of ability to drive this data to certain decisions. It could be in the form of a place where you can ask questions. It could be a simple dashboard.
It could be a report, but something that you can crunch this down to. Now your job will be to convert this data into standard KPIs. There are certain ones that we have very openly talked about. There are more that the industry has. Pick any of these, but have something that you can define and refine your business with. And then ultimately, feel free to break away from the mold of “we’ve been doing it this way for the last 30 years; if it ain’t broke, why fix it?” I think there is a need to keep innovating. Use this data to drive your monthly meetings and drive your teams to deliver, and because you’re measuring it, the effect will be a lot more clear, a lot more in front of you. And AI can certainly help by reducing the time to get you the data, once you know what data you need. The challenges that you will face are at collection time, at crunching time, and then finally at adoption time: are your teams open to taking in this type of feedback? It will take some time and training, but it’s definitely doable, and the results will be amazing. I think you’ll realize that what you thought were the places for inefficiencies are actually not the places for inefficiencies, or that you have a bigger bang for the buck where you can make minor changes and get a lot more done from that. At the end of it, if nothing else, you will at least be a lot more informed to make certain decisions. If I cut this one lane, what will happen? If I remove this part, what will happen? If I add one more over here, what will happen? Or where are we spending most of our time? These are some of the basic questions that we get asked on a daily basis, and we see customers getting surprised constantly around these areas.

 

Scott Mears  31:57

I mean, there have been so many learnings during this episode. But if anyone was to take one learning from this episode, I feel like it was that answer right there. You’re going to be surprised. And it really is here and now, and we shouldn’t be stuck in those old traditional ways of, you know, “if it’s not broke, don’t fix it.” So let’s move on to our fun little segment of thumbs up, thumbs down. What I want to bring you into, Shailesh, is this: I’m going to put six questions to you that need just yes or no answers, and if you could give me a thumbs up or thumbs down, and also say it for the audio listeners, that would be great. So let’s give it a go. Are you ready to go for this? Yep, let’s go. Brilliant. So, okay, number one: AI will always need a human to make sure it is doing its job correctly.

 

Shailesh Mangal  32:57

Thumbs down. AI can do it on its own. Interesting.

 

Scott Mears  33:00

AI is able to consume data faster than a human.

 

Shailesh Mangal  33:06

At some point, this will become true.

 

Scott Mears  33:09

AI will not miss an anomaly in data.

 

Shailesh Mangal  33:15

That’s a tough one. Thumbs down. It can always make mistakes.

 

Scott Mears  33:19

AI can see opportunities in data that humans cannot.

 

Shailesh Mangal  33:24

That’s 100% thumbs up. Even though humans probably can, it’s just too much for them to go through everything. So yes, they can miss things. The likelihood of a human missing something is a lot more than AI.

 

Scott Mears  33:37

Wow. AI will drive more collaboration between stakeholders?

 

Shailesh Mangal  33:42

Absolutely, thumbs up.

 

Scott Mears  33:44

And finally, AI will improve data quality in all supply chain and business? 100%. Wow. Well, I know I’ve got a lot of learning there, and I know the listeners and watchers will too. So thank you so much, Shailesh. It’s been a great episode, and I really appreciate jumping into AI more deeply with you.

 

Shailesh Mangal  34:04

Thanks, Scott. I really enjoyed it. I think your questions were spot on. I think this is our future.

 

Scott Mears  34:11

Great. We’ll just do a little wave together to the listeners. We’ll say thank you very much, if you just throw your hand up there and say goodbye, and we’ll say goodbye and see you next time. Bye, bye. Thanks everybody.

 

Shailesh Mangal  34:25

Thanks for watching. Thanks for having me here. Thanks, Scott.

 

Scott Mears  34:29

Thanks. Hi. My name is Scott Mears, and I’m one of the hosts of the Supply Chain Tech Podcast with Roambee. On this podcast, we talk to supply chain heroes from around the world about everything, ranging from the disruptions related to supply chains, their personal experiences with tracking technologies, strategies to build resilience, and much, much more. We already have some recommended videos for you to the side of me, and if any of this sounds interesting to you, do subscribe to our YouTube channel and hit the bell icon so you don’t miss another Roambee video. I’ll see you next time.
