AI Agents

In this week's #TechTalk, Amit and Rinat dive into the world of AI agents. They explore the rise of AI agents, their potential applications, and the revolutionary changes they bring to various industries. From simplifying daily tasks like travel booking to advanced project management and even self-driving cars, AI agents are at the forefront of the AI revolution. The hosts discuss the differences between generative AI models like ChatGPT and AI agents, address concerns about digital inequality, and ponder the future of AI in everyday life. Tune in for an in-depth conversation on how AI agents are shaping the future.

Transcript

Rinat: 00:00:15

Hi everyone. Welcome to Tech Talk, a podcast where Amit and I talk about all things tech. Today we're gonna talk about AI agents. I'm sure everyone's heard about AI and has used ChatGPT and similar solutions and is excited about what's coming next. We wanted to talk about this popular topic because there is a lot of misinformation going around as well. And we also have our 2 cents about what the future holds, where the AI movement is heading, and what kind of products we can expect in the near and far future. AI agents are one particular topic worth diving deep into for our listeners. It is an important part of the whole AI revolution and I'm excited to talk about it today. So thank you, Amit, for coming up with this topic. Tell us, what are AI agents to begin with?

Amit: 00:01:15

Thanks, Rinat, for that great introduction, and yes, AI agents are currently in a boom stage. There are a lot of companies working on AI agents, and I thought it's a very relevant subject given that we have now explored so many different aspects of AI and machine learning models. This is the next step, where we discuss AI agents. AI agents essentially are agents that do things on your behalf. So say, for example, in your day-to-day life, you go to a travel agent and you ask the travel agent to book a ticket for you. You don't bother about searching the internet, you don't bother about looking at the flights.

Amit: 00:01:53

You just tell the agent, I need to go from this location to that location, find the best possible flight, given my constraints, and at the best possible price. And the agent will do all the work for you, and they'll give you a couple of options, and then you decide, okay, this is the best option. Please book the ticket, and the agent will then book the ticket on your behalf.

Amit: 00:02:14

This is how something happens in a real-world scenario. Now imagine you give the software something to do, a task to perform, but the task is defined within an environment. So you tell the AI agent, create a website for me. To create a website, it needs an internet connection, it needs to host the code somewhere.

Amit: 00:02:35

It needs to write the code, and then it needs to deploy the code so you can actually see the website. But the whole thing is based in an environment. The environment is the internet, and then it uses certain tools to perform those tasks. The difference between a generative AI model, such as a large language model like ChatGPT, and an AI agent is that ChatGPT will only generate text for you. It'll not perform certain actions.

Amit: 00:03:02

For the agentic behavior, ChatGPT is now almost an agent, because you can ask it to generate an image and it'll generate an image for you. If you ask it to generate a presentation, it'll generate a presentation for you with images in it. It can generate HTML code. It can generate a website, it can do a lot of things now. But when it was first released to the public, it could do only code generation. So AI agents perform certain tasks within a specific environment. The specific environment could be website generation, it could be image generation, it could be code generation, et cetera.

Amit: 00:03:37

So it could be a large language model talking to an API, asking the API to do something, getting the response from the API, doing some things with it, and then giving you the results. So this is essentially what an AI agent is trying to do. It's trying to replicate what we do in the real world. I've given you the example of a travel agent, but it could be any agent. It could be an event planner. It could be someone who does project management. So suppose you want to manage a project. You want to create a plan. You can ask the AI agent to create a plan for you, given the project requirements and constraints.
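As a rough illustration of the "ask an API, work on the response, hand back options" flow Amit describes, here is a minimal Python sketch built on his travel agent example. Everything in it is an assumption made for illustration: `search_flights_api` is a hypothetical stand-in that returns canned data rather than calling a real flight service, and the `Flight` fields and constraint names are invented.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Flight:
    code: str
    price: float
    stops: int


def search_flights_api(origin: str, destination: str) -> List[Flight]:
    """Hypothetical stand-in for a real flight-search API; returns canned data."""
    return [
        Flight("AB123", 420.0, 1),
        Flight("CD456", 380.0, 2),
        Flight("EF789", 510.0, 0),
    ]


def suggest_flights(origin: str, destination: str,
                    max_price: float, max_stops: int,
                    limit: int = 2) -> List[Flight]:
    """Ask the API, filter by the user's constraints, return the cheapest options."""
    results = search_flights_api(origin, destination)
    suitable = [f for f in results
                if f.price <= max_price and f.stops <= max_stops]
    return sorted(suitable, key=lambda f: f.price)[:limit]


# The user still picks from the shortlist before anything is booked.
options = suggest_flights("LHR", "NRT", max_price=500, max_stops=1)
print([f.code for f in options])
```

The agent only narrows the choices here; the final decision stays with the person, as in the travel agent analogy.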

Rinat: 00:04:15

...the end of November, December 2022.

Rinat: 00:04:33

But even before that, those of us in tech had used pockets of AI technology in different areas when working, and when a large language model like ChatGPT was released to the public, we were all amazed. And soon after ChatGPT, there was Claude, there was Gemini; all the big tech companies were releasing their own models.

Rinat: 00:04:55

And then a few months after that, ChatGPT released newer versions, which were more accurate in terms of the way they talk, added more customizations, et cetera, et cetera. And then there was image generation, and then there was video generation. So many types of AI flooded the market with all of these amazing things you can do. And then every six months or so, we've been getting more and more updates and we're getting more and more excited. This is such an exciting field, where innovation is just happening every month. It's really fast-paced progress, and now we're thinking, what's next? And as humans, we get bored quickly as well. So even though these were such revolutionary, game-changing solutions that have changed our lives forever, and the way we do anything on the internet changed overnight, still we are just like, what's next?

Rinat: 00:05:52

And why can't it do this other thing? It responds as if it knows. We know that AI doesn't really know or comprehend what's happening, but it can generate usable responses, and if only it could translate those into actions. And that's the one thing that has been missing since it was released.

Rinat: 00:06:16

The first thing people saw was like, yes, automation is revolutionary, AI is revolutionary, so why can't we combine the two? And agentic automation is one of the new terms that's coming about, where AI agents are doing various tasks that humans would do. And now with these large language models, you could think that it could handle a lot more complex tasks that weren't possible before, or even if they were possible, weren't really viable in terms of getting a proper ROI in time, et cetera.

Rinat: 00:06:50

But now it seems like it's so close, it just needs that one connection. And agentic AI or AI agents can bridge that gap that everyone's waiting for. Nowadays you can't really call Alexa or other smart devices AI, because AI has come so far, but that kind of smart automation system can do things in the real world, can change things in the real world. It can turn on my light and do other scheduled, automated things.

Rinat: 00:07:21

Now, if that was somehow connected to an AI system, it could be a large language model, which can interpret what needs to be done. Obviously, being trained with language, it knows what sentence should be next, but I don't know how. But being trained with potentially actions or responses, it could maybe understand what action should be next. That's just my very oversimplified version of what AI agents might do and might be trained on. What's your thought on that, Amit?

Amit: 00:07:59

You've hit the right points. Earlier they were just doing generation or they were just searching the internet, but now they're performing actions. And that's what the AI agent is. It is using the AI tool to perform certain things. Let me give you an analogy. Say, for example, you call a gardener to your house and you ask the gardener to do some things in your garden. The gardener will get some tools to perform those things. Now, the gardener may not be good at, say, sawing wood by hand, but with a power tool he can do it much more quickly, and he doesn't need a lot of physical strength to do it, right? So with a garden tool, he can quickly axe a tree. Then with a hedge cutter, he or she can quickly trim a hedge. Then with a lawnmower he can mow the lawn very quickly. So these are certain tools that help the gardener. The gardener has in his head a plan for how he wants to sort out your garden, but the gardener doesn't have a predetermined algorithm.

Amit: 00:09:00

So now let's come to the software or digital world. In the digital world, if you look at things before AI, we had to write an algorithm. If this happens, do this. If that happens, do this. If you don't see this, don't do this, et cetera, et cetera. With AI, you can give it anything and it'll come up with a good enough response.

Amit: 00:09:21

So it means you can now cater to any sort of combination of prompts, so you can deal with any type of input. So now you don't have to write an algorithm, you just have to train the model on one thing and then give it access to a tool. Now we know that large language models are not very good at, say, arithmetic calculations.

Amit: 00:09:38

So suppose you tell ChatGPT to do some calculation. You can give ChatGPT access to a calculator. So ChatGPT will now say, okay, Amit has given me a task. The task involves doing a mathematical operation. In order for me to do this mathematical operation, I need to do this. And as part of that, I need to contact a calculator, give the input, ask the calculator to do the calculation, then send me the response, and then I give the response back to Amit, and then Amit checks whether it's right or not. So this is what any agent is trying to do. So instead of trying to do everything itself, or trying to train the LLM to do everything itself, you have different tools.
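The calculator hand-off Amit walks through can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `fake_model` is a hypothetical stand-in for the language model deciding which tool to call (a real model would produce that request itself), and `calculator` is a small safe evaluator written for this example, not any product's actual tool.

```python
import ast
import operator

# Operators the hypothetical calculator tool supports.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}


def calculator(expression: str):
    """Evaluate basic arithmetic by walking the parsed expression (no eval())."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval").body)


TOOLS = {"calculator": calculator}


def fake_model(task: str) -> dict:
    """Stand-in for the LLM: turn the task into a structured tool request."""
    prefix = "compute "
    expression = task[len(prefix):] if task.startswith(prefix) else task
    return {"tool": "calculator", "input": expression}


def run_agent(task: str) -> str:
    request = fake_model(task)                         # 1. decide which tool is needed
    result = TOOLS[request["tool"]](request["input"])  # 2. call the tool
    return f"The answer is {result}"                   # 3. report back to the user


print(run_agent("compute 12 * (3 + 4)"))
```

The model never does the arithmetic itself; it only routes the input to the tool and relays the result, which the user can then check.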

Amit: 00:09:45

So ChatGPT contacts DALL-E to generate an image. We see inside the ChatGPT interface that it's generating an image, but it's actually using DALL-E. If you ask it to create a video, it is using Sora, the OpenAI model for video generation. So similarly, the different AI agent tools can use different APIs, so they can use a Google API to perform certain things, et cetera, et cetera.

Amit: 00:10:42

So that is what an AI agent is trying to do. Now, if you break down what it's trying to do into simpler steps, it's essentially trying to first think what the problem is, then trying to come up with the steps in order to solve that problem. Then it wants to execute those steps. And once those steps are executed, it wants to validate whether the execution has happened or not. Essentially, this is what happens. So suppose you go for an AI website generation tool. You give it a text prompt. Say, I want to generate a website for a vegan shop. So it'll first think, okay, what does a shop need?

Amit: 00:11:21

A shop needs a homepage. A shop needs some menu items. A shop needs a cart, a shop needs a checkout section. A shop needs this. So it has now come up with a set of things that it needs to do. Now you can ask someone to validate it. It could be a human, or it could be another AI agent saying, okay, let's validate what steps you have come up with, and if there are some unnecessary steps, let's delete that.

Amit: 00:11:46

And whatever steps are possible, we will execute only those. So once the validation is done, it'll now execute all the steps. And it'll do it in sequence, or it can do it in parallel, and then it'll give you the whole website, and then it'll also give you the code, and then you can see the website.

Amit: 00:12:04

So this is all done using an AI agent. Now, I've not written an algorithm for it. I just gave a text prompt: I need a website. It came up with all the things that a normal human would do. So if you ask a developer to do the same task, they would come up with, okay, what do I need? What images do I need, what should the structure of the page look like, is it responsive or not? So it has now thought of everything by itself, and that is the power of an AI agent.
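The think, plan, validate, execute, check sequence described for the vegan shop website can be sketched as follows. This is a minimal sketch, not how any particular website generator works: `plan` is hard-coded where a real agent would ask a model, the `allowed` set stands in for the human or second agent that reviews the plan, and `execute` only records that a step ran.

```python
def plan(goal: str) -> list:
    """Stand-in for the model's planning step (hard-coded for illustration)."""
    return ["homepage", "menu", "cart", "checkout", "mascot animation"]


def validate(steps: list, allowed: set) -> list:
    """Keep only the steps the reviewer (a human or another agent) approves."""
    return [step for step in steps if step in allowed]


def execute(steps: list) -> dict:
    """Pretend to build each part and record the outcome."""
    return {step: "built" for step in steps}


def build_site(goal: str, allowed: set):
    steps = validate(plan(goal), allowed)
    results = execute(steps)
    # Final check: confirm that every approved step actually ran.
    missing = [step for step in steps if results.get(step) != "built"]
    return results, missing


results, missing = build_site(
    "website for a vegan shop",
    allowed={"homepage", "menu", "cart", "checkout"},
)
print(results, missing)
```

The reviewer dropping "mascot animation" corresponds to deleting an unnecessary step before anything is executed.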

Rinat: 00:12:32

You've touched upon this part where, when it needs to generate an image, it contacts DALL-E, but those are all within the OpenAI ecosystem. But now, if I ask it to just do a Google search or even, as per your example, do a calculation using a calculator, not relying on the large language model, but just use a calculator and do a calculation. Now, the calculator app within Windows, that is outside the OpenAI ecosystem. So I have to give certain permissions to this entity, whatever you wanna call it, an agent of OpenAI. I have to give it some permissions over which we don't really have control. And whatever it does, it will do it so quickly that it's done before I realize, and it might decide that, hey, why don't I do the calculation by seeing what the bank balance is, and then I can send a particular sum of money somewhere else and see what the remaining balance is. And that would be the answer to your calculation. And that could be a completely different, quite devastating outcome. First of all, we can see that a large language model using an API within the OpenAI ecosystem, generating similar data and outputting it, that's totally different from actually going into a website within my operating system and clicking buttons or executing a script or whatever, and then getting the result. How would that happen? That's one thing. And then, what is the impact of it getting it wrong? Because it's gonna happen so quickly that I wouldn't have time to tell it to stop.

Amit: 00:14:17

So initially, when AI agents came into the picture, they used to generate a plan, and then a human could validate the plan and then say, okay, execute this plan. And when you do the validation, you can actually modify the plan itself. So if it comes up with 10 steps, you can say, okay, remove this step or that step, step eight or step nine, and then perform the remaining steps.

Amit: 00:14:36

So it had humans to tell it what to do, but now it is automatic. Now, when it comes to the question of what's there on the web and what access I need to give, think of it like this. LLMs are trained on a specific data set. AI agents are trained for a specific environment with a specific set of tools.

Amit: 00:14:56

So the environment for an agent which is sitting inside a self-driving car is the world around it, right? And the tools that it has are the camera sensors, the pedals, the steering wheel, the accelerator, et cetera. So these are the tools. So based on the environment, it'll assess what it should be doing, and then it'll make adjustments with the tools that it has: steering wheel, accelerator, et cetera. Now, it has been trained only for that. You cannot expect that self-driving car to suddenly learn how to operate in water. It's not a speedboat, right? So similarly, when we tell an AI agent to do a calculation, it'll do it via a web browser.

Amit: 00:15:41

Currently, OpenAI has access to the web browser. So basically if you tell it to go to a URL and summarize it, it'll do it for you. If you say, give me the output of this mathematical thing by using Google search, it can go to Google search, and then Google can do the calculation, and then it can use that response.

Rinat: 00:17:07

So this is where I want to understand. That's actually a very good analogy, or example, that you gave in terms of the environment of that AI system in a self-driving car. It has a particular set of inputs and a particular set of outputs, and those are the only things that it can do. That's all fine. Now, with the web browsing part, I'm just thinking, how viable would it be? Because what we don't usually think about is that a web browser like Chrome or Firefox or even Edge is actually a desktop application. They are not the internet. You access the internet from these web browsers. So if I ask ChatGPT or Claude or Gemini to do a Google search and tell me the best result or summarize it, it doesn't live on my desktop. So it seems like Cortana is probably ideally suited to be an agentic AI robot, and that ChatGPT needs a lot more environment control that it doesn't have at the moment.

Amit: 00:17:24

But you forget that most websites can be accessed with the curl command in a terminal. So if you use the curl command, it'll download the HTML, and the HTML is a text file, and it can go through the text and read it. Of course, if it's JavaScript, then it has to render it and so forth.
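To make the curl point concrete, here is a small Python sketch that downloads a page and reduces the HTML to plain text using only the standard library. It is one possible approach offered as an assumption, not a description of how ChatGPT or any other product actually browses. The names `TextExtractor`, `html_to_text` and `fetch_text` are invented for this example, and, matching Amit's caveat, nothing here runs JavaScript, so pages that build their content in the browser would come back mostly empty.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    """Reduce an HTML document to its readable text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def fetch_text(url: str) -> str:
    """Download a page (roughly what `curl URL` does) and return its text."""
    with urlopen(url, timeout=10) as response:
        html = response.read().decode("utf-8", errors="replace")
    return html_to_text(html)


sample = "<html><body><h1>Vegan Shop</h1><p>Open daily</p></body></html>"
print(html_to_text(sample))
```

The resulting text is what could then be handed to a language model to summarize.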

Amit: 00:17:27

I don't know how the mechanics of it work, but I'm pretty sure that it can do it in a manner in which the code can understand and interpret, and then give us an output. I'm pretty sure it's not going to open a browser, because when you ask ChatGPT to do something like that, it doesn't open a browser, go to the URL and then go through the whole thing.

Amit: 00:17:44

It doesn't take a screenshot and then analyze the thing and then give you an output. No, it happens instantly. And the other thing you have to understand is, when you give a prompt, or when you give a set of inputs to the AI agent, the AI agent has to do the task very quickly. And it can go in any direction.

Amit: 00:18:00

It can go in direction A or direction B or direction C. It has to come up with the best possible direction so that it performs the task in the most optimal fashion and in the most correct way. Now, the thing is, if your prompt is very specific, it can do that. But if your prompt is very generic, it can actually give you an output that you don't want or you don't need, or maybe it's incorrect.

Amit: 00:18:22

So the more specific your prompt, the more specific the output it generates for you. Otherwise, it can go in any random direction, because you're right: if I give an AI agent access to a browser, it can go anywhere. It can do anything. But if I give it a very specific thing, then it can do the task efficiently for me. Otherwise, it is going to waste my time.

Amit: 00:18:44

Similarly, it's just like talking to a travel agent. If I tell a travel agent, book me a holiday in a warm country, you haven't specified the country. You haven't said how many days. You haven't given your budget, you haven't said what time is suitable. You haven't given the month of the year you want to travel.

Amit: 00:19:01

You haven't said how many people are traveling. If you have not given all these inputs, the travel agent is going to give you a response which is not suitable for you. So you have to go back and forth with the agent, right? Even though the agent is smart and can do a lot of things, because you haven't given it a very specific set of inputs, it is going to give you a very vague answer, which is not useful.

Amit: 00:19:24

And we don't want that with an AI agent. The whole idea of using an AI agent is you give it a very specific set of inputs, so you get a very good response. Otherwise, you'll waste a lot of time.
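One simple way to picture the "be specific" advice is an agent that checks for missing details before it acts. The Python sketch below assumes a hypothetical list of required fields taken from the holiday example (country, days, budget, month, travellers); real systems would decide what is required in their own way.

```python
# Details the holiday example says a travel agent needs (an assumed list).
REQUIRED_FIELDS = ("country", "days", "budget", "month", "travellers")


def missing_details(request: dict) -> list:
    """Return the required fields the user has not specified yet."""
    return [field for field in REQUIRED_FIELDS if not request.get(field)]


def next_action(request: dict) -> str:
    """Ask for what is missing instead of guessing; act only on a full request."""
    missing = missing_details(request)
    if missing:
        return "ask user for: " + ", ".join(missing)
    return "search holidays"


print(next_action({"climate": "warm"}))
print(next_action({"country": "Japan", "days": 14, "budget": 4000,
                   "month": "April", "travellers": 3}))
```

The first request triggers a follow-up question; the second is specific enough to act on, which is the back-and-forth Amit describes.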

Rinat: 00:19:36

This reminds me. It's only slightly relevant, but it still reminds me of a time when we were dealing with an AI technology. This was before the era of large language models, but an AI system was being used. I was trying to implement a system where we do handwriting recognition. We'd built it, we'd nearly implemented it. It was at a test stage, and then there was a conversation between the tech side and the business side, as usual. And the conversation was, why is this not a hundred percent? There's always like 90% or 95% accuracy, so why can't we increase it? It's because at that time people didn't know as much about how AI works as we do now. They were used to decision trees, like if-then, so it was very binary, and the expectation was that they would get the expected result a hundred percent of the time. But with AI we weren't getting that. And the business was asking, why can't it understand whether or not this is the number two? Why is it not getting it 5% of the time? And then my argument was, would a human always be able to tell the difference? Because sometimes handwriting is so illegible that even humans don't understand each other's writing. And 95% accuracy is probably higher than what humans achieve with handwriting, especially when we are dealing with prescriptions. You have to think about where to draw the line.

Rinat: 00:21:02

Because if you give it less specific, vague information, you should expect a human-like response, and that's what you're gonna get. And that doesn't mean the AI agent is not smart enough, it just means that there is more information exchange that needs to happen. Yeah, this was an aspect of AI that started even before AI was popular.

Amit: 00:21:29

But it's interesting you mention this, because handwriting recognition is done through a tool called, I think, OCR. I don't know what the full form of OCR is.

Rinat: 00:21:39

Optical character recognition

Amit: 00:21:41

Character recognition. Yes, exactly. An AI agent currently can actually ask an optical character recognition tool, because you don't have to train the LLM on everything.

Amit: 00:21:51

If there is an OCR tool, you can say, okay, do this. So it'll contact the OCR tool. Of course, you have to first define the environment, you have to define what tools it has access to. So if it has access to an OCR tool, it can do the character recognition, and then it can tell you what it is, and it can maybe do a translation or it can do a transcription, or it can do voice generation by reading out the text, right?
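The idea of defining the environment and the tools an agent has access to can be sketched as a simple allow-list. In this Python sketch the `Environment` class is hypothetical, and `fake_ocr` is a stand-in that just decodes bytes rather than a real OCR engine. The point is only that a tool outside the defined set cannot be called at all.

```python
class Environment:
    """Hypothetical container for the tools an agent is allowed to use."""

    def __init__(self, tools: dict):
        self._tools = dict(tools)

    def call(self, name: str, *args):
        if name not in self._tools:
            raise PermissionError(
                f"tool '{name}' is not available in this environment")
        return self._tools[name](*args)


def fake_ocr(image_bytes: bytes) -> str:
    """Stand-in for a real OCR engine; here it simply decodes the bytes."""
    return image_bytes.decode("utf-8")


# The agent in this environment can read text and upper-case it, nothing else.
env = Environment({"ocr": fake_ocr, "uppercase": str.upper})

text = env.call("ocr", b"take two tablets daily")
print(env.call("uppercase", text))

try:
    env.call("bank_transfer", 100)
except PermissionError as error:
    print(error)
```

Restricting the tool set this way is also one answer to the earlier worry about an agent touching a bank account it was never meant to reach.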

Amit: 00:22:14

So that is one aspect and I think this is already happening. The other example is gaming. So if the environment is a game, say for example chess. So you tell the AI agent the rules of the game, and then you just let it play. You don't define anything else.

Amit: 00:22:31

And the environment is the chess game. The tools it has are the chess board itself and the rules. And then it'll play the game, and gaming is one example. And then I gave you the example of a self-driving car. Then you have ChatGPT, then you have website generation, then you have travel planning.

Amit: 00:23:00

So with ChatGPT, you can ask it to prepare an itinerary for you, for, say, a family of two with a toddler going on a holiday in, say, Japan for two weeks. And it can give you a travel itinerary. And in the future it'll be able to book tickets. But it needs access to a platform, say Expedia. So if it has access to Expedia, once you have given the itinerary and once you have given everything, it can actually go to Expedia and book a ticket for you, right? Given all the information that it has. Now, suppose in the future you want to do shopping. So you say, I want to buy a 3D printer, and it has access to Amazon. And you say, this is the budget. These are the features I'm looking for. I want you to go ahead and shop for a 3D printer for me.

Amit: 00:23:39

So now it has got everything. It'll go to Amazon and it'll order a 3D printer for you. So this is now an agentic world. Now imagine, in an enterprise environment, you have a customer service agent. Instead of the agent handling the task, you give the task to the AI agent. So say, for example, a customer complains that they have not received their order.

Amit: 00:24:02

So the AI agent receives this as an input. The AI agent has access to the whole CRM database. The AI agent knows the order number, knows where the customer lives, and has access to the real-time order status and everything, so it can repeat the information and tell the customer that, okay, we have processed your order and this is the current status.

Amit: 00:24:24

It is out for delivery. You should be expecting it at this time. Or it can say that, okay, it has actually been delayed, and so forth. And you can tell the AI agent, when there is a delay, give a gift card. So all these things can happen in an enterprise environment, and this is just with support agents.
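The support scenario above boils down to a lookup plus a business rule. The Python sketch below uses a hypothetical in-memory `ORDERS` dictionary in place of a real CRM or order database, and the "gift card on delay" rule is taken directly from Amit's example. The order IDs, field names and reply wording are all invented for illustration.

```python
# Hypothetical stand-in for the CRM / real-time order status database.
ORDERS = {
    "A100": {"status": "out for delivery", "eta": "today"},
    "A200": {"status": "delayed", "eta": "Friday"},
}


def handle_missing_order(order_id: str) -> dict:
    """Look up the order, report its status, and apply the gift-card rule."""
    order = ORDERS.get(order_id)
    if order is None:
        return {"reply": "I could not find that order.", "gift_card": False}
    reply = f"Your order is {order['status']}; expected {order['eta']}."
    # Business rule from the example: a delayed order earns a gift card.
    return {"reply": reply, "gift_card": order["status"] == "delayed"}


print(handle_missing_order("A100"))
print(handle_missing_order("A200"))
```

In a real deployment the language model would interpret the customer's message and pick the order ID; the status check and the compensation rule would still be ordinary code like this.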

Amit: 00:24:43

Now you can imagine sales calls. So instead of ringing customers, you can ask an AI voice agent to do that task and maybe ring customers or generate leads, et cetera. It's a whole new era where different things can happen based on what tools we give access to and what environment it operates in.

Rinat: 00:25:03

I'm just so excited about what the future holds when all of these things are also available to the public. We don't actually know how far the research behind the curtains has gone and whether this kind of technology already...

Amit: 00:25:19

This exists, Rinat. Salesforce is doing that. Microsoft is doing it. Google is doing it. So if you have access to Salesforce, it already has a thing called Agentforce. Salesforce is not planning to hire any software engineers this year because it thinks that Agentforce has improved the productivity of its existing staff so much that it doesn't need new engineers.

Rinat: 00:25:40

No, I know it exists. I think someone even created an agent on GitHub as soon as ChatGPT was released. No, I'm thinking of something like the movie "Her". I'm just thinking, when would that kind of agent become...

Amit: 00:25:55

But for that kind of agent, you need to give access to so many things. You have to give health records, emails, your financial records.

Rinat: 00:26:03

Yeah. No, I'm not talking about the extent of their power. Of course, real life is not a movie, and that needs to be controlled appropriately. I mean the extent of, if I wanna say, multimodal things that AI agents are able to do: not just being trained to do one or 10 specific tasks, but having a generic understanding of all the different kinds of tasks that might need doing. And I feel big techs who are in the market with operating systems have a particular advantage in expanding in this kind of area. They can easily integrate. And to be honest, Android has integrated a lot of AI tools within text writing. And even when you are chatting, you can get an AI-written summary or response. And I've heard recently that Apple has also caught up with some AI integration within their phones as well. But yeah, obviously you can see it in the phone, but it's still not taking direct actions on your behalf inside the phone. But yes, Google is really well placed. The latest release of Gemini is also very advanced. I was looking at a tool in AI Studio which does streaming. So basically, you can define what you want to share, like you can share your screen, and you can talk to the AI agent and say, okay, I'm sharing a screen. I'm going to do a certain task, and I want you to observe what I'm doing and then maybe create user documentation for me. Or you can say, okay, I'm trying to work on an Excel sheet. I want you to suggest certain formulas that can help me change or transform the data that you see. So basically it's an assistant that's working with you as you are doing things, and it can guide you step by step.

Amit: 00:28:00

So it's like a teacher, a mentor or a coach with you while you're trying to solve something. Say, for example, you want to learn a very complex tool, say video editing. There's a very famous video editor called DaVinci. If you want to learn the tool, you can share the screen and you can say, okay, tell me what you see. Then you have a video and you want to say, okay, I want to edit this video. Tell me how I edit it. It'll basically speak to you and tell you, and then you can do those steps. And if you do something correctly, it'll say fine. And if you do something incorrectly, you can check with it. This tool is available right now.

Rinat: 00:28:33

To be honest, yeah. With ChatGPT, in one of the paid versions, I've had conversations with it, and it can even see a live stream of what's in front of you, and it will interact with you based on what it sees. So yeah, it's pretty exciting times.

Rinat: 00:28:46

For the last two years, with the advent of AI, it's been so fast-paced, with all the new releases every few months from all the big tech, and it is just a very exciting place to be, especially for consumers, because this is one of those rare scenarios where an oligopoly market has been created with the big techs.

Rinat: 00:29:08

Everyone's competing with each other, so the consumers are benefiting the most, but it probably won't stay like this. And the sooner agentic automation is absorbed by various businesses, the more of a business case there's gonna be for big techs to price consumers out of it. What's your thought on AI agents being open source, and is there any development on that side, so that consumers or regular people can always have access to it? It may not be as good in quality as what big techs will release, but it would still be something. Anyone can start their own business with their idea, or invent something and file a patent, et cetera, et cetera. So it's very important that regular people have access to AI agents. What's your thought or knowledge on this side?

Amit: 00:29:56

So I've used certain AI agent tools, and I've seen that those AI agent tools do have free plans that can do specific things, but only up to a certain extent, because bear in mind, these AI agents require compute, require processing and require some amount of storage, so they cannot give you everything for free. There has to be some cost associated with it, because giving a prompt and asking it to do something on your behalf has to happen on a computer that's doing some processing, and that consumes energy, so someone has to pay for it, right? So it's not for free. So yes, there are certain agents that do offer a free tier, but of course there are professional tiers or pro tiers.

Amit: 00:30:39

Now what I feel is that there is a plethora of tools, and what is going to happen is Google and Microsoft are going to copy those tools and integrate them into their products. So essentially you have a single plan, say with Google or Microsoft, and you pay for that. And with that plan, you get access to everything.

Amit: 00:30:57

Say, for example, I pay for YouTube Premium. So with YouTube Premium, I get access to YouTube Music, I get access to YouTube, and then I can share it with my family and friends. Now if I want to buy a subscription for an AI agent tool, then imagine, for every different AI agent for every different task, I need a different subscription.

Amit: 00:31:16

Sooner or later it's going to add costs. It's just like, do I need Netflix, Prime, Disney Plus, Paramount, et cetera? Do I need all these subscriptions, or can I just do with Netflix? So it'll come down to what services you need. Can I get that through Google or Microsoft or, say, Apple? And if I can't get that, do I then pay for it?

Amit: 00:31:35

For now, I'm paying for ChatGPT because I feel it is useful. But if ChatGPT-like functionalities are available in the future, say within Google through Gemini, then I would stop paying for it, because it's an unnecessary cost for me. So my take is that there will be an acquisition phase.

Amit: 00:31:54

Once these tools develop, the big companies will either copy those features into their products or they will buy those companies. I'll give you an example. TikTok came out with short-form video. Before that, no one was doing short-form video. Then Instagram came out with Reels, YouTube came out with Shorts.

Amit: 00:32:12

Everyone started copying that TikTok short-form video format. And now TikTok is still very popular, but YouTube and Instagram have already copied those features. So I, for example, don't use TikTok. I don't need to use TikTok, because if I want short-form videos, I can see them on YouTube or Instagram.

Amit: 00:32:30

So the problem with this plethora of tools is: how many tools can you have? How many tools do you want to pay for? You are only going to pay for tools that you are going to use. And in your day-to-day life, what are the tools that you generally use? You generally use an email account, some kind of a shared drive, some kind of a media portal, and then some set of office applications like Excel, PowerPoint. It could be open source, it could be Microsoft, it could be Apple. So I see a future where there might be feature integration into the existing tools from Google, Microsoft, Apple, Meta, or they will acquire these companies.

Rinat: 00:33:07

I very much agree with most of the things you have said because, yeah, that is what it looks like. And that is where I feel a little bit scared because, as it is right now, you have to pay for ChatGPT Pro, but it is still accessible by regular people. If the barrier of entry, like for example, I've not seen it, but I've heard that OpenAI also released a 200-dollar subscription.

Rinat: 00:33:31

That is a big difference in terms of affordability. Not everyone can really afford 200 pounds a month. And if the service you receive is a day and night difference, then that means people who have money will progress more. And thus in a future society, regular people will get poorer while only a select few will get richer. And that's obviously something that we wanna avoid.

Rinat: 00:33:56

If big tech companies acquire all of these small companies and it becomes three or four big tech companies providing all the services, and they then realize that, yeah, it has a cost to them and they have to make a profit, I totally understand that. And they realize that businesses, like banks and other businesses, are the ideal customers to provide their services to, like Azure and AWS. I mean, as a single user, rarely would you find an Azure VM user, although I think everyone could use cloud services like Azure, but they don't because of various things. It could be training or whatever. The point is that once big tech companies realize that, okay, I have a bigger market if I just sell to businesses and then set the consumer price so high that only a select few can afford it, then it's worse for regular people. And how do we protect against that?

Rinat: 00:34:53

As you mentioned, running an AI engine is very costly. Humans have a few trillion neurons. As the technology progresses and becomes more and more streamlined, companies like Nvidia, in five or 10 years, might be able to process a trillion parameters in-house. I'm hoping in 10 years' time a somewhat IT-savvy person at home can run a 1 trillion parameter AI model, which could be a pretty close comparison to the number of neurons a human has. And then the barrier of entry for that person to do whatever they want in the world is open. And that's the future we wanna get to: that these kinds of technologies are not available only to the people who can afford them when the price is really high. So that's one of the things I'm thinking about, how our society might change in five or 10

Amit: 00:35:51

I think there is a point here, and I do agree that yes, there might be a digital inequality. We have covered this in one of the episodes previously, about getting access to the internet. Not everyone has access to high-speed internet, so that creates a digital inequality: people with better access can get access to videos, et cetera. And the same thing might happen with AI agents or AI tools. Some people with a lot of money can get access to these tools, but people who don't have money will not have access to these tools. So I see a point. And let me backtrack. There are open source LLM models available. There is a website called Hugging Face.

Amit: 00:36:23

Hugging Face is the repository for all the open source LLM models. And I'm guessing there might be an AI agent hosted there as well. I'm not sure. Meta has open sourced all its Llama models. So yes, I know Meta is doing a good thing by open sourcing its models.

Amit: 00:36:41

Currently what you can do is you can download the LLM model on your computer, and if you have a good enough graphics card, you can actually run the model on your machine and you can ask the model to look into a particular folder, and then you can query things around that folder. Or you can ask it to go through your documents and you can ask it to give you certain responses based on specific inputs from yourself.
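A minimal sketch of the folder-querying idea Amit describes, assuming the documents are plain `.txt` files and that relevance is judged by simple word overlap. The function names (`load_documents`, `rank_documents`, `build_prompt`, `ask_folder`) and the `generate` callable are hypothetical: `generate` stands in for whatever locally running model you have downloaded, and no particular model or library is implied.

```python
import re
from pathlib import Path
from typing import Callable


def load_documents(folder: str) -> dict[str, str]:
    """Read every .txt file under the folder into memory (assumption: plain text only)."""
    return {
        str(path): path.read_text(encoding="utf-8", errors="ignore")
        for path in sorted(Path(folder).rglob("*.txt"))
    }


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of distinct words."""
    return set(re.findall(r"\w+", text.lower()))


def rank_documents(question: str, docs: dict[str, str], top_k: int = 2) -> list[str]:
    """Return up to top_k file names that share the most words with the question."""
    question_words = tokenize(question)
    scores = {name: len(question_words & tokenize(text)) for name, text in docs.items()}
    ranked = sorted(docs, key=lambda name: scores[name], reverse=True)
    # Drop files that share no words with the question at all.
    return [name for name in ranked[:top_k] if scores[name] > 0]


def build_prompt(question: str, docs: dict[str, str], top_k: int = 2) -> str:
    """Put the most relevant files in front of the question as context."""
    chosen = rank_documents(question, docs, top_k)
    context = "\n\n".join(f"[{name}]\n{docs[name]}" for name in chosen)
    return (
        "Answer using only the documents below.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


def ask_folder(question: str, folder: str, generate: Callable[[str], str]) -> str:
    """Query a folder offline: `generate` is a placeholder for a local model call."""
    docs = load_documents(folder)
    return generate(build_prompt(question, docs))
```

Everything here runs on the local machine; the only part that needs a capable graphics card is whatever function is passed in as `generate`. Real tools usually replace the word-overlap ranking with embedding-based search, which is not shown here.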

Amit: 00:37:04

So it can do that, and that is completely offline. If you want to do something online, say integrated with Gmail, or integrated with, say, OneDrive, Google Drive, or banking information, then you'll have to give access. Don't think it's for free. You'll have to pay for it. So there might be a barrier there, but there are open source models available that can train on your data, provided you have a powerful system.

Rinat: 00:37:27

Yes, absolutely. There are open source options available for the AI technologies that are popular today. But I just really hope that when AI agents become more popular, there is an equally speedy push for open source AI agents to become available, because that would make a big difference to regular people and diminish digital inequality, et cetera.

Amit: 00:37:53

One of the things that I think we have missed talking about so far is the speed at which we can now generate content, generate code, do tasks, or increase our productivity. So I think one of the aspects of using an AI agent is: if you use it, what are the benefits? Say, for example, I want to create a prototype, or I want to create a static website for a small business owner who's my neighbor, say on my high street. That is a shop; they don't have a website. They come to me saying, Amit, you're an expert in technology. Can you help me create a website?

Amit: 00:38:25

Now, initially that person would go to a web studio or a web designer, and they would pay like a couple of thousand pounds and they would have to go through a lot of meetings to generate a prototype. Now with an AI agent, I can generate a prototype in, say, a few minutes instead of a few days.

Amit: 00:38:44

So I can share the prototype with the business owner and they can then gimme feedback and then I can edit it. And as soon as the version is ready, I can deploy it and it's done in a very quick fashion. So that means that my productivity has now improved. The cost of generating a website has reduced.

Amit: 00:39:05

The AI agent is going to transform the way we produce things or the way we do things. And if we don't use these tools, then what will happen is we are going to be left behind by someone who's using these tools. So suppose in your office you're not using ChatGPT and someone else is using ChatGPT: they would be able to give far better, far faster, and more professional output than you.

Amit: 00:39:29

So you might be left behind if you don't use ChatGPT. So there might be a digital divide just because someone is using an AI tool and someone is not using an AI tool.

Rinat: 00:39:41

Absolutely. We come back to that quote you mentioned a few episodes back: AI will not replace jobs. People using AI will replace jobs.

Amit: 00:39:51

Exactly. Yeah, exactly.

Rinat: 00:39:52

People in our society have to be more open to the new tools that are coming about. It can't be easier than this. Like language models, you literally just speak to it. But then, yeah, I mean, with that, the barriers of entry of anything, like if you have an idea, now materializing it is so much more doable than it was before.

Rinat: 00:40:11

You needed funding. As in the example you gave about a website: if you just had a business idea which needed a website, you needed proper funding to even go to a web studio and have it designed. But you don't now. You literally just go to your hosting provider or domain provider, and they have an AI-integrated tool where you can just say, make me a website which is for an artist, or which is an e-commerce store for, I don't know, face masks.

Rinat: 00:40:40

And it will create it, including image generation and a few other kinds of eye-catching features within your website, within five, 10 minutes. And that significantly reduces the barrier of entry for small businesses. And that's really good because that's what pushes our society towards a meritocratic society where people who have ideas, who are talented, who are hardworking, will progress.

Rinat: 00:41:08

And that's what we wanna do. We want a more and more meritocratic society. That was my worry when we were talking 10 minutes ago, that if the access to these tools, which can do agentic automation, is not open to the public, then there will be more and more division and it would be apocalyptic, I

Amit: 00:41:28

Apocalyptic, and I think that's where the governments have to come in and then enforce that, okay, there are certain basic AI skills that everyone needs to have, and it needs to be part of the education curriculum, et cetera.

Rinat: 00:41:40

Absolutely. No, this was a very eye-opening conversation. Very much enjoyed all the information and the insights. And it was also good to discuss what the future holds in a realistic fashion. So yeah, I hope our audience enjoyed this too. And any thoughts or any debates or anything you wanna share with us, please do reach out.

Rinat: 00:42:02

And also please reach out if you would like to come on the show as a guest. We would very much love to talk and get people's opinions and perspectives and share them with the world. Again, yeah, very much looking forward to hearing from all of you guys. Thank you very much for listening today, and we hope to see you again soon.

Amit: 00:42:22

Yeah. Thank you so much all. Had a good conversation. Thanks a lot and catch you guys later. Bye.

Tech Talk with Amit & Rinat

Episode 88

25th Apr 2025

About your hosts

Amit Sarkar

Rinat Malik