The Liquidity Event Podcast: Episode 90

 

Episode 90: The Past, Present and Future of AI

On a special episode of the Liquidity Event, our hosts welcome Matt, a machine learning expert with a lot to say about AI. We talk language models, statistics, prompt engineers, ChatGPT, and the humanity of AI. If you’re into AI or fear your laptop is sentient, listen to this episode.

Airdate: 4/20/2023

Read the Full Transcript:

Presenter:

This podcast is for informational purposes only and should not be considered tax or investment advice.

Welcome to The Liquidity Event, a show about all things personal finance, with a laser focus on equity compensation hosted by AJ and Shane of Brooklyn FI. Each episode will take you through the week's news on FinTech, IPOs, SPACs founder wins and fails, crypto, and whatever else these nerds think is interesting. Learn more and subscribe today at brooklynfi.com.

AJ:

Hello and welcome to The Liquidity Event. We're your hosts AJ.

Shane:

And I'm Shane.

Matt:

And I am Matt.

AJ:

Ooh, curve ball right there. This is episode-

Shane:

Who's that?

Who let this guy in?

AJ:

This is episode 90, being recorded on Wednesday, April 11th and airing on Friday, April 21st. Hopefully there have been no major world events or financial news by the time this episode airs. On this very special recording, we are going to be talking all about AI. We have an AI expert, specialist, Jack of all trades, or Matt of all trades, we should call him, here on the show today. Shane and I are going to interview him and ask him a bunch of questions about how this shit actually works. Matt, it's great to have you in the studio. How are you doing today?

Matt:

I'm doing well. It's exciting to be back on, and I've been loving how much AI has been popping up on your podcast recently. Always exciting to hear someone talk about my industry.

AJ:

What is the dumbest thing that either of us has said about AI that's wrong?

Shane:

Oh no. Here we go.

Matt:

I got to say, I feel like I hear lots of dumb takes on AI, but you guys have been pretty good. Oh, I think there was one comment, I think-

Shane:

Oh shit, here we go.

Matt:

...had mentioned. This was more about the industry, but I think somebody had made a comment about Copilot being owned by OpenAI, and I think that was the only thing, because it's not owned by OpenAI. But beyond that, you guys are doing it exceedingly well.

Shane:

I wonder who made that mistake.

AJ:

I wonder who made that mistake.

Shane:

I wonder who said that.

AJ:

That's so weird. Well, we're going to learn a lot today. I know, Matt, you're very passionate about this and have been working in the space for a long time. What's going on on your side of the world? You're not in the United States, is that correct?

Matt:

I am not in the United States. I'm in Belgrade, Serbia. Same place I was last time, where every standard machine learning American lives. But yes, I'm over here. I will say that tech here does seem like it's really exploded recently, so that's exciting to see. The quality of developers and just how skilled people are is really nice here also. So it's been a great fit.

AJ:

Sick.

Matt:

Can't believe I've been here for like a year now.

AJ:

Wow, amazing. Amazing. Are either of you watching or reading anything interesting at the moment? Shane, past tax season?

Matt:

I feel like Shane will have a more interesting answer than mine.

Shane:

It's April 12th, so...

AJ:

So, no.

Shane:

So, no.

AJ:

Cool.

Shane:

I am feeding IRS prompts into ChatGPT, though. Which is fun.

AJ:

Cool. How's it doing?

Shane:

Publications, not prompts. Prompts are different. I guess it is a prompt. The publication is the prompt. The prompt is the publication.

AJ:

Man is the machine. I am watching the third season of Ted Lasso, which is just always delicious, light, fun, goodhearted comedy. So I highly recommend if anyone is not on the Ted Lasso train, get on it right away, even if you don't like soccer or football. Just a reminder to our dear listeners, we have our podcast survey up for one more week. The deadline is this weekend. There'll be a link on our website page or show notes, brooklynfi.com/episode90. You can fill out that survey to receive a $10 Amazon gift card, so do that if you have not already. Should we hop into it? Let's talk AI.

Shane:

I think we might want to reintroduce Matt to the audience, because we want to give some credentials on why we're even asking Matt these questions. Matt was our software developer for one of our proprietary tools, which I would not say is a heavy machine learning tool, but there is machine learning involved in there, and your background was, I don't know, 70, 80% machine learning in the past. Matt, give us a little bit of a rundown, a little bit of background on your professional development.

Matt:

Yeah, for sure. So I've been a little bit all over the place, but my undergrad studies were in statistics. I worked as a statistician for a while. Then I studied machine learning, studied mathematics, and most of my career, like you were saying, probably 80%, has been in the machine learning space, and then a small amount on the side of that was in app development. But in the last maybe four or five years, I've gone quite a bit heavier into the app development and marrying these two technologies together, which has been a lot of fun and something I've really enjoyed doing. So when I met you guys and got to work with what you were doing, it was the perfect project to bring those skills together.

Shane:

And how old are you, Matt?

Matt:

I am 33.

Shane:

All right, there we go.

AJ:

33 and a third perhaps?

Shane:

When somebody says my whole career has been in this topic, you got to ask them how old they are.

Matt:

That's true. That's true.

AJ:

Just cut them down to size.

Matt:

I've been coding for machine learning since maybe like 2012 or so.

Shane:

Amazing. So you've seen the arc of machine learning bend towards AI, maybe not. Well, we're going to ask you about AGI. Is that how we pronounce it, or what does the acronym stand for? General...

Matt:

Artificial General Intelligence.

Shane:

Maybe we should wrap up the conversation there because that will be the Skynet version of this conversation.

Matt:

I like it. I like it. I can really go off the wall with theories of where that's going.

Shane:

Hell yeah, let's do that. I'm sorry, AJ, I think I interrupted your initial question. You want to hop back into it?

AJ:

Can we just get a definition of AI? You're right, Matt, ChatGPT is in every article. That's what we're talking about, because that's what a layperson understands. They can go to openai.com and type in a prompt and be amazed by AI. But what are we actually talking about when we talk about artificial intelligence in your community?

Matt:

If you ask five people, you'll get five different answers, but the way that I think about it is it's a specific field of machine learning where we're trying to mimic the process of how people and how brains learn. So for hundreds of years we've been taking this idea of [inaudible 00:06:56] statistics and trying to build models to explain the world to ourselves. And those methods have gotten a lot smarter. They've gotten better over the years as we've been improving them. But then when our computing power got really, really good, well, now instead of solving an equation one time by hand, we can solve a thousand different variations of that equation a million different times.

And that unlocks a lot of capabilities for us to model the world more accurately. And the majority of everything in machine learning is mapping inputs to outputs, trying to find patterns in there, and then being able to make a prediction about the output from what goes into the input. But we had a breakthrough where smart people came up with this idea of, all right, here's a rough approximation of how a brain works. We have neurons in our brain and then electrical signals go through them. And as we repeat a pattern, or see a pattern over and over again, we develop nodes on those neurons and then we have an electric path. I'm sure anyone who actually knows how neuroscience works will hate that explanation.

AJ:

I think for most people it's good enough.

Matt:

But that's also a good analogy here, though, because the way that we do it in machine learning is also a really rough approximation of how the brain does it. But we try to mimic that, and all of these things, especially this, are just some creative ways of combining statistics, linear algebra, and calculus. You randomly initialize a bunch of mappings between your inputs and your outputs, and then over and over and over and over again, you [inaudible 00:08:51] incrementally what is going on with all the different numbers.

And then eventually, after you do that enough times, you're really, really good at mapping those inputs to the outputs. And generally when someone's talking about deep learning, they're talking about some version of this neural network approach: backpropagation of randomly assigned weights across the...
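Matt's description (randomly initialize the mappings, then nudge the numbers incrementally until inputs map to outputs) can be sketched in a few lines of Python. This is a toy illustration with made-up layer sizes, nothing like how GPT-scale models are actually trained:

```python
import numpy as np

# A two-layer network learning XOR: randomly initialized weights,
# a forward pass, and backpropagation nudging the numbers over and over.

rng = np.random.default_rng(0)
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)  # last column is a bias
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(3, 8))   # randomly initialized input -> hidden mapping
W2 = rng.normal(size=(8, 1))   # randomly initialized hidden -> output mapping

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mse():
    return float(np.mean((sigmoid(sigmoid(X @ W1) @ W2) - y) ** 2))

before = mse()
lr = 0.5
for _ in range(10000):              # "over and over and over again"
    h = sigmoid(X @ W1)             # forward pass
    out = sigmoid(h @ W2)
    err = out - y                   # how wrong are we right now?
    # backpropagation: push the error back through each layer
    g_out = err * out * (1 - out)
    g_h = (g_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ g_out          # nudge the numbers incrementally
    W1 -= lr * X.T @ g_h

after = mse()
print(before, after)  # the error shrinks as the mappings settle
```

Run it and the printed error falls from its random starting point; that shrinking number is the "learning" Matt is describing, scaled up enormously in real systems.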

AJ:

Cool.

Matt:

So that's how I think about it. Someone else might also include anything where we're smart at being able to predict, there's a lot of really good regression and classification algorithms where we can say, what is this thing or predict the number. But generally that doesn't seem like what people are talking about when they mention AI.

AJ:

Because from our perspective, we heard about AI being developed and should we be afraid of it because the robots are going to get too smart and take over. But from your perspective, why was ChatGPT so different from everything else that had been developed before it? Why was it so impactful for the rest of the world outside of the machine learning community?

Matt:

That's a great question. Of course, it is a really impressive model and it is a technological breakthrough, but we've had plenty of technological breakthroughs in the past in this field that did not get the same level of exposure. I think the main reason here is how accessible it is to somebody who hasn't studied what's going on. You can go into ChatGPT and you can ask it something that you could not ask Google, and then you get this ability to interact with it. And we're starting to see something that mimics what we attribute to being uniquely human. Whereas we had amazing things in the field before, but unless you had studied it for a long time and you knew how to code and you had the right data set, the path through it didn't really make sense, or it wouldn't be as clear how powerful it was.

AJ:

Speaking of Google, it sounds like the way that Google revolutionized search, in that it was an easily accessible, beautiful blank canvas to index the internet. It seems like ChatGPT was a similar approach, where maybe there were other technological models that are even better at actually synthesizing information and having a conversation, but literally the design and approach of ChatGPT let a person chat with it. It looks like an AIM conversation.

Matt:

Yeah, it is. Exactly.

AJ:

ASL. Has anyone asked ChatGPT ASL? What were you going to say, Shane?

Shane:

I wanted to speak on that topic, and one thing I wanted to draw attention to is, I am curious about the history and what Matt's talking about. I think the appealing thing about ChatGPT is it does have this human layer, which I've heard is a security layer. Actually, it serves two functions. One is that it talks like a human, and the other is that when it goes and references the underlying language model, it doesn't bring you bad stuff. There's a layer that says, actually, we're not going to talk about how to make dynamite, and we're not going to talk about how to administer poisons, and all that.

And I'm just curious if there's a parallel to the way, when programmers use a language like Python, it gets converted into assembly, because there are layers that go down to the actual machine, the ones and zeros and the mainframe. So I didn't know if that was a suitable parallel or analogy, Matt, and if there are other language models that are potentially better at getting answers, but you just have to be able to speak Python, as opposed to the no-code that is GPT, so to speak.

Matt:

I don't think that's a perfect analogy, because when you compile code... And I'm impressed that you knew that it gets compiled down to assembly. But when you compile-

Shane:

I listen to you when you talk.

Matt:

I appreciate that. But when you're compiling code, there's no element of randomness. There's a very strict set of rules for how something works. So if you say 10 times five, how do you solve that? There's always a very strict set of processes that you go through to get the answer to that problem. But with something like these large language models, the reason that they're so hard to do, the reason that any of these types of machine learning things exist, is because there isn't a strict set of exact rules to be followed.

There's randomness, there's noise. There are times where the rules aren't going to make sense. So if somebody writes a statement in Python, you can write the exact way it compiles and the exact way the CPU of your computer processes the data through it. But if you say, how do you learn to speak German? There's a million different ways you could answer that, in multiple languages. So there's a lot more noise to it.

Shane:

It's a bit more biological than the other tools that we've developed.

Matt:

That is true.

Shane:

That's amazing. Speaking of asking it questions, we've heard about these new jobs that have come into the space since it's been released. They're called prompt engineering, and I don't really know what that means. Can you help me with that? Is that training this biological tool, which I'm going to start calling it from now on, to answer questions how we want it to, or is it training it to think a certain way? What is prompt engineering? I've heard it pays quite well. I'm curious why it pays so well and how it influences the models.

Matt:

So maybe I'm a bit uninformed here, but when I think of someone who says that they're a prompt engineer, I think of it kind of like somebody who puts Googling as a skill on their resume.

Shane:

Well, it's paying 300 K a year, so...

AJ:

I am very good at Googling.

Matt:

There probably are different levels of complexity. And of course, there needs to be a step that translates whatever prompt goes into those matrices. So generating embeddings, where you need to turn a string of text into numbers, and then those numbers should represent the underlying ideas in the text, the tone of the text, things like that. So as far as the development of the models that do that translation, there could be some people doing some really cool things with it. But if they are, they're working on the models themselves; they're not external people who say "I'm a prompt engineer" on their LinkedIn.
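To make Matt's embedding step concrete: an embedding turns text into a vector of numbers so that related texts end up numerically close. Real models learn these vectors; the word-counting version below is only a stand-in to show the mechanics of "text in, vector out, compare vectors":

```python
import math
from collections import Counter

# Toy embedding: count the words in a string. Real embeddings are
# learned, dense vectors; this just demonstrates comparing texts
# numerically via cosine similarity.

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

query = embed("how do I learn german")
related = cosine(query, embed("best way to learn german fast"))
unrelated = cosine(query, embed("seismology of mars"))
print(related, unrelated)  # the related sentence scores higher
```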

AJ:

So you need a computer science degree, not an English degree, to be a prompt engineer is what you're saying. Or you couldn't have a linguistics major with an English minor be a prompt engineer, you would have to have some coding background.

Matt:

Again, it depends on how people are using it. I'm skeptical about the title prompt engineer, but there could be something that I... I don't want to offend anyone.

Shane:

It could be like a QA person for a video game company that just plays the game and tries to break it, is what I'm reading.

AJ:

Shane's literal dream job.

Shane:

No, that job sucks.

AJ:

So one thing that I've been thinking about, Matt, is I think ChatGPT-4 goes up into the present, but for a while there was this warning on the website that said we only have information up until, I think it was September of 2021. So mechanically-

Shane:

That's still the case.

AJ:

It's still the case. So what is that? Why? Is it because it's going to take so much computing power to get up to date? Why don't we have up-to-date information and what's it going to take to get there?

Matt:

The way that these models work, especially a GPT model, is a little bit different from how a human learns stuff. These models do one really large batch training of the data, and then they learn all of their underlying relationships and everything from there. And those are really, really expensive. It takes a huge amount of computation, and some of these train for months before they're actually satisfied with the results and they have learned the relationships in the text that they're looking for. So a big chunk of that, I believe, was they just did one training of the model, and then they probably did do some reinforcement learning layered on top of it, but it didn't make sense for them to be doing continual retrains, so the actual underlying data wasn't there.

AJ:

Got it.

Matt:

But in that training phase, what they're doing is just looking for those patterns and those relationships; they're not actually reading stuff and synthesizing it. So it learns those patterns in a way that when you paste in new information, it can take that new information. My understanding is that what they're doing, or what they're planning to do, is having it so when you ask a question that requires external data, it can go read data from that website, or it can do a Google search and take that back. And instead of additional training, it just gives the model basically an extension of your prompt to run on.
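The "extension of your prompt" idea Matt describes can be sketched as follows. The function names (`fetch_page`, `build_prompt`) and the fake page content are hypothetical stand-ins, not OpenAI's actual retrieval mechanism:

```python
# Instead of retraining the model on new events, fetch fresh text at
# question time and splice it into the prompt. Everything here is a
# stand-in: no real web fetch, no real model call.

def fetch_page(url: str) -> str:
    # Pretend web; a real version would use an HTTP client instead.
    fake_web = {
        "https://example.com/news": "Acme Corp released its Q3 report today.",
    }
    return fake_web.get(url, "")

def build_prompt(question: str, url: str) -> str:
    context = fetch_page(url)
    # The retrieved text rides along as an extension of the user's
    # prompt, so the model can answer without being retrained.
    return f"Context: {context}\n\nQuestion: {question}"

prompt = build_prompt("What did Acme release?", "https://example.com/news")
print(prompt)
```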

AJ:

Got it. Because ChatGPT at this point doesn't know that Russia invaded Ukraine. It doesn't know that Trump was indicted. It doesn't know that you can buy Glossier products in Sephora. It doesn't have that information unless you tell it or you send it to a website. So I always found that super fascinating. And for our purposes in our industry, we like to think about what happened in the stock market last year. Can we provide some analysis? It's like, we can't do that, because we don't have the data yet from ChatGPT. Interesting limitation.

Matt:

And at the same time, I could imagine part of that is them trying to save some money while they're doing this initial testing to fine-tune it, right? Because in the same way, they don't want you copying and pasting in an entire book, because it's just going to take so much money to crunch through all of that, convert it into numbers, and generate some response. I would assume there was an element of, not a technical hurdle, but not wanting it open to the internet just for that purpose.

AJ:

For sure. And we're in the age of $20 a month for ChatGPT. This is like when Uber was $2 a ride, just so they could get a massive amount of user data to start perfecting the algorithm. So I imagine we're still in that honeymoon phase, so to speak, for use. As early adopters we basically get a super discounted price, because we're giving it all of our data and we're teaching it more about the way humans think and what humans want to know.

Shane:

Love to burn VC money. AJ, your questions around resource intensiveness and it being open to the outside world really have me wondering about when the tool is most resource intensive. I know that when they were originally developing GPT-3 and 3.5, they had thousands of GPUs that they were using to train the tool, Matt, just to get some extra processing power. And it makes me wonder, is the tool most resource intensive when it's in the oven or when it's on the table? When we're using it, or when it's being developed? And I know that it now has an API, and some other tools can now access ChatGPT. I was just curious, is that resource intensive for the company? You might not know the answers to the specifics about ChatGPT, but I'm also curious if you're excited about the APIs and if you could talk to us about how they work. I know that Wolfram|Alpha now has a ChatGPT API. What does that mean for you and I?

Matt:

Let me unpack that one step at a time.

Shane:

It's two questions.

Matt:

As far as the resource intensiveness, it is going to be intensive to do the generative step as well. There are a lot of models where it can be really expensive, or take a lot of computing resources, to train them, but then at the inference step, where you're actually making predictions or using it, it's pretty lightweight. That is not the case for language models, because with language models there's still a lot of computation that needs to happen to turn your prompt into numbers, pass that through this whole model, multiply a bunch of matrices together, and then turn that back into text. So without a lot of investment and getting free hosting from giant tech companies, ChatGPT would not be possible in its current state. But that's just my understanding, at least; I don't know this for sure, and I have never trained a large language model myself. My understanding is that it probably was a lot more expensive to do that initial training, because it's just around the clock. You've got your hardware running at full throttle for months.
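A rough back-of-the-envelope calculation shows why the generative step Matt mentions stays expensive: every token has to be multiplied through every weight matrix. The layer sizes here are made up for illustration and are far smaller than a real GPT-scale model:

```python
# Counting multiply-add operations for one token pushed through a stack
# of hypothetical feed-forward layers. The shapes are invented.

def matmul_flops(m: int, k: int, n: int) -> int:
    # An (m x k) @ (k x n) product does one multiply and one add
    # per inner-loop step: 2 * m * k * n operations.
    return 2 * m * k * n

d_model, d_ff, n_layers = 1024, 4096, 24   # hypothetical model shape

per_token = 0
for _ in range(n_layers):
    per_token += matmul_flops(1, d_model, d_ff)   # up-projection
    per_token += matmul_flops(1, d_ff, d_model)   # down-projection

print(per_token)  # hundreds of millions of operations for ONE token
```

Even this undersized sketch lands around 4e8 operations per generated token, which is why inference for language models is nothing like the lightweight prediction step of smaller models.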

Shane:

Cool. It also makes me think about the... ChatGPT isn't the only model. So how should we be thinking about some of the other entrants, some of the original ones and some of the newer ones coming in, and the efficacy and the efficiency of various models? Do you know some of the KPIs? I've seen some papers about LLMs and I've seen some charts that I don't know how to interpret. So I was hoping you might be able to help us with the KPIs that we should be thinking about. Is it the speed of the response? Is it the number of neural activations? I don't know. Help us out here.

Matt:

This one is interesting, because in some ways we're breaking new trail and the answers aren't clear yet. Of course a model can capture a lot more nuance the more data you capture, and the bigger your embeddings are, the more information they can hold to be used by the models. But that doesn't always equate to them being more accurate. For example, in one of my previous jobs, we were working with ViLBERT, which is another transformer model. We were taking the data out of that, but one of the biggest challenges there was we actually needed to localize it. It had been trained on too big of a data set, on all of this text, Wikipedia, the internet, all of this. In this case we were working with food data, so with recipes and food products, nutrition, stuff like that.

So we didn't care that it had been trained on the seismology of Mars and K-Pop or whatever other things made it into the [inaudible 00:24:10]. We were actually localizing it at that point; we were working to simplify, to actually remove data from it. And unlike if you're trying to predict the weather, you don't have a very clear way of assessing whether the result was good or not good. If you say, hey, it's going to be 50 degrees and then it's 40 degrees, you know exactly how wrong your model was. Whereas something like a conversation, or just generating text, or especially some of these other ones where we're generating images, art, audio, stuff like that, that's a lot harder to assess, because you don't have those clear metrics. And OpenAI, in their APIs that you were talking about, they have different versions of it.

So I've played around with them a little bit. They have three or four different versions of every model, and they come in different sizes. They have one that's supposed to be really fast because it has a little bit less information involved, including the [inaudible 00:25:14], and then they have other ones that are supposed to be better, more accurate, and they're huge, but they're going to be slower, because multiplying all those numbers together is going to take a lot more computation. So I realize I didn't give you an exact answer there. I'm sure that researchers have good ways of evaluating these language models, but I'm not sure if there is a clear answer to what our best metrics are going to be. That said, part of what they're doing to get around that is why ChatGPT is free right now: for the main purpose of testing and improving it. So you say, was this good, was this not good.
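Matt's weather contrast can be put in code: a numeric forecast has an obvious error score, while generated text does not. The forecast numbers below are invented for the example:

```python
# A numeric forecast can be scored exactly; generated text cannot.

predicted = [50.0, 62.0, 41.0]
actual = [40.0, 60.0, 45.0]

# "If you say it's going to be 50 degrees and then it's 40 degrees,
# you know exactly how wrong your model was":
mean_abs_error = sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
print(mean_abs_error)

# For "how do you learn German?" there is no single right answer to
# subtract from, which is why thumbs-up/thumbs-down feedback fills the gap.
```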

Shane:

What I heard is that it's similar to people. They're going to have different resumes, they're going to have different abilities. Some of them are going to be faster, some are going to be slower, some are specific, some are broad.

AJ:

Some are going to have a good sense of humor, some are going to have a bad sense of humor.

Shane:

And they could potentially talk to each other via APIs.

AJ:

Yeah, exactly. And then kill us all. What are you most excited about in terms of the real life implications of having AI integrated into systems? For us, it's like, oh, we can play with it. We can summarize copy. This is cool. How is this going to change our day-to-day lives? As someone who understands this stuff on a much deeper level than I do, 2, 3, 5 years from now, what do you think is going to be the biggest, most exciting development based on what you've seen so far?

Matt:

That's another tough one to figure. I have all sorts of theories about what I think is going to happen. But I would say probably the thing that I'm the most excited about, also maybe a little bit scared of... Right now, all of our models are very specialized in what they do. So if you build a model to play chess, no human will ever stand a chance; no human can beat a model at chess.

And it's like that with most things. If you build machine learning to predict the weather, no human's going to hold a finger up to the wind and be able to guess better than some really complex model. But a human will be better than that model at almost everything except for the one thing it specializes in. What we're starting to get at now is backing out, and not to jump into AGI too early, but we're starting to back out a little bit and broaden the scope of what models can do. Or the ability to orchestrate some of these models, so you have one model that works as your controller and knows when to use what other model.

That's an abstract thing that I'm really interested to see how it impacts all of us. At the same time, in development of software, and in research and building models, there's a lot of tedious, annoying work that is necessary to do, but it's not that fun. We're starting to see it's possible to automate some of that. We're not very good at it yet. Copilot or ChatGPT can't really write our code yet, but it is starting to be able to write some of our tests, and it can give you a good starting point in certain languages and on certain simple tasks. I'm excited to see that get better. Basically, I'm excited to give all the boring parts of my work to ChatGPT and focus on the things I'm good at and actually enjoy doing.

AJ:

The implications across industries... I just feel like we're at the tip of the iceberg. We just don't even know where it's going to go, and I can't even think of all the exciting implications. We are at time. Any final parting thoughts about the future of AI?

Matt:

Yeah, it's going to be interesting finding out. What I hope to see is that people are able to be successful in AI, that companies are successful in AI because of innovation and not just because of money. Currently in tech you have four companies that control everything, and I was a little bit let down with some of the investment that happened into OpenAI from Microsoft, but I still think that companies who can't outspend these giants can out-innovate them. And I think we're really in the beginning-of-the-internet days, where there's a lot of opportunity, and some really smart, creative people are going to be able to come up with something really impactful and innovate.

AJ:

Fuck yeah, let's innovate. Oh, thank you so much, Matt. Always love your perspective on things that are not our expertise, and always happy to have you on The Liquidity Event. Thanks so much for listening, everyone. You can email us your financial problems at liquidityevent@brooklynfi.com. Leave us a voicemail at memo.fm/liquidityevent, and of course you can find the show notes, and don't forget to fill out our survey, at brooklynfi.com/episode90. We will see you next week. Thanks for watching.

Presenter:

Thanks for listening to The Liquidity Event, hosted by AJ and Shane of Brooklyn Fi. Head on over to brooklynfi.com where you can subscribe to the podcast or YouTube channel, or if you want to learn about their full service, financial planning, tax, and investment firm specializing in tech professionals and creatives on the path to financial independence. We'll see you next time on The Liquidity Event.