May 2, 2024

The Essential Role of Vector Databases in LLMOps

Sage Elliott

Zilliz provides enterprise-grade AI technologies, including one of the world’s most popular open-source vector databases, Milvus. They specialize in creating AI/ML applications that are easy and fast to develop, regardless of the organization’s size. In this fireside chat, we talk with Yujian Tang, a Developer Advocate at Zilliz about the Essential Role of Vector Databases in LLM applications.

This Union AI Fireside Chat Covers

Introduction to Yujian Tang, developer advocate at Zilliz
What is Retrieval Augmented Generation (RAG) for LLMs?
What is LLMOps (Large Language Model Operations)?
What is a Vector Database?
Why Are Vector Databases Essential for LLM applications?
What is Multimodal RAG?
What is your LLMOps Developer Stack?
What is Zilliz Building around Vector Databases and LLMs?
What are you looking forward to in the AI Space?

👋 Say hello to other AI and MLOps practitioners in the Flyte Community Slack.

⭐ Check out Flyte, the open-source ML orchestrator on GitHub.

Full AI interview transcript of ‘The Essential Role of Vector Databases in LLMOps’

(This transcript has been slightly edited for clarity.)

Sage Elliott:

My name is Sage Elliott. I'm going to be the host of today's Fireside Chat. Our topic today is the essential role of vector databases in LLMOps. We have our guest, Yujian Tang, here with us.

Before we start the conversation, I'll also shout out: if you are working on something interesting, especially around the AI space or machine learning, and you want to come on and give a fireside chat like this, reach out to me on LinkedIn or in our Flyte community Slack, which is also in the video description. It's around a 30-minute-plus event that usually goes over a little bit.

And with that, I will get into our conversation today. Thank you so much for coming, Yujian.

Yujian, it's always a pleasure to work with you and listen to you talk about vector databases. You're the person I know who knows the most about them, and you always answer questions well. So, I'm excited to have you here.

Before we get into vector databases, please provide an introduction and tell us a little about your background and what you're currently working on.

Introduction: Yujian Tang, Developer Advocate at Zilliz

Yujian Tang:

Yeah. Thanks for having me, Sage. I always love doing these streams with you. You always make these really nice and really easy, and I always feel prepared for them. So yeah, my background is mostly that I did a lot of contests, and that was actually how I got into software engineering.

I won some computer science contests, and I did some research on machine learning in college. I did a bunch of research on computer vision and data integrity. Then I worked on the AutoML system at Amazon for a while before getting into startups. I enjoy working on the cutting edge and all the cool AI stuff I get to work on now.

I've been at Zilliz since April of last year, and since then, I've been building a bunch of RAG apps on top of vector databases.

What is Retrieval Augmented Generation (RAG) for LLMs?

Sage Elliott:

And what is RAG? We'll probably get more into this later, but in case people don't know what RAG is.

Yujian Tang:

Yes, retrieval augmented generation (RAG). Someone I had a call with in December told me that there's both RAG and the opposite, which is "GAR." I think it's “generative augmented retrieval” or something like that. The difference between these is the order in which you're using the LLM.

RAG is “retrieval augmented generation”. What that means is you go and retrieve stuff from the database before you send it to the LLM, and then you generate something. With the other one (GAR), you generate something first and then retrieve something from the database based on what you generated. I assume that's more like playing Jeopardy: you take an answer and you say, “Give me the question.”

And then you have a vector database full of questions and you pull those out, or something like that.
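To make the ordering difference concrete, here is a minimal sketch. The `retrieve` and `generate` functions are hypothetical placeholders (a word-overlap lookup and an echo), not a real vector database or LLM client; only the order of the two calls is the point.

```python
# Toy stand-ins: `retrieve` and `generate` are hypothetical placeholders,
# not a real vector database or LLM client.
DOCS = [
    "Milvus is an open-source vector database.",
    "Flyte is an open-source ML orchestrator.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by how many lowercase words they share with the query."""
    words = set(query.lower().split())
    ranked = sorted(DOCS, key=lambda d: -len(words & set(d.lower().split())))
    return ranked[:k]

def generate(prompt: str) -> str:
    """Placeholder for an LLM call; it just echoes the prompt it was given."""
    return f"<LLM output for: {prompt}>"

def rag(question: str) -> str:
    # Retrieval first, then generation with the retrieved context.
    context = " ".join(retrieve(question))
    return generate(f"Context: {context} Question: {question}")

def gar(question: str) -> list[str]:
    # Generation first, then retrieval driven by the generated text.
    draft = generate(question)
    return retrieve(draft)
```

In `rag` the database is consulted before the model runs; in `gar` the model's output becomes the search query.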

Sage Elliott:

Interesting. I hadn't heard of it called GAR before, but it makes sense you might want to generate a response and retrieve data in that order.

What is LLMOps (Large Language Model Operations)?

Sage Elliott:

All right. So, our main topic today is the essential role of vector databases in LLMOps. Before we get into more of the vector database part, maybe we can talk a little bit about what LLMOps is. So if you could describe what LLMOps is, that would be great!

Yujian Tang:

Yeah, I think, you know, over the last few years we had a rise of MLOps companies. We already know that we need MLOps when we do ML in production, and now, with the introduction of LLMs, we have this new class of model.

I don't really know if I would call them new types of models, but with new classes of models like these large language models, we know that we have to run these in production too. We have to have auth0 and basically DevOps, MLOps, infra ops, whatever. But one of the things that is different between the LLM stuff and the MLOps stuff is that you're not only looking at the steps to train the model, host it, and run inference on it.

I mean, you have to take care of some of that stuff as well. But you also think about things like, how is my model performing in production? Because the focus with LLMs is really that they allow you to build these new, interesting applications that you weren't able to build before.

Sage Elliott:

Yeah. And it's interesting, because it's almost harder than in a lot of other ML applications to evaluate performance in production and then keep improving it. Like you said, this ties into a lot of other ML applications too. You have not only training and retraining, which you probably do if you're hosting your own model but probably not if you're using an off-the-shelf API, right?

But you might also want to tie things together. We'll talk more about how vector databases are used, but that's usually a pretty big piece of a lot of applications right now. Same with prompt engineering. And then also, how do you measure whether that prompt engineering, or that prompt you changed, is affecting your model in a positive way and doing the things you want?

It's a very interesting problem right now. There are all these different tools that we can tie together, and the LLMOps piece depends on what your application is, but it's about tying these different tools together. And usually I think part of that is ensuring your models are doing the thing you want, and then changing things over time to make it better.

Yujian Tang:

Yeah, yeah. A lot of it is changing not just the prompts and stuff, but also your configurations of how you're using it. I can't remember if this question is going to come up later, but for example, one of the things that we do with vector databases is the whole chunking and embedding and preprocessing of your data, and that is something that you have to tune as part of your whole LLMOps thing in the LLM stack.
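As an illustration of chunking as a tunable step, the sketch below splits text into word-based chunks. The function name `chunk_text` and the parameters `chunk_size` and `overlap` are assumptions made for this example, not settings from any particular library; they stand for the kind of knobs one might tune.

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into word-based chunks of `chunk_size` words,
    where consecutive chunks share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks
```

Changing `chunk_size` or `overlap` changes which passages end up in each embedded chunk, which is why it tends to be tuned alongside prompts.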

What is a Vector Database?

Sage Elliott:

Yeah, it makes a lot of sense. All right. So before we go into tying it even more into LLM applications, what is a vector database? I think you always have a really good explanation of this. I've seen different variations of your answer in different talks in the past. Sometimes you use Taylor Swift as an example, sometimes you use something else.

But for people who don't know: what is a vector database, and how is it different from a regular database?

Yujian Tang:

I wish I had my visual of Taylor Swift. I think it's a fun visual. It gives you the exercise of being a vector database yourself. But vector databases are essentially different from regular databases in the way they operate and the data they operate on. Vectors are a series of numbers, and vector databases are built to operate on these series of numbers: to compute and find the closest other series of numbers.

Most of the time when we work with vector databases, the vectors that we're working on are vector embeddings, and vector embeddings are the numerical representation of the meaning behind the input data to a specific model. Embedding models, which we'll probably hear a lot about, especially when it comes to RAG, are the models that produce these vector embeddings.

The way they work is you train your model to do something. Typically a model does some sort of prediction or some sort of classification, and that's what the last layer of the model does. The very classic example is the MNIST dataset, right? These are a bunch of handwritten digits from 0 to 9. The last layer is ten neurons, so it's ten outputs, and the one with the highest probability is the one that you get as your response. If you were to get the vector embedding from this, you would remove that layer and take the output from the second-to-last layer, and that would be your vector embedding.
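A tiny sketch of that idea follows, under heavy assumptions: the network is 4 inputs, 3 hidden units, and 2 classes (not MNIST's ten), and the weights are made up rather than trained. It only shows that the embedding is the output of the second-to-last layer, with the classification layer left off.

```python
import math

# Made-up weights for a tiny network: 4 inputs -> 3 hidden units -> 2 classes.
W_HIDDEN = [[0.5, -0.2, 0.1, 0.0],
            [0.3, 0.8, -0.5, 0.2],
            [-0.4, 0.1, 0.9, 0.7]]
W_OUTPUT = [[1.0, -1.0, 0.5],
            [-0.5, 0.7, 1.2]]

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def hidden_layer(x):
    """Second-to-last layer: linear map followed by ReLU."""
    return [max(0.0, v) for v in matvec(W_HIDDEN, x)]

def classify(x):
    """Full model: hidden layer, then the final classification layer (softmax)."""
    logits = matvec(W_OUTPUT, hidden_layer(x))
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def embed(x):
    """Embedding: drop the classification layer and return the hidden activations."""
    return hidden_layer(x)
```

`classify` returns one probability per class, while `embed` returns the richer intermediate vector that a vector database would store.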

And the reason we take that embedding layer is that as input comes through a network, each layer of the network learns something new about the input, and the last layer before the output contains all that information. That's what the vector is, and that's the data that vector databases operate on. The main difference, when you think about why you would want a specific kind of database or database architecture to do this, is the way we currently work with data. With SQL databases and NoSQL databases alike, we're really doing a lot of key matching, right?

When you're using a SQL database, you're selecting something from some table where some set of parameters match, so you're optimizing for the ability to do a quick match and basically do entity-relationship modeling on the fly. With NoSQL databases, the idea is that I don't know what attributes these entries may have, but I can map all these entries, these IDs, and allow a flexible attribute schema.

So NoSQL databases are also optimizing for this ability to do key matching. When it comes to vectors, you're doing a bunch of computations, because the only way that you can evaluate how far apart vectors are in space is by doing a bunch of computations. And nothing that is implemented for key matching is optimized to do these kinds of computationally expensive tasks.
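The contrast can be sketched in a few lines. The data below is invented for illustration, and the brute-force scan is only the simplest possible approach; production vector databases typically use approximate indexes rather than scoring every vector.

```python
import math

# Key matching: a dict (hash index) answers "which row has this exact key?"
rows_by_id = {"u1": "Ada", "u2": "Grace"}

# Vector search: no exact key exists, so every stored vector must be scored.
vectors = {"doc_a": [0.0, 1.0], "doc_b": [1.0, 1.0], "doc_c": [5.0, 5.0]}

def nearest(query, store):
    """Brute-force nearest neighbor by Euclidean distance."""
    return min(store, key=lambda name: math.dist(query, store[name]))
```

The lookup `rows_by_id["u2"]` needs no arithmetic at all, whereas `nearest` computes one distance per stored vector, which is the cost that dedicated vector indexes try to reduce.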

Sage Elliott:

And what would be an example with language? What would a vector look like between maybe two words? I think the example that often comes up in books is “queen” and “king” and how far they are from each other. There's some distance between them, because “queen” and “king” are different words, but they're both associated with a position in royalty or something.

And so they have a space between them that is relatively small, versus something completely different, like a potato. A potato is not a king or queen. It's not a person and it's not a position. And if that number is a distance measuring how related things are, the number between a potato and a king would probably be very large, right?
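A toy version of the king, queen, and potato comparison is sketched below. The three-dimensional vectors are hand-made for illustration (roughly "royalty", "person", "food") and are not the output of any embedding model; cosine distance is one common choice of metric, assumed here.

```python
import math

# Hand-made 3-dimensional "embeddings" (royalty, person, food); not model output.
VECTORS = {
    "king":   [0.9, 0.8, 0.0],
    "queen":  [0.9, 0.9, 0.1],
    "potato": [0.0, 0.1, 0.9],
}

def cosine_distance(a, b):
    """1 minus cosine similarity: near 0 for similar directions, near 1 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```

With these made-up numbers, the distance between "king" and "queen" comes out far smaller than the distance between "king" and "potato".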

Yujian Tang:

Yeah. That is also one of my favorite visual examples to show potatoes. You know, the whole game. We could, we could start using a yam and regular potato. I don't know what other kind of potatoes…

Sage Elliott:

I kind of like that, actually. I might use food groups from now on. I feel like food groups would be a good example of a relationship between vectors.

Yujian Tang:

Yes. And I feel like it might even cause more debate in the audience. See what people have strong opinions about whether or not food belongs to certain groups.

Sage Elliott:

How far is a hotdog from a sandwich? That's going to be fun. That's a good question for a vector database to answer.

Yujian Tang:

My God, what if we get into the whole “is a hotdog a taco” thing? This is going to be fun.

Sage Elliott:

It will be fun. We could do a whole different talk; we can do it at some point. But tying it back in here, I think the example is that the distances between those words live in a vector space, and you're storing those vectors in a vector database, which is optimized to search and return them very fast.

Yujian Tang:

So the usual natural language example that I give for this is, let's say you have a report about a company. You're going to have words like “profit”, “bottom line”, and “profitability”, and these all mean basically the same thing. If you were to use a keyword search, you would have to go through three different keyword searches to find them.

Or if you are using NLP-powered keyword search, maybe you'll have to do two. You can look for “profitability”. But with vector search, the idea is that you can just say “profit”. In vector space, “profit”, “profitability”, and “bottom line” are all basically right next to each other, so I can pull all of these words back from one search.
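The difference between the two kinds of search can be sketched as follows. The two-dimensional vectors and the `max_distance` cutoff are invented for this example; a real system would get vectors from an embedding model and usually return the top-k nearest results instead of using a fixed radius.

```python
import math

# Hand-made 2-dimensional vectors; a real embedding model would produce these.
TERMS = {
    "profit":        [0.95, 0.05],
    "profitability": [0.90, 0.10],
    "bottom line":   [0.85, 0.20],
    "headcount":     [0.10, 0.95],
}

def keyword_search(query):
    """Exact keyword match: only identical strings are returned."""
    return [term for term in TERMS if term == query]

def vector_search(query, max_distance=0.3):
    """Return every term whose vector lies within max_distance of the query's."""
    q = TERMS[query]
    return [t for t, v in TERMS.items() if math.dist(q, v) <= max_distance]
```

One keyword search for "profit" finds only "profit", while one vector search for "profit" also brings back "profitability" and "bottom line" and leaves out the unrelated "headcount".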

Why Are Vector Databases Essential for LLM applications?

Sage Elliott:

Yeah, that makes a lot of sense. So this is a perfect segue into our next question. Why are vector databases essential for so many LLM applications? And we've kind of touched on this a little bit, I think in some of our other questions here, but can you dive a little deeper into how they're used, why it's important for LLM applications, and why so many people use them? Maybe give an example of how it adds value to an LLM.

Yujian Tang:

I guess the main use case that we've seen for vector databases in LLMs and LLM applications recently is the Retrieval Augmented Generation (RAG) use case. Most of the time that's in the form of a chatbot. What vector databases do here is this: you'll have an LLM, and your LLM is able to take data and text, format it, and give you a human-readable response.

It has a lot of these cool, fancy features, but it still doesn't have access to your data. So the primary use case for a vector database in that stack is that it can give an LLM this new knowledge about your data.

You can take the text query, vectorize it, and use similarity search. Or maybe you vectorize just some part of the query. Then you use the vector database to do a similarity search and get back all of the information you have that is most similar to whatever it is that you need to query about.

So that's the use case for vector databases with LLMs and RAG. That's the primary use case.

Sage Elliott:

So as an example, say someone is querying a chatbot and asking a question like “how do I use a vector database?” There is a vector database involved that gets that query and looks for similar questions or similar information.

So you could store previous questions and answers in the vector database. You could also just store a lot of documentation, and when someone asks that question, it's going to look at these documents and say, Well, this seems like the most similar document or the most useful one for this type of question. I'm now going to bring in this information from that document to the LLM.

Most people here have probably used ChatGPT, even if they haven't used a lot of other LLMs. When you're asking ChatGPT questions, it has a history of what you've previously asked, so it knows the context around what you're talking about. That's basically just storing your whole text history, and it has context because that history becomes part of the prompt. That's kind of what's happening here with the vector database, right? You're just bringing in this external information. You say, here's a lot of context that maybe you didn't have before, and we think it's going to be helpful for you to answer this question. Then it generates an answer, but with this extra context that it was able to retrieve very fast from a vector database.
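The flow described here, retrieving the most similar document and making it part of the prompt, might look roughly like this. Everything in the sketch is an assumption for illustration: the word-count `embed` function stands in for an embedding model, the in-memory `DOCS` list stands in for a vector database, and the prompt wording is arbitrary.

```python
import math
from collections import Counter

DOCS = [
    "A vector database stores embeddings and finds the most similar ones.",
    "Flyte orchestrates machine learning workflows on Kubernetes.",
]

def embed(text):
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(text.lower().replace(".", "").replace("?", "").split())

def similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_prompt(question, k=1):
    """Retrieve the k most similar documents and place them in the prompt."""
    q = embed(question)
    top = sorted(DOCS, key=lambda d: similarity(q, embed(d)), reverse=True)[:k]
    context = "\n".join(top)
    return f"Answer using this context:\n{context}\n\nQuestion: {question}"
```

The string returned by `build_prompt` is what would be sent to the LLM, so the model answers with the retrieved text in front of it.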

Yujian Tang:

Yes. That is pretty much exactly how that works. I guess there could be some interesting things to do with that. For example, when you were talking about saving the questions and the answers, that's why we created this thing called GPTCache, which is a semantic cache that allows you to hit cached results before you go to the LLM.

So there are similar methods that people are using around things exactly like that. But I think the most basic way to think about it is: if you were to ask “what is a vector database?” and you have a vector database full of documents, then your app would say, okay, use the vector database, pull the relevant documents, and then do exactly what you just said, which is “answer this question with this context”, and then it'll format the answer to be human readable and send it back.

What is Multimodal RAG?

Sage Elliott:

Very cool. And it's funny because this is a question I asked in our last fireside chat, and I don't think it's a question that's going to be asked in many more of them. But we've mentioned RAG a few times, and if you're working in this space, you probably see it come up all the time. Something we see coming up now a lot is “Multimodal RAG”. And I was wondering if you can answer or touch on what is multimodal RAG?

Yujian Tang:

Yeah. So as far as I can tell, there's no industry agreement on what multimodal RAG is right now. But I think there are many ways you can implement something and it would be multimodal RAG. Some of these ways, and this is basically what I'm going to be working on, or at least what I would like to be working on this quarter, include things like this: let's say you have images and text, and let's say you're making a chat app about Seattle.

You have images of Seattle. You've got the Space Needle, Lake Union, and a bunch of other things. And then you also have text descriptions of Seattle. A multimodal RAG for this would allow you to search the images as well as the text descriptions.

So you have descriptions like the history of Seattle, and the images of the landmarks. The most basic way to do that is to have another model generate descriptions of the images, embed those texts and save them as the vector embeddings, and then include a link to the image inside of your entry.

And then when you are searching, you can say, “show me some landmarks in Seattle” or whatever, and it'll say, “Here's a picture of the Space Needle” or something like that. That would be an example of the most basic form of multimodal RAG. Other versions would include things like: maybe you have images and videos, and you have a certain image you want to find in some videos. Or maybe you have videos and you want to search other videos. Or maybe you have images and you want to search through images and text. Or your text descriptions generate images and you want to find the closest text or the closest images. Things like that would be examples of what multimodal RAG is.
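The "most basic" version described above, searching over generated captions and returning the linked image, can be sketched like this. The captions, the image paths, the word-set `embed` function, and the Jaccard score are all assumptions for illustration; in practice the captions would come from a captioning model and the vectors from a text embedding model.

```python
# Each entry pairs a caption (assumed to come from an image-captioning model)
# with a link to the image. Paths and captions here are made up.
ENTRIES = [
    {"caption": "the space needle tower, a landmark in seattle", "image": "images/space_needle.jpg"},
    {"caption": "boats on lake union in seattle", "image": "images/lake_union.jpg"},
]

def embed(text):
    """Toy embedding: the set of lowercase words. Stand-in for a text embedding model."""
    return set(text.lower().replace(",", "").split())

def jaccard(a, b):
    """Overlap between two word sets, from 0 (disjoint) to 1 (identical)."""
    return len(a & b) / len(a | b)

def search_images(query):
    """Search over caption embeddings and return the linked image of the best match."""
    q = embed(query)
    best = max(ENTRIES, key=lambda e: jaccard(q, embed(e["caption"])))
    return best["image"]
```

The search itself only ever touches text; the image is reached through the link stored alongside the caption's vector.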

Sage Elliott:

Is it almost like you're combining different vector spaces together? In this case you have images and text: you have the Space Needle and a really good description of it. Then let's say you take your own photo of the Space Needle and do some sort of search with that photo, asking, “hey, what is in this?”

Instead of just having a very basic captioning result like “well, it's a building” or “it's a tall building”, it's going to query a vector database and say, “hey, this image is pretty similar to these other ones we have in here”, and associated with those we have all this other context. So instead of just saying it's a building, we can say this looks like the Space Needle, it's located here in Seattle, and give a detailed description of it.

Yujian Tang:

That would be one way that you could implement something, and it would be a version of multimodal RAG. In terms of the vector space thing, there are ways to include multiple spaces. Milvus offers a way to include multiple vector spaces. It might not be out yet in Milvus, but it will be coming if it's not released yet. You can do dense vectors, sparse vectors, multiple vectors, and things like that. Actually, yeah, this is released. So that's one way to think about it. But I wouldn't say it's combining vector spaces. It offers you the ability to search multiple different vector spaces.

And then in the example that I gave of taking an image, turning it into text, and using that as a vector, what you're actually doing is more or less mapping vector spaces, right? You have the vector space of the images, but what you're actually using is the text description. So you're trying to create this mapping, and one of the challenges with this is that you're never going to get a 1 to 1 mapping.

Part of the reason for this is that there's generation involved, and that's statistics, right? We're already kind of messing with some probabilities. I hope I don't say anything that's mathematically impossible here, but I think the only way you can actually get a 1 to 1 mapping is with something that is deterministic, right?

So you would have to use something that is not a machine learning model to do that. The other challenge with the concept of combining vector spaces is that you can only compare vectors of the same length, and typically image models and text models have different-length vector outputs. I have actually heard of some people saying, let's just tack them on to each other and do vector search that way. I haven't heard anything about the results, but I've heard that people have tried it.
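The same-length constraint, and the "tack them on to each other" idea, can be shown with a small sketch. The vector sizes (3 for the image side, 2 for the text side) are arbitrary, and nothing here says whether concatenation gives good search results; as noted above, that is untested hearsay.

```python
import math

def cosine(a, b):
    """Cosine similarity; only defined for vectors of the same length."""
    if len(a) != len(b):
        raise ValueError(f"cannot compare vectors of length {len(a)} and {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def concat(image_vec, text_vec):
    """The 'tack them on' idea: one longer vector per item."""
    return list(image_vec) + list(text_vec)
```

Comparing a 3-dimensional image vector with a 2-dimensional text vector raises an error, while two concatenated 5-dimensional vectors can be compared like any others.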

Sage Elliott:

That's interesting. Yeah, I think when I said combining, I probably meant mapping, but it's interesting that that's a thing too. And I guess you could add the space between, or tokens or whatever, to your vector and tack them on. I'd actually have to look at a paper and see how that works.

What is your LLMOps Developer Stack?

Sage Elliott:

So you've been building some really cool projects with LLMs. Could you tell us a little bit about your development stack around LLMs or LLMOps? I know you bring in vector databases a lot of the time. Are there any other models, tools, or libraries that you're enjoying and that you'd suggest other people look into if they want to build some sort of LLM application, whether it's for themselves or something they want to put in production?

Yujian Tang:

So strangely, Milvus is one of those tools aimed at enterprise grade, all that kind of stuff. And most of the tools in the LLM stack are not. Most of the tools in the LLM stack are pretty new, the most fun ones anyway.

I really enjoy using those. That's mostly what I use. And then I use Hugging Face a lot for different LLMs or embedding models. You know, it's got the mini models and it's also got a bunch of other open source models. I think even the Llama models are on there now. I think Deci has an LLM on there too.

There are quite a few other LLMs on there too, so that's my main place for testing LLMs. Most of what I've been doing is playing around with LLMs and models. Other parts of the LLM stack that I think you need to use, especially if you are going to go into production, are eval tools, right?

I can give a shout-out to WhyLabs here, Arize, Aporia. And then for orchestrating and building more machine learning model stuff, you would use, what?

Sage Elliott:

Yeah. And I totally forgot to even include a blurb at the intro here that I work at Union, which is where this fireside chat is run, and we help maintain the open source project called Flyte, which you can check out on GitHub. It is built around orchestration, and it can help tie all these tools together and make sure your application can scale. It's built with a Kubernetes backend, and we also offer a managed solution.

But back to Yujian as well. I think there are links to most of the stuff around Zilliz and Milvus in the description if you're watching this, but I also just put one in chat. You can check out the GitHub link for Milvus, and as someone who has GitHub repos for their work, I know they always appreciate a star, so give Milvus a star for being open source.

Yujian Tang:

All the open source projects are like, give us stars, we need them!

Sage Elliott:

Stars are fun. Yeah, so check out the open source vector database Milvus, which is awesome. It has a ton of stars. A lot of people use it. It's production ready, like you mentioned. It's not just something that you can play around with. If you're looking for a vector database to bring into production, that's one you can definitely go look at and bring in.

Yujian Tang:

You know, Salesforce, Walmart, eBay. It has some big users and use cases.

Sage Elliott:

Yeah. Awesome. And you mentioned Hugging Face, which is amazing. I feel like they have most of the models right now. I think even Llama 2, I believe, you can at least run in a Hugging Face Space. That's my go-to place if I'm not using the OpenAI API, or if you want to fine-tune your own model and host it somewhere.

I think even recently, just this week, Microsoft's Phi-2, which is a really small LLM, is on there as well, I believe.

Yujian Tang:

Dude. Okay. So recently I've also heard this thing like small language models and I'm just like, okay, 1 to 7 billion parameters is what people are now calling small language models. And I was like, That's pretty big.

Sage Elliott:

Yeah, LARGE language models. And now they're so big that people are saying the smaller ones, which were the original large language models, are small at this point. That is kind of funny, because I think Phi-2 is 2 billion parameters and the Deci LLM is, like, 7 billion. So if you are looking for some cool small ones, you should go check those out too and see if they are right for you and your use case.

Yujian Tang:

I also just tried out one from Symbl AI. I think they're also, I don't know if they're based in Seattle, but at least they have some people in Seattle. And so one of their guys was like, Hey, we have this and we should check it out. And I was like, All right, we'll give it a go.

Sage Elliott:

Oh, yeah, I think they are based here. I could be wrong. I know at least some of their leadership is around in Seattle.

Yujian Tang:

Cool.

What is Zilliz Building around Vector Databases and LLMs?

Sage Elliott:

All right. So we've talked a little bit about this, and you mentioned Milvus and stuff. Can you tell us more about what Zilliz is doing?

Yujian Tang:

With regard to Milvus or just at a general level?

Sage Elliott:

Maybe a general level? I mean, you could talk more about Milvus if you want, but also, you know, Zilliz is the company behind Milvus. And I think you're building more than just the database. So you can mention both if you want, or dive deeper into a project that you're really excited about.

Yujian Tang:

Okay, cool. So what is Zilliz doing? Obviously, we maintain Milvus. This is our main project, our open source vector database. So this is what I like for now. But outside of maintaining Milvus, we obviously commercialize Milvus, since we are a company. So Zilliz Cloud is the cloud-managed version of Milvus. It's hardware optimized, and we do a bunch of benchmarks around all this stuff.

We do a lot of events, so we do like the unstructured data event that we're going to be talking about next week, which we have in both S.F. and Seattle. And I really want to get that going in New York as well. That would be really nice. Let's see what else. We also have a bunch of other projects that we have worked on.

So for example, earlier I mentioned GPTCache. GPTCache is actually really popular in terms of downloads, which is really surprising because we don't really talk about it and we don't do much with it. But it is a semantic cache, and I think it's one of the only semantic caches out there. Basically we created it because we were running up a huge OpenAI API bill and we were like, yeah, let's not do that.

Sage Elliott:

Can you touch a little bit on what that is and why people would use it?

Yujian Tang:

Yes, yes. So let's say you have a public-facing application. In our case we have this thing called OSSChat, at OSSChat.io. It chats about OSS. If you get a lot of questions, and many of these questions are frequently asked and very similar, then it does you a lot of good to not have to call the LLM to answer the questions every time. It does you a lot of good to save the questions and basically have them in some sort of LRU-type cache.

And since we are a vector database company, we used a vector database for our semantic cache, so that we can save the semantic values of the queries and not just the keywords. This allows us to basically filter out a lot of the frequently asked questions, which we also ask a lot ourselves when we're testing.

So instead of paying thousands, I think it was actually tens of thousands of dollars in OpenAI bills every month, we got this down to under 10,000. Actually, I'm pretty sure it's only a few hundred. I think it's pretty low now. So yeah, it's basically a cost optimization tool, which is probably why it's popular.

Sage Elliott:

You're basically storing responses as a cache. And then when someone's asking that same question, you can just pull from that cache, that answer. You don't have to query an API or your model.

Yujian Tang:

Yep. So maybe a lot of people suddenly want to learn how to build neural networks in PyTorch, and you've got 100 people asking the same question. Well, they probably ask it a little differently every time: different capitalization, maybe different spacing, maybe different wording, etc. But because you have a cache backed by a vector database with vector search, the semantic search, it doesn't matter if they're all worded differently. You can still get the same response back.
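A minimal sketch of the semantic cache idea follows. It is not the API of any real caching library: the `SemanticCache` class, the `threshold` value, and the word-count `embed` function are all assumptions for illustration. The toy embedding only absorbs differences in capitalization, spacing, and punctuation; matching genuinely different wording would require a real embedding model.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: lowercase word counts with punctuation stripped."""
    cleaned = "".join(c for c in text.lower() if c.isalnum() or c.isspace())
    return Counter(cleaned.split())

def similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class SemanticCache:
    def __init__(self, llm, threshold=0.8):
        self.llm = llm              # callable that answers a question (the expensive call)
        self.threshold = threshold  # minimum similarity to count as a cache hit
        self.entries = []           # list of (question embedding, answer)

    def ask(self, question):
        q = embed(question)
        for cached_q, answer in self.entries:
            if similarity(q, cached_q) >= self.threshold:
                return answer       # hit: skip the LLM call
        answer = self.llm(question)
        self.entries.append((q, answer))
        return answer
```

Every cache hit is one paid model call avoided, which is where the cost savings described above would come from.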

Sage Elliott:

Awesome. That makes a lot of sense. It sounds like a great feature, and I totally get why it's popular.

Yujian Tang:

Yeah. And then the other thing that we've been working on is pipelines. So we just released Pipelines, which are available to people who are using Zilliz Cloud's free tier. Zilliz Cloud has a free tier for as long as we are in operation, and we added Pipelines to it so people can easily get started with piping their data into vector databases, because this is one of the biggest questions I got last year.

It was just like, "How do I get my data into a vector database? Can I just give the vector database my data?" People were like, "I'm sending JSON, so it should work like a database, but it's not working for some reason." And it's like, well, you need to vectorize the data. And then people say, "How do I vectorize the data?"

So pipelines make it so that people can do this: "I'm just going to give you some data."
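To make the "you need to vectorize the data" step concrete, here is a minimal sketch of what an ingestion pipeline does conceptually: take raw JSON records, turn a text field into a vector, and produce rows that a vector database could store. It does not use or represent the Zilliz Cloud Pipelines API. The names `embed`, `run_pipeline`, `text_field`, and the dimension `DIM = 16` are assumptions for illustration, and the hashed bag-of-words vector is a toy stand-in for a real embedding model.

```python
import hashlib
import json
import math

DIM = 16  # hypothetical vector dimension for the example


def embed(text, dim=DIM):
    """Toy embedding (assumption): hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def run_pipeline(raw_json, text_field="text"):
    """Parse JSON records and attach a vector to each one.

    The returned rows have the shape a vector database typically
    expects: an id, a vector, and the original record as payload.
    """
    records = json.loads(raw_json)
    rows = []
    for i, record in enumerate(records):
        rows.append({
            "id": i,
            "vector": embed(record[text_field]),
            "payload": record,
        })
    return rows
```

Raw JSON on its own has no vector field to search over, which is why sending it straight to a vector database does not work until a step like `embed` has run.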

What are you looking forward to in the AI Space?

Sage Elliott:

Cool. All right. So one last question that I like to ask every guest: what are you looking forward to in the field that maybe isn't vector database related? I mean, I guess you could use a vector database to build it. But what is something else you're excited about right now in the industry?

Yujian Tang:

I think there are a few things I'm excited about. One is that I'm really excited about this blow-up of more open source projects. I've always been a fan of this kind of grassroots, bottom-up movement type of setup, and I think that open source is really good for that.

I'm also very happy to see more interest in the community. A lot of that interest is coming from Web3.0, so that's an interesting thing to note. So yeah, I'm really excited about that. And then very far off into the future, I just can't wait for the singularity, you know?

Sage Elliott:

And before that, maybe like a robot that can do dishes or something like that.

Yujian Tang:

Yeah, maybe. Oh my God, wow. A robot that could do the dishes and put them away, that would be really nice, you know?

Sage Elliott:

So someone asked about sharing the links that you mentioned. I shared some, but I think they went into the YouTube chat, not the LinkedIn one. We're going to wrap up here soon, and after this I'll make sure the links that I shared on YouTube go into the LinkedIn stream as well. So whoever asked that, I'll put them in a little bit later, and you can go click on those links to Milvus, the meetup, and stuff like that.

Awesome. Thank you so much for coming on. It's always a pleasure having you around to answer questions about vector databases. Anything else you want to shout out before you leave? I think we've already covered some of the upcoming events and stuff. Anything else you want to give a quick shout-out for?

Yujian Tang:

Yeah, I'm just waving to Mert in the audience.

Sage Elliott:

Shout out to Mert!

Yujian Tang:

One more thing to shout out. I'm going to be hosting a hackathon series this year. The first one is going to be January 27th. Not everything's set up for it yet, which is why it isn't announced. So this is your pre-announcement announcement.

Sage Elliott:

Very cool. And when you have that announced, let me know, and I can also include a link in the show notes for anyone who's coming back and watching this later. So with that, I think we'll go ahead and wrap it up. Thank you again, Yujian, for coming on. Thank you again to everyone who came and listened and anyone who asked questions. I'll make sure I update the description and LinkedIn comments with some of the links that may have been missed before.

So with that, I'll go ahead and end the stream. Check out our community in the description as well. If you want to come join the Flyte community, it's kind of a general MLOps Slack that you can join. I'm also hosting a book club, where right now we're reading Designing Machine Learning Systems and meeting weekly on that. So if that's interesting to you, check out the community link in the description as well. With that, I'm going to go ahead and hit end stream. It might take a minute to actually wrap up here.

Yujian Tang:

Cool. Thank you.

Watch or listen to the full conversation on YouTube:

The Essential Role of Vector Databases in LLMOps
