Is my backend team right?🤔

So I'm working on an AI image gen app and I proposed a full-text search feature that would let users search all their generated images by prompt. The problem is our image collection is over 1 billion rows (MongoDB btw). They pushed back and said they tried this in the past and it was not performant because the collection/table is too big, and it would sometimes take up to 8 seconds. Now I believe that's probably because they were running the full-text search over the whole 1 billion+ collection. If we filter by userID first and then run the full-text search on that much smaller set, it would be performant, right? I only know SQL databases so I'm not sure, but wouldn't it work similarly in MongoDB? And if you know a more performant way to do this, please let me know🙏
8 Replies
Rhys (14mo ago)
You can use a database like ElasticSearch, which is designed for searching text. ElasticSearch stores your text in something called an inverted index, so it's extremely fast to search. You can also index the content without storing it to lower your storage usage. ElasticSearch / Typesense / Algeria will work well as a starter search, and then in the future you could move to embeddings with a vector database and search the embeddings.
Anna | DevMiner (14mo ago)
Don't forget about Meilisearch (it can even do vector search)
Luc Ledo (OP, 14mo ago)
We're not at that point yet. Need to make mongodb work for now.
Rhys (14mo ago)
https://www.mongodb.com/basics/full-text-search
Have you made an index on the prompt?
MongoDB
Full-Text Search: What Is It And How It Works | MongoDB
This article explains what full-text search is and how it can be enabled in your database.
Luc Ledo (OP, 14mo ago)
My backend team handles that, but yes, I don't think they would be that incompetent. It's more of a problem with the collection being 1 billion+ rows. Someone proposed this solution, tell me what y'all think.
const { MongoClient } = require('mongodb');

async function searchImages(userID, searchString, page, pageSize) {
  // The useNewUrlParser / useUnifiedTopology options are omitted:
  // they are deprecated and have no effect since driver v4.
  // In a real service, create the client once and reuse it instead of
  // connecting on every call.
  const client = new MongoClient('mongodb://localhost:27017');

  try {
    await client.connect();

    const db = client.db('yourDatabaseName');
    const collection = db.collection('yourImageCollectionName');

    // Combine the user filter with the full-text search.
    // $text needs a text index on the searched field to run at all.
    const searchResults = await collection
      .find({
        userID: userID,
        $text: { $search: searchString },
      })
      .skip((page - 1) * pageSize)
      .limit(pageSize)
      .project({ _id: 0 }) // Optionally, exclude the _id field from results
      .toArray();

    return searchResults;
  } finally {
    // Wait for the connection to close before returning.
    await client.close();
  }
}
Basically, I believe that with this solution it would not matter how big the collection is, since we are using limit to take only x amount.
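One note on the idea above: as far as I know, limit only caps how many documents come back, and a $text query always runs through the text index. So if that index covers prompts for all users, the userID filter is applied after the text match, not before it. What usually makes "filter by user first" real in MongoDB is a compound text index with userID as an equality prefix. Here is a minimal sketch of that, assuming the prompt lives in a field called `prompt` (a guess, the thread never names it). The helper names and the index name are made up for illustration. Also keep in mind that a collection can only have one text index, so an existing one on the prompt would have to be replaced, and building it on 1 billion+ documents is a heavy operation your backend team would need to plan for.

```javascript
// Assumptions: the prompt text is stored in a `prompt` field, and `collection`
// is a Collection object from the official `mongodb` Node.js driver.
// All helper names and the index name below are hypothetical.

// Compound text index: equality prefix on userID, then the text key.
// With this shape, every $text query must include an equality match on userID,
// and the search is confined to that user's index entries.
const PROMPT_INDEX_SPEC = { userID: 1, prompt: 'text' };

async function ensurePromptIndex(collection) {
  // createIndex does nothing if an identical index already exists.
  return collection.createIndex(PROMPT_INDEX_SPEC, { name: 'user_prompt_text' });
}

function buildSearchFilter(userID, searchString) {
  // Equality on the prefix key plus the text search itself.
  return { userID, $text: { $search: searchString } };
}

async function searchUserImages(collection, userID, searchString, page = 1, pageSize = 20) {
  return collection
    .find(buildSearchFilter(userID, searchString))
    // Project and sort by relevance so that pages are stable between calls.
    .project({ _id: 0, score: { $meta: 'textScore' } })
    .sort({ score: { $meta: 'textScore' } })
    .skip((page - 1) * pageSize)
    .limit(pageSize)
    .toArray();
}
```

To check whether the index is actually doing the work, running the same filter with `.explain('executionStats')` in the shell and comparing keys/documents examined against documents returned should show it, rather than taking my word for it.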
West side ⁉ (14mo ago)
Algeria?
Rhys (14mo ago)
Algolia
Site Search & Discovery powered by AI
Create AI-powered search & discovery across websites & apps.
West side ⁉ (14mo ago)
oh gotcha