Looking for AI models for my SaaS
I'm looking for a fast, lightweight, and free (or very cheap) AI model for the first stage of my SaaS, which is similar to Notion or Obsidian. The AI should quickly format and organize programming notes, making them clean and structured without delay.
I don’t need anything super smart—just something fast, efficient, and low-cost. It should work instantly so developers can take notes without interruption.
I’m open to free, self-hosted models or cheap API alternatives. If there’s a way to define rules for formatting, that would be a bonus.
What are the best budget-friendly options for this?
9 Replies
Gemma 3 4B - API, Providers, Stats
Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling.
I came across this model
Seems good
I don’t know your exact use case, but you could also check out much smaller models (like R1 1.5B, Llama 3.2 1B, or the Gemma you linked, but in the 1B size). You could even bundle one into your app, the way Craft is doing it.
Imagine you're a dev and you want to document your work, or you're studying something, but you want to write very fast: you can leave spelling errors and write everything in plain text, without worrying about headings or bullet lists. Then, when you're done writing (or whenever you feel like it), you click the AI button, and it reads your whole doc and finds the clearest, most understandable way to organize it: adding headings and bullet lists, wrapping code blocks, maybe even improving how things are explained.
The app will also have some sort of quick-note system, and in the background I want the AI to go through all the notes (and maybe the docs too) and create connections between them, like the Obsidian graph.
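For the connections part, one common approach (not tied to any model mentioned in this thread, just a sketch) is to embed each note and link the pairs whose embeddings are similar enough. A minimal Python sketch, assuming the sentence-transformers package, the all-MiniLM-L6-v2 model, and an arbitrary 0.6 similarity cutoff:

```python
# Sketch: link notes whose embedding similarity passes a threshold.
# The model name and the 0.6 cutoff are placeholders to tune on real notes.
from itertools import combinations

from sentence_transformers import SentenceTransformer, util

notes = {
    "sorting.md": "Quicksort partitions around a pivot, O(n log n) on average.",
    "react-hooks.md": "useEffect runs after render; clean up subscriptions in the return.",
    "big-o.md": "Big-O notation describes how runtime grows with input size.",
}

model = SentenceTransformer("all-MiniLM-L6-v2")
titles = list(notes)
embeddings = model.encode([notes[t] for t in titles], normalize_embeddings=True)

edges = []
for (i, a), (j, b) in combinations(enumerate(titles), 2):
    score = float(util.cos_sim(embeddings[i], embeddings[j]))
    if score > 0.6:  # tune this cutoff for your notes
        edges.append((a, b, score))

print(edges)  # pairs above the cutoff become the edges of the graph view
```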
Now, I know it's possible to give the model certain instructions so that its output stays consistent. But can it also learn from the user's "context", as if it were their personal assistant?
Then, besides the embedding, try some local 1B models and see if they're enough for you. And if people don't want local models, you can still offer e.g. Gemma via an API.
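For the "AI button" itself, here's a minimal sketch of what a call to a local 1B model could look like, assuming Gemma 3 1B served through Ollama and its Python client (the model tag and the rules are placeholders). The formatting rules live in a system prompt that's re-sent on every call; the model doesn't learn the user's context on its own, so any per-user preferences have to be injected into that prompt as well:

```python
# Sketch of the "AI button": send the raw note plus formatting rules to a
# local 1B model served by Ollama. Model tag and rules are placeholders.
import ollama

FORMATTING_RULES = """You reformat programming notes.
- Keep the author's wording; only fix structure and obvious typos.
- Add headings and bullet lists where they help.
- Wrap code snippets in fenced code blocks with a language tag.
- Return Markdown only, no commentary."""

def format_note(raw_note: str, user_context: str = "") -> str:
    # Per-user "context" (preferred style, project names, etc.) is not learned
    # by the model; it has to be included in the prompt on every call.
    response = ollama.chat(
        model="gemma3:1b",
        messages=[
            {"role": "system", "content": FORMATTING_RULES + "\n" + user_context},
            {"role": "user", "content": raw_note},
        ],
    )
    return response["message"]["content"]

print(format_note("quicksort pick pivot partition then recurse left/right avg nlogn"))
```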
After all, as you’re looking for a budget-friendly option, this is probably your best bet as it’s … free 😄
Yeah, I tested Gemma 1B locally and it works pretty well. Maybe I can fine-tune it via Google AI Studio?
I'd also check how e.g. Craft is doing it. I bet there are Obsidian plugins for using models too (I remember something called Copilot).
I don’t think you need to fine-tune anything, tbh.
Hm, Craft seems very cool.
I wouldn't recommend self-hosting. It works, but hosted LLM APIs have gotten pretty cheap recently, so it's usually not worth the effort.
Google's Gemini 2.0 Flash gives you 15 requests per minute for free, which should be enough during development, and it costs $0.40 per 1M tokens, which is equivalent to roughly 8 books' worth of notes. And it does structured output well.
From your explanation it doesn't seem like you need a lot of expensive calls, since you're just sending data and getting formatted JSON output back, so the tokens used don't compound with each message the way they do in chat apps.
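A minimal sketch of what that single call per "format this doc" click could look like, assuming the google-generativeai Python package; the model name, JSON shape, and note text are placeholders rather than anything confirmed in this thread:

```python
# Sketch of the API route: one call per "format this doc" click, JSON out.
# Assumes the google-generativeai package; schema and note text are placeholders.
import json

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel(
    "gemini-2.0-flash",
    system_instruction=(
        "Reformat the programming note you receive. Respond with JSON: "
        '{"title": str, "markdown": str, "tags": [str]}'
    ),
    generation_config={"response_mime_type": "application/json"},
)

raw_note = "react useeffect cleanup return fn runs on unmount, deps array!!"
result = json.loads(model.generate_content(raw_note).text)
print(result["markdown"])
```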