software cost estimating
That article does not understand software economics. COCOMO is used as a very rough estimate for early development, when most of the work is cookie-cutter code. It is also generally disputed and not viewed as accurate for projecting real effort. For that matter, LOC is not a particularly good metric for assessing a code base: not for assessing concepts of "maintainability" or "cost" or "defect rate" or any other metric of code health. It is often used as a base starting point (i.e., given everything else being identical, a code base 2x as large will cost roughly 2x as much), but all the other factors are often considered more important. "All the other factors" include things like modularity, code complexity (many methods of measurement), token count (similar to line count, but not just measuring lines; i.e., some lines have a LOT of tokens, while other lines have only one token), and more.
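To make the token-count point concrete, here's a minimal sketch (my own illustration, not any vendor's metric) using Python's stdlib tokenize module -- one dense line can carry 20+ tokens while two sparse lines carry only three:

```python
import io
import tokenize

def loc_vs_tokens(source: str) -> tuple[int, int]:
    """Compare non-blank line count with meaningful token count."""
    lines = [ln for ln in source.splitlines() if ln.strip()]
    # Drop pure-layout tokens so only "real" tokens are counted.
    layout = (tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
              tokenize.DEDENT, tokenize.ENDMARKER)
    tokens = [tok for tok in tokenize.generate_tokens(io.StringIO(source).readline)
              if tok.type not in layout]
    return len(lines), len(tokens)

dense = "result = [f(x) for x in data if g(x) and x > threshold]\n"
sparse = "else:\n    pass\n"

print(loc_vs_tokens(dense))   # one line, lots of tokens
print(loc_vs_tokens(sparse))  # two lines, very few tokens
```

Two code bases with identical LOC can differ wildly on this count, which is part of why LOC alone is such a weak proxy.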
I used to work for a company that had many patents around concepts of code complexity. We were one of many companies with a "cost estimator" tool, too. And we all knew (and openly admitted) that our estimator (just like all others) was very, very, very rough. I.e., cost could easily be 50% of our estimate or 2x our estimate -- the statistical deviation was wide. Our statistical relevance was so-so at best. Same with other models. Finance people still loved the models because you have to assess cost projections somehow, even when you know the estimates are widely flawed.
Side note: this is one of the many "holy grail" targets. If you can make an accurate (good luck) cost projection tool to help organizations identify "cost to complete a software project," then you have a product you can sell for $$. Just don't step into the ring assuming that nobody has tried. There are literally dozens of companies shilling various models, and they all sorta suck. Very wide margins-of-error (statistical deviations). As of a couple years ago, there was not a product in existence that was substantially more accurate than quality engineering managers giving a "guesstimate" based on decades of experience. Some companies sell massive reports that combine all of the current estimation methods (we sold our estimates to a couple prominent report-generation companies, like Black Duck), but there is no single "this number is pretty good" solution. At least not as of ~2 years ago, the last time I was working in that field.
thank you
I am well aware that people will come in here saying:
Well my org uses ABC-XYZ Method, and it's been great for us.
Super cool. I am still confident that ABC-XYZ Method, when used on tens of thousands of code bases, will come up as "meh" at best. I say that because we ran all of the methods with open math against 16,000 code bases and saw widely varying results. And we had the real cost results (from clients) for ~1k code bases. None of the models were consistently accurate.
So would you have a guesstimate for how much LLVM cost?
I guess one closing comment:
Black Duck's "just get all the methods into one report" was probably the best. It just gives Mergers & Acquisitions (M&A) teams a TON of data to consider. The M&A teams can decide what they feel is the best cost estimate for the software they're acquiring.
But really: get some senior managers who've been through stuff and can give "A Grizzled Vet's Estimate."
oh hell no
Way out of my expertise. I was part of the team that was building the estimation software, but we were implementing the math from "white paper" into "code." And scraping for code bases. And running tests. That doesn't mean I know the stuff; I couldn't fully invent or even explain the best actual software estimation models, let alone try to assess cost on a project.
But Chris might have perspective on real costs. Maybe the team kept time & resource logs -- both human and infrastructure. I don't know.
I guess given the wide deviations we saw in our lab experiments, and that COCOMO estimate of $500m, I'd hazard a wide-ranging guess of:
Anywhere from $250m (1/2) to $1b (2x).
my cheap cop-out answer
oh, another HUGE complicating factor:
How bullet-proof must this software be?
I.e., building a 100k-LOC video game is much cheaper per LOC than building a 100k-LOC military aircraft controller -- lives are at stake, the software is classified, testing is rigorous, government regulations must be followed, etc., etc.
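For a rough sense of how "bullet-proofness" shows up even in a crude model: basic COCOMO's published constants make the same 100 KLOC roughly 3x the effort in "embedded" mode versus "organic" mode. Mapping a game to "organic" and avionics to "embedded" is my own gloss, and this is exactly the kind of very-rough model being criticized above:

```python
def cocomo_basic(kloc: float, mode: str) -> float:
    """Basic-COCOMO effort in person-months: a * KLOC ** b.

    (a, b) pairs are Boehm's published constants for the basic model.
    """
    a, b = {
        "organic": (2.4, 1.05),        # small team, familiar problem (e.g. a game)
        "semi-detached": (3.0, 1.12),  # in between
        "embedded": (3.6, 1.20),       # tight constraints (e.g. avionics)
    }[mode]
    return a * kloc ** b

for mode in ("organic", "semi-detached", "embedded"):
    print(f"{mode:>13}: {cocomo_basic(100, mode):7.0f} person-months")
```

Same line count, wildly different effort -- and that's before any of the real-world factors the model can't see.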
The GHC estimate forgets that almost all of those devs are PhDs doing the development on grant funding.
This makes me realize how valuable LLVM was, and at the same time how cheap it is given that it runs almost all silicon on earth.
I think the article was playing the "what if Microsoft had made it internally" type of analysis game.
Have you seen some of the stuff in GHC?
That was almost an internet meme in the 90s, like with Linux and the rise of FOSS.
There are Haskell features from the 90s that were considered new and confusing when Scala tried to introduce them to a general audience decades later.
I don't recall the acronym "GHC" off-hand.
Haskell
oh! The Haskell thing. Yeah. I saw that one.
It's the Haskell compiler.
I don't know much about it.
but I know it has some cost and complexity estimators included, right?
Haskell is where all of the programming language nerds with Math or CS PhDs do development.
But it's also really 100% Haskell-World, so most of our business model just wasn't in that space.
Having anyone else try to replicate that compiler would be massively expensive.
Most of our target was "stuff that needs total rebuild" -- Cobol, Java, Perl/Py/Rb/JS/TS, and others. So we'd try to give companies an estimate on "what will it really cost you to rebuild your stuff in a different language."
Really, Java?
Running from memory costs?
But our main product wasn't so much about $$ cost (tough to estimate; we gave ours as a freebie add-on) -- our main product was more about refactoring code along the way. I.e., so long as you're shifting from Cobol to X-Language (whatever one), then how do you assess code maintainability and make future migration easier.
and slow. Java is sooooo slow (for a compiled language).
Once the JIT gets going it can get pretty close to C perf if you're not horribly abusing it.
my Java buddies always tell me "Java is super fast. I took this Python script and rewrote it...."
-- Lemme stop you there.
I'm a C/C++/Zig/Rust person. Java is the level directly up from that due to the giant amount of money in that JIT optimizer.
Now, you can't write "Clean Code" (tm), since that is actively harmful to performance.
I was C/C++, but gladly moved closer to pure data work -- a lot of Python. And I've played with Rust (I admire that language a ton). I use Go a lot too, but it seems to be stalling out a bit (dunno, just a personal feeling).
All I know is we had a significant chunk of customers seeking estimates on Java rebuilds.
Go for me came down to feeling like the language was for people who didn't want to learn new things.
See the current "iterators are too functional" debate.
That's why I liked it. Like sometimes I just wanted "fast Python" without making custom C/C++ py modules.
And now we know my obsession with Mojo, even if I've been distracted from it for the last month.
hope you realize I'm mostly joking with the "that's why I liked it" quip -- I love learning new stuff, but clients don't always love it
Clients don't like new things, I agree.
I've spent the last several weeks convincing a client I'm not a snake oil salesman because I can get over 500k messages per second into a server.
"Your phone company uses this library" wasn't helpful.
I cannot begin to express how excited I am to change my business model to "I help peeps convert from Python to Mojo" -- that's some consulting engagements I think would be really fun. To include the coaching/handoff where you ensure the org's tech-team can handle the new code and you don't become a maintenance engineer by proxy.
I'm torn on whether python people will be willing to use Mojo. There still seems to be a debate about whether types are good (they are), and they will add compile times.
yeah, introducing new libs can be scary. And I do get that the client just wants to control their tech stack -- any additions (including libs) introduces risk, cost, etc. But it's difficult to get them to shift.
I'm concerned the notebook version of mojo will slow down over time and become unusable for the things I do usually use python for (data analysis).
oh, facts. But if we can get Mojo to truly be a super-set of Python, then I can convert people to The Mojo Way in a dark, sinister, and passive-aggressive way:
I'm able to say "These companies are paying money to support the project, and it's a Linux foundation project, and it's already in your distro's repos" and still have issues:
then, you start sprinkling in some types -- get a bit faster. Coach & educate about the side-benefits of types (we all sing in unison: safety, compile-time checks, ...). Next, make a few def-to-fn changes...
Then it might work, hopefully.
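That gradual path, sketched in plain Python (annotations stand in for Mojo's types here, since Mojo's fn form isn't valid Python -- the Mojo step appears only as a comment):

```python
# Step 0: untyped, pure Python -- and also valid Mojo, if Mojo is a superset.
def dot(xs, ys):
    return sum(x * y for x, y in zip(xs, ys))

# Step 1: sprinkle in types. Still valid Python (checked by tooling, ignored
# at runtime); a compiler like Mojo's could use them to specialize the code.
def dot_typed(xs: list[float], ys: list[float]) -> float:
    total: float = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total

# Step 2 (Mojo-only, so shown as a comment -- not valid Python):
# fn dot_fn(xs: List[Float64], ys: List[Float64]) -> Float64: ...

print(dot([1.0, 2.0], [3.0, 4.0]))        # 11.0
print(dot_typed([1.0, 2.0], [3.0, 4.0]))  # 11.0
```

Each step keeps the code running, which is the whole point of the dark, sinister, passive-aggressive migration.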
could take a few years though -- superset of python, that is
I mean, realistically, it could be a while till it's just
s/py/mojo/
Considering C++ can't maintain being a superset of C, longer than that.
but I'm patient. I'll play the long game.
restrict, #embed, etc.
How restrict isn't in yet baffles me.
if we can even get "superset-ish" then I'm beyond happy š
Honestly, just making a sane package ecosystem would be great.
I decided during my last month away from Mojo that I want to help nibble around the edges of implementing all the pythonic stuff.
If compiling from source is more reasonable then most of python's issues go away.
I'm waiting for C interop, then I'll go port liburing and try to push that as the default way of doing io.
I saw Jack chatting about pixi. I'd be pretty happy if Modular worked with the pixi team to get pixi to be Mojo's manager. It's highly inspired by Cargo, which I think is pretty much the gold standard of package managers that I've dealt with.
Cargo with the ability to specify more hardware targets would be nice.
re this:
Over the last month, I've still talked about Mojo with clients and friends. They all have the same basic "but I couldn't do list comprehensions / other python features" as a reason they won't pursue it more. And I think winning over more Py peeps will really gain momentum for the language. So, gonna work it a bit.
For instance, specifying CPU and GPU targets
100% agree. So much agreement.
All the agreement.
I think I can help the most with providing networking that blows everything else away.
I can port that RPC library that does 500k RPS (single core) to Mojo.
I haven't done much GPU stuff lately (other than basic
s/pandas/cudf/
stuff), but I've done some WASM stuff. So the same can be said across a wide range of compile targets.
oh! I lost track of time. Alarm just went off -- I have to jet for a call in 1 min.
Bye
quick addition:
It really leaves me excited to see so many smart peeps contributing cool stuff to Mojo. I've done a few cool things in my past, but in very specific domains. And very not-applicable-to-general-world. I feel like my contribution surface is rather limited. But that doesn't mean I can't get all pumped up when I see folks like you doing such cool stuff. ❤️
| not-applicable-to-general-world
Think about how most servers react to being fed 500k rps. The response from most engineers to that level of load is to declare a security incident. I'll do my best, but I'll probably need to throttle it down a bit.
feature flags:
and a quarter gigabyte of libraries...
oof
userspace drivers are fun like that.
Bunny lore has been divulged
I also noticed that it only used the line count at the most recent point in time, failing to take into account any refactoring over the years or additions/deletions in the past
Reminded me of my game dev days. We had multiple city builder games developed at the company. My team built a game with just ~12K LOC of production code and ~36K LOC of test code in one year. The other team had ~100K LOC of production code and no tests in 2-3 years. Just looking at the LOC is like buying software by the byte. 2GB = $2M.