HN via remix.js for vilnius.js

by surajrmal 3 days ago

You must belong to the club of folks who use hashmaps to store 100 objects. It's amazing how much we've brainwashed folks to focus on algorithms and lose sight of how to actually properly optimize code. Being aware of how your code interacts with cache is incredibly important. There are many cases of using slower algorithms to do work faster purely because it's more hardware friendly.

The reason that some more modern tools, like jj, really blow git out of the water in terms of performance is that they make good choices, such as doing a lot of transformations entirely in memory rather than via the filesystem. It's also because jj is written in a language that can execute efficiently. Luckily, it's clear that modern tools like jj are heavily inspired by Mercurial, so we're not doomed to the UX and performance git binds us to.

inejge 3 days ago

> You must belong to the club of folks who use hashmaps to store 100 objects.

Apparently I belong to the same club -- when I'm writing AWK scripts. (Arrays are hashmaps in a trenchcoat there.) Using hashmaps is not necessarily the indictment you apparently think it is, if the access pattern fits the problem and other constraints are not in play.

> It's amazing how much we've brainwashed folks to focus on algorithms and lose sight of how to actually properly optimize code. Being aware of how your code interacts with cache is incredibly important.

By the time you start worrying about cache locality you have left general algorithmic concerns far behind. Yes, it's important to recognize the problem, but for most programs, most of the time, that kind of problem simply doesn't appear.

It also doesn't pay to be dogmatic about rules, which is probably the core of your complaint, although unstated. You need to know them, and then you need to know when to break them.

jstimpfle 2 days ago

Most code most people work on isn't about algorithms at all. The most straightforward algorithm will do. Maybe put some clever data structure somewhere in the core. But for the vast majority of code, there isn't any clear algorithmic improvement, and even if there were, it wouldn't make a difference for the typically small workloads that most pieces of code are processing.

I'll take it back a little bit, because there _is_ in fact a lot of algorithmically inefficient code out there, which slows down everything a lot. But once the most obvious algorithmic problems are out of the way, even a log-n algorithm isn't much of an improvement over a linear scan if n < 1000. It's much more important to get that 100x+ speedup by implementing the algorithm in a straightforward and cache-friendly way.
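
To make the "n < 1000" point concrete, here is a minimal Rust sketch that puts a linear scan next to a binary search over the same sorted slice. The function names (`find_linear`, `find_binary`, `worst_case_steps`) are made up for illustration. It only shows that both lookups agree and that the gap in worst-case step counts is bounded for small n; it does not measure anything, and which one is actually faster on given hardware would have to be benchmarked.

```rust
/// Linear scan: up to n comparisons, but memory is read sequentially,
/// which tends to suit caches and prefetchers.
pub fn find_linear(sorted: &[u32], target: u32) -> Option<usize> {
    sorted.iter().position(|&x| x == target)
}

/// Binary search: about log2(n) comparisons, but each probe jumps to a
/// data-dependent position in the slice.
pub fn find_binary(sorted: &[u32], target: u32) -> Option<usize> {
    sorted.binary_search(&target).ok()
}

/// Worst-case number of elements examined for a slice of length n:
/// (linear scan, binary search). The binary figure is floor(log2(n)) + 1.
pub fn worst_case_steps(n: usize) -> (usize, u32) {
    let binary = if n == 0 {
        0
    } else {
        usize::BITS - n.leading_zeros()
    };
    (n, binary)
}
```

For n = 1000, `worst_case_steps` gives 1000 steps against 10, so the algorithmic advantage is capped at roughly two orders of magnitude at this size, which is the same range as the implementation-level speedup the comment describes.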

surajrmal 3 days ago

My core complaint is that folks repeat best practices without understanding them. It's simple to provide API semantics that look like a map without resorting to a hashmap. I fear Python-style development has warped people's perception for the sake of simplifying the lives of developers. And all users end up suffering as a result.
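
As a rough illustration of map-like API semantics without a hashmap, here is a minimal Rust sketch of a map backed by a contiguous `Vec` and a linear search. `FlatMap` and its methods are hypothetical names for this example, not a standard-library type, and the sketch assumes a small number of entries (on the order of the 100 objects mentioned upthread), where scanning a contiguous array is plausibly competitive with hashing.

```rust
/// A small map backed by a contiguous Vec and linear search.
/// `FlatMap` is a hypothetical type for illustration only.
#[derive(Debug)]
pub struct FlatMap<K, V> {
    entries: Vec<(K, V)>,
}

impl<K: PartialEq, V> FlatMap<K, V> {
    pub fn new() -> Self {
        FlatMap { entries: Vec::new() }
    }

    /// Inserts a pair; returns the previous value if the key was present.
    pub fn insert(&mut self, key: K, value: V) -> Option<V> {
        for (k, v) in self.entries.iter_mut() {
            if *k == key {
                return Some(std::mem::replace(v, value));
            }
        }
        self.entries.push((key, value));
        None
    }

    /// Looks a key up by scanning the entries in storage order.
    pub fn get(&self, key: &K) -> Option<&V> {
        self.entries.iter().find(|(k, _)| k == key).map(|(_, v)| v)
    }

    /// Removes a key; swap_remove avoids shifting but reorders entries.
    pub fn remove(&mut self, key: &K) -> Option<V> {
        let idx = self.entries.iter().position(|(k, _)| k == key)?;
        Some(self.entries.swap_remove(idx).1)
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Callers see the same `insert` / `get` / `remove` shape they would get from `std::collections::HashMap`, and keys only need `PartialEq` rather than `Hash + Eq`. The trade-off is that every operation is O(n), so this only makes sense while n stays small.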