> the compiler had enough static information to emit a single arraylength bytecode instruction.
I'm skeptical.
If it can prove that the input actually matches the hint, then why does it need the hint?
If it can't, what happens at runtime if the input is something else?
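For what it's worth, here is a rough Java sketch of what I would expect a hint like this to compile down to. It assumes the hint becomes a `checkcast` followed by `arraylength`, which is how a plain Java cast behaves; the blog post doesn't confirm that. The class and method names are made up for illustration. If that assumption holds, a wrong hint isn't undefined behaviour; it fails at the cast with a `ClassCastException`.

```java
import java.lang.reflect.Array;

// Hypothetical names; this is an illustration, not the blog post's code.
class ArrayLengthSketch {
    // Unhinted path: the static type is Object, so the length has to be
    // looked up reflectively at runtime.
    static int reflectiveLength(Object maybeArray) {
        // Throws IllegalArgumentException if the argument is not an array.
        return Array.getLength(maybeArray);
    }

    // Hinted path (assumption: the hint behaves like a Java cast).
    static int hintedLength(Object maybeArray) {
        long[] arr = (long[]) maybeArray; // checkcast: ClassCastException on mismatch
        return arr.length;                // arraylength
    }

    public static void main(String[] args) {
        Object input = new long[] {1L, 2L, 3L};
        System.out.println(reflectiveLength(input));
        System.out.println(hintedLength(input));
        try {
            hintedLength("not an array");
        } catch (ClassCastException e) {
            System.out.println("hint was wrong: ClassCastException");
        }
    }
}
```

So the hint wouldn't be a proof, just a promise that gets checked with a cheap cast at runtime.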
> We replaced a complex chain of method calls with one CPU instruction.
JVM bytecodes and CPU instructions really shouldn't be conflated like that, although I assume this was just being a bit casual with the prose.
Wild speculation: Could the extra speedup be due to some kind of JIT hot-path optimisation that the previous reflective, non-inlinable call prevented, and which the new use of the single `arraylength` bytecode enabled? E.g. in production maybe you're seeing the hot path hit a JIT threshold for more aggressive inlining of the parent function, or loop unrolling, or similar, which might not be triggered in your test environment (and which is impossible when inlining is prevented)?
Author of the blog post here. That explanation sounds very plausible to me!
If the whole enclosing function became inlinable after the reflective call path disappeared, that could explain why the end-to-end speedup under load was even larger than the isolated microbench.
I admit that I don't understand the JIT optimization deeply enough to say that confidently... as I mentioned in the blog post, I was quite flummoxed by the results. I’d genuinely love to learn more.