Hacker News | kelindar's comments

This might be useful to some if you need a very light pub/sub inside one process.

I was building a small multiplayer game in Go. I started with a channel fan-out but (for no particular reason) wanted to see if I could do better. I put together this tiny event bus to test, and on my i7-13700K it delivers events in 10-40 ns, roughly 4-10x faster than the plain channel loop, depending on the configuration.
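
For context, the "plain channel loop" baseline might look roughly like the sketch below. This is only an illustration of channel fan-out, not the event bus itself; the `Event`, `FanOut`, `Subscribe`, `Publish`, and `Close` names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical game event used only for illustration.
type Event struct {
	Kind string
	X, Y int
}

// FanOut is a naive in-process pub/sub: every subscriber gets its own
// buffered channel, and Publish copies the event into each of them.
type FanOut struct {
	mu   sync.RWMutex
	subs []chan Event
}

// Subscribe registers a new subscriber and returns its receive channel.
func (f *FanOut) Subscribe(buffer int) <-chan Event {
	ch := make(chan Event, buffer)
	f.mu.Lock()
	f.subs = append(f.subs, ch)
	f.mu.Unlock()
	return ch
}

// Publish delivers ev to every subscriber, blocking if a buffer is full.
func (f *FanOut) Publish(ev Event) {
	f.mu.RLock()
	defer f.mu.RUnlock()
	for _, ch := range f.subs {
		ch <- ev
	}
}

// Close closes all subscriber channels so that range loops terminate.
func (f *FanOut) Close() {
	f.mu.Lock()
	defer f.mu.Unlock()
	for _, ch := range f.subs {
		close(ch)
	}
	f.subs = nil
}

func main() {
	var bus FanOut
	a := bus.Subscribe(8)
	b := bus.Subscribe(8)

	bus.Publish(Event{Kind: "move", X: 1, Y: 2})
	bus.Publish(Event{Kind: "move", X: 3, Y: 4})
	bus.Close()

	for ev := range a {
		fmt.Println("a got", ev.Kind, ev.X, ev.Y)
	}
	for ev := range b {
		fmt.Println("b got", ev.Kind, ev.X, ev.Y)
	}
}
```

Each publish here costs one channel send per subscriber plus a read lock, which is the kind of overhead a dedicated event bus can try to avoid.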


This library was created to provide an easy and efficient solution for embeddings and vector search, making it perfect for small to medium-scale projects that still need some vector search. It's built around a simple idea: if your dataset is small enough, you can achieve accurate results with brute-force techniques, and with some optimizations like SIMD, you can keep things fast and lean.


I love that you chose to wrap the C++ with purego instead of requiring CGO! I wrapped Microsoft's LightGBM library and found purego delightful. (To make deployment easier, I embed the compiled library into the Go binary and extract it to a temp directory at runtime. YMMV.)


This post led me to purego, and I've just finished moving my toy project that uses PKCS#11 libraries from cgo to it. It's so much better now! No need to jump through hoops for cross-compilation.


IME Linux and macOS users usually have a compiler available so CGO is mostly only a hassle for Windows, but on Windows this capability is built into the Go stdlib, e.g. `syscall.NewLazyDLL("msvcrt.dll").MustFindProc(...)`


Thank you for pointing out this option. Any idea why the Go stdlib doesn't offer this for Linux and macOS? I'd rather not add compiling other languages to my Go workflow.


How is the latency of calling purego bindings vs cgo? The latter seems prohibitively expensive for most of my projects.


IIRC, purego repurposes a lot of cgo machinery, so I don't think there would be much difference. For my purposes, it doesn't matter since the ML library does several seconds to minutes of work using multiple cores per call.


I haven't checked (I make maybe 10 calls per second at most). Intuitively, they should be similar.


Have you considered using HNSW instead of brute force?


Honestly, I enjoy programming in Go and have been using it on a daily basis for the last few years. Most importantly, when it comes to performance, it's often not the language that matters but how you structure your code. It's very much possible to build a terrible C++ program that thrashes memory and is very slow. And I feel like Go is actually lacking those nice data-oriented libraries.


That's the idea: a transaction commit log decoupled from the underlying durable storage allows you to build your own persistence layers. I'm still thinking of building a simple (memory-mapped?) layer, but as an optional, separate lib.


Wait no, that repo was an experiment. I'll be rebasing it and finally building a real ECS based on the columnar storage library.


It's actually possible: columns are simple Go interfaces and can be defined or redefined for specific types. You can easily build column implementations that load data from disk or even a remote server (RDBMS, S3, ..?) and retain the indexing capability.

On the flip side, you could actually fit more data in memory than with non-columnar methods: since the storage is column-by-column, it compresses very well. For example, boolean values are stored as bitmaps in this implementation, and strings could be stored in a hash map so that only one copy of each distinct string is kept in memory, even if you have millions of rows.
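
A rough sketch of both ideas, assuming nothing about the library's actual types: `BoolColumn` packs one bit per row into `uint64` words, and `StringColumn` interns strings through a hash map so each distinct value is stored once. Both type names and their methods are hypothetical.

```go
package main

import "fmt"

// BoolColumn stores one bit per row instead of one byte per bool.
type BoolColumn struct {
	bits []uint64
}

// Set writes the value for the given row, growing the bitmap as needed.
func (c *BoolColumn) Set(row int, v bool) {
	word := row / 64
	for word >= len(c.bits) {
		c.bits = append(c.bits, 0)
	}
	mask := uint64(1) << (uint(row) % 64)
	if v {
		c.bits[word] |= mask
	} else {
		c.bits[word] &^= mask
	}
}

// Get reports the value for the row; rows that were never set read as false.
func (c *BoolColumn) Get(row int) bool {
	word := row / 64
	if word >= len(c.bits) {
		return false
	}
	return c.bits[word]&(uint64(1)<<(uint(row)%64)) != 0
}

// StringColumn keeps each distinct string once and stores only a small
// integer ID per row.
type StringColumn struct {
	ids    map[string]uint32 // distinct string -> ID
	values []string          // ID -> distinct string
	rows   []uint32          // row -> ID
}

// Append adds a row, reusing the existing ID if the string was seen before.
func (c *StringColumn) Append(s string) {
	if c.ids == nil {
		c.ids = make(map[string]uint32)
	}
	id, ok := c.ids[s]
	if !ok {
		id = uint32(len(c.values))
		c.ids[s] = id
		c.values = append(c.values, s)
	}
	c.rows = append(c.rows, id)
}

// Get returns the string stored at the given row.
func (c *StringColumn) Get(row int) string { return c.values[c.rows[row]] }

// Distinct returns how many unique strings are actually held in memory.
func (c *StringColumn) Distinct() int { return len(c.values) }

func main() {
	var alive BoolColumn
	alive.Set(0, true)
	alive.Set(100, true)

	var class StringColumn
	for i := 0; i < 1000; i++ {
		class.Append([]string{"orc", "elf"}[i%2])
	}
	fmt.Println(alive.Get(100), class.Get(999), class.Distinct())
}
```

With this layout, a million boolean rows take about 122 KiB of bitmap, and a million string rows cost four bytes each plus one copy of every distinct value.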

