Hacker News | rbanffy's comments

Mercury's climate wouldn't benefit much.

In US terms it's very fast. The US lags behind other developed countries in rail, but I hope it can improve. And if it improves with electric propulsion, all the better.

In the absence of a testable definition of “thought”, I will make no judgment.

> For years I've had this issue with pretty much everything happening in China, from business to politics to culture

China is mindblowingly huge. There has to be A LOT happening at any one time.


China's emergence was inevitable - they have the numbers. The last figure I heard was 200 million people in STEM careers alone. That's more than the entire US workforce.

I expect technological development to explode and my advice is for anyone interested in it to learn Mandarin. Including myself.


> I expect technological development to explode and my advice is for anyone interested in it to learn Mandarin. Including myself.

My father said the exact same thing in the '80s, but it was Japanese.


China’s prospects are much better - like I said, they have the numbers to outperform everyone else. Japan was a completely different case.

In any case, proprietary drivers would need to be distributed separately from the Linux kernel because of the GPL license.

There are many cluster boards that accept plug-in compute modules and have an onboard switch. Such an arrangement would provide a much denser system. Making a new one, however, requires a lot of work. I'm not even sure how you do Ethernet over PCB traces.

One project I keep telling myself I'll eventually do is to make a cluster board with 32 Octavo SoMs (each with two Ethernet interfaces, CPU, GPU, RAM, and some flash), and a network switch (or two). And 32 activity LEDs on the side so a set of 16 boards will look like a Connection Machine module.


I think the idea is to allow developers to write a single implementation and have a portable binary that can run on any kind of hardware.

We do that all the time - there is plenty of code that chooses the optimal code path depending on the runtime environment or which ISA extensions are available.
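Something like this, just on the CPU side - a minimal sketch where the AVX2 path and the function names are made up for illustration, not taken from the article:

  // Runtime dispatch between an ISA-specific path and a portable fallback.
  fn sum(data: &[f32]) -> f32 {
      #[cfg(target_arch = "x86_64")]
      {
          if is_x86_feature_detected!("avx2") {
              // Only sound because we just checked the CPU supports AVX2.
              return unsafe { sum_avx2(data) };
          }
      }
      data.iter().sum() // portable fallback path
  }

  #[cfg(target_arch = "x86_64")]
  #[target_feature(enable = "avx2")]
  unsafe fn sum_avx2(data: &[f32]) -> f32 {
      // A real implementation would use std::arch intrinsics; the scalar
      // loop just keeps the sketch self-contained and compilable.
      data.iter().sum()
  }

  fn main() {
      println!("{}", sum(&[1.0, 2.0, 3.0]));
  }

The GPU portability story is essentially the same pattern one level up: detect what the hardware offers, then pick the code path.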


Without the tooling though.

Commendable effort; however, just as people forget that languages are ecosystems, they tend to forget that APIs are ecosystems as well.


Sure. The performance purist in me would be very doubtful about the result's optimality, though.

Performance purists don't use CUDA either, though (that's why DeepSeek used PTX directly).

Everything is an abstraction, and choosing the right level of abstraction for your use case is a tradeoff between your engineering capacity and your performance needs.


This Rust demo also uses PTX directly:

  During the build, build.rs uses rustc_codegen_nvvm to compile the GPU kernel to PTX.
  The resulting PTX is embedded into the CPU binary as static data.
  The host code is compiled normally.
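The embedding step is just static data in the binary. A minimal sketch of what that can look like, assuming build.rs has written the compiled kernel to $OUT_DIR/kernel.ptx (the file name and layout here are my assumptions, not taken from the demo):

  // Embed the PTX emitted at build time as a static string in the CPU binary.
  static KERNEL_PTX: &str = include_str!(concat!(env!("OUT_DIR"), "/kernel.ptx"));

  fn main() {
      // At run time the PTX text is handed to the CUDA driver API
      // (cuModuleLoadData, or a wrapper crate) to be JIT-compiled for
      // whatever GPU is actually present.
      println!("embedded {} bytes of PTX", KERNEL_PTX.len());
  }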

To be more technically correct, we compile to NVVM IR and then use NVIDIA's NVVM to convert it to PTX.

That’s not really the same thing; it compiles through PTX rather than using inline assembly.


The issue in my mind is that this doesn’t seem to include any of the critical library functionality specific to, e.g., NVIDIA cards - think reduction operations across the threads in a warp and similar. Some of those don’t exist in all hardware architectures. We may get to a point where everything could be written in one language, but actually leveraging the hardware correctly still requires a bunch of different implementations, one for each target architecture.

The fact that different hardware has different features is a good thing.


Features missing hardware support can fall back to software implementations (see the sketch below).

In any case, ideally, the level of abstraction would be higher, with little application logic requiring GPU architecture awareness.
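Purely illustrative sketch of that fallback idea - DeviceCaps and both reduction paths are made-up names, not any real API:

  // Dispatch between a hardware-accelerated path and a portable software
  // fallback, depending on what the device reports it can do.
  struct DeviceCaps {
      has_warp_reduce: bool,
  }

  fn reduce_sum(caps: &DeviceCaps, data: &[f32]) -> f32 {
      if caps.has_warp_reduce {
          // A real backend would emit the hardware's warp-shuffle reduction here.
          hardware_reduce(data)
      } else {
          // Software fallback with the same semantics, just slower.
          data.iter().sum()
      }
  }

  fn hardware_reduce(data: &[f32]) -> f32 {
      // Stand-in for a kernel launch; kept scalar so the sketch compiles anywhere.
      data.iter().sum()
  }

  fn main() {
      let caps = DeviceCaps { has_warp_reduce: false };
      println!("{}", reduce_sum(&caps, &[1.0, 2.0, 3.0]));
  }

The application only ever calls reduce_sum; the architecture awareness stays inside the abstraction layer.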


> Though this demo doesn't do so, multiple backends could be compiled into a single binary and platform-specific code paths could then be selected at runtime.

That’s kind of the goal, I’d assume: writing generic code and having it run on anything.


> writing generic code and having it run on anything.

That was already done successfully by Java applets in 1995.

Wait, Java applets were dead by 2005, which leads me to assume that the goal is different.


> That has been already done successfully by Java applets in 1995.

The first video card with a programmable pixel shader was the Nvidia GeForce 3, released in 2001. How would Java applets be running on GPUs in 1995?

Besides, Java cannot even be compiled for GPUs as far as I know.


Servers sitting idle is a strange concept. Ideally those resources should be powered down and workloads should be consolidated until the machines reach an optimal level of utilization.

Wouldn’t disagree, but have I got bad news for you.

Global utilisation rates are very low, around 5-15%.

I guess the reality is that solving that would require a practically impossible level of coordination across the industry.

https://sardinasystemsblog.medium.com/how-can-an-enterprise-...


If you want readily available resources, you can't power down servers when they're not being utilized.

That's not to say you shouldn't make attempts to conserve power with better performance scheduling on the CPU.


> you can't power down servers when they're not being utilized.

You can’t boot a full OS in seconds, but you can boot a thin hypervisor and have compute resources immediately available. The same applies to hard disk drives that can be spun down, or flash devices that can be unpowered when not needed.

