Cheyenne Super Computer Auction

https://gsaauctions.gov/auctions/preview/282996

261 points by zrules on 2024-04-29 | 155 comments

Archive links

Comments

mchannon on 2024-04-29

I once bought a far larger supercomputer. It was 1/8 (roughly) of ASCI Blue Mountain. 72 racks. Commissioned in 1998 as #1 or #2 on the TOP500, officially decommissioned in 2004, purchased my 1/8 for $7k in ~2005.

Moving 72 racks was NOT easy. After paying substantial storage fees, I rented a 1500sf warehouse after selling off a few of them and they filled it up. Took a while to get 220V/30A service in there to run just one of them for testing purposes. Installing IRIX was 10x worse than any other OS. Imagine 8 CD's and you had to put them each in 2x during the process. Luckily somebody listed a set on eBay. SGI was either already defunct or just very unfriendly to second hand owners like myself.

The racks ran SGI Origin 2000s with CRAYlink interlinks. Sold 'em off 1-8 at a time, mainly to render farms. Toy Story had been made on similar hardware. The original NFL broadcasts with that magic yellow first down line were synthesized with similar hardware. One customer did the opening credits for a movie with one of my units.

I remember still having half of them around when Bitcoin first came out. It never occurred to me to try to mine with them, though I suspect if I'd been able to provide sufficient electrical service for the remainder, Satoshi and I would've been neck-and-neck for number of bitcoins in our respective wallets.

The whole exercise was probably worthwhile. I learned a lot, even if it does feel like seven lifetimes ago.

bri3d on 2024-04-29

Wow, that's ridiculous. I bought two racks of Origin2000 with a friend in high school and that was enough logistic overhead for me! I can't imagine 72 racks!!

Installing IRIX doesn't require CDs; it's much, much easier done over the network. Back in the day it required some gymnastics to set up with a non-IRIX host, now Reanimator and LOVE exist to make IRIX net install easy. There are huge SGI-fan forums still active with a wealth of hardware and software knowledge - SGIUG and SGInet managed to take over from nekochan when it went defunct a few years ago.

I have two Origin 350s with 1Ghz R16ks (the last and fastest of the SGI big-MIPS CPUs) which I shoehorned V12 graphics into for a sort of rack-Tezro. I boot them up every so often to mess with video editing stuff - Smoke/Flame/Fire/Inferno and the old IRIX builds of Final Cut.

I think that by the time Bitcoin came out, Origin2000s would have been pretty majorly outgunned for Bitcoin mining or any kind of compute task. They were interesting machines but weren't even particularly fast compared to their contemporaries; the places they differentiated were big OpenGL hardware (InfiniteReality) with a lot of texture memory (for large-scale rendering and visualization) and single-system-image multiprocessor computing (NUMAlink), neither of which would help for coin mining.

lightedman on 2024-04-30

"The original NFL broadcasts with that magic yellow first down line were synthesized with similar hardware"

That was originally an XFL innovation.

strictnein on 2024-04-30

I'm pretty sure it was ESPN and the NFL? The creator of the tech seems to indicate that and 1998 predates the first go around with the XFL.

https://www.sri.com/press/story/the-first-down-marker-how-te...

dataflow360 on 2024-04-30

The company with the original tech was called Princeton Video Imaging, in NJ.

I interned for them one summer in... 1996?

They had sold their tech to Padres and Giants broadcasts initially to put ads on the outfield.

The first down line came a bit later and was a much better use of their tech.

Princeton Video Imaging even got a shout out (1-line) at the end of every NFL game for a while.

emchammer on 2024-04-30

This sounds a lot like the system which drew the halo and trails on the hockey puck for NHL games, FoxTrax.

dunham on 2024-04-30

Yeah, it's been a while, but I recall hockey getting it first, people not liking it, and then it working out for the football version.

ewams on 2024-04-30

Really cool, do you have pictures or a blog post about this?

mchannon on 2024-04-30

Email me and I’ll send a photo.

Never been much of a blogger. This was from a bygone era, before I (or anyone I knew) had a smartphone.

Isamu on 2024-04-29

>Components of the Cheyenne Supercomputer

Installed Configuration: SGI ICE™ XA.

E-Cells: 14 units weighing 1500 lbs. each.

E-Racks: 28 units, all water-cooled

Nodes: 4,032 dual socket units configured as quad-node blades

Processors: 8,064 units of E5-2697v4 (18-core, 2.3 GHz base frequency, Turbo up to 3.6GHz, 145W TDP)

Total Cores: 145,152

Memory: DDR4-2400 ECC single-rank, 64 GB per node, with 3 High Memory E-Cells having 128GB per node, totaling 313,344 GB

Topology: EDR Enhanced Hypercube

IB Switches: 224 units

Moving this system necessitates the engagement of a professional moving company. Please note the four (4) attached documents detailing the facility requirements and specifications will be provided. Due to their considerable weight, the racks require experienced movers equipped with proper Professional Protection Equipment (PPE) to ensure safe handling. The purchaser assumes responsibility for transferring the racks from the facility onto trucks using their equipment.
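
A quick sanity check of those totals, assuming the 4,032 nodes are spread evenly across the 14 E-Cells (288 per E-Cell; the listing only gives the aggregate figures):

    # Core and memory totals implied by the listing above.
    nodes = 4_032
    sockets_per_node = 2
    cores_per_cpu = 18                              # E5-2697 v4
    print(nodes * sockets_per_node)                 # 8,064 CPUs
    print(nodes * sockets_per_node * cores_per_cpu) # 145,152 cores

    nodes_per_ecell = nodes // 14                   # 288, assuming an even split
    standard = 11 * nodes_per_ecell * 64            # GB in the 11 standard E-Cells
    high_mem = 3 * nodes_per_ecell * 128            # GB in the 3 high-memory E-Cells
    print(standard + high_mem)                      # 313,344 GB, matching the listing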

mk_stjames on 2024-04-29

Given that the individual nodes are just x86_64 Xeons and run linux... it would be interesting to part it out for sale as individual, but functional, nodes to people. There are a lot of people who would like to have a ~2016 era watercooled 1U server from a supercomputer that was once near the top of the Top500 just to show to people.

Get little commemorative plaques for each one and sell for $200 each or so.

edit: it seems each motherboard is a dual CPU board and so there are 4032 nodes, but the nodes are in blades that likely need their rack for power. But I think individual cabinets would be cool to own.

There are 144 nodes per cabinet... so 28 cabinets. I'd pay a fair amount just to own a cabinet to stick in my garage if I was near there.

electroly on 2024-04-29

The individual servers are not watercooled. The compute racks are air-cooled; the adjacent cooling racks then exchange that heat using the building's chilled water. It's the rack as a whole that is watercooled. If you extract a single node, you won't get any of that. As the other commenters also point out, these are blades; you can't run an individual node by itself.

w-ll on 2024-04-30

World of Warcraft sold decommissioned blades for about that much with no intention of them actually being used. Just something to throw up in the cave.

chasil on 2024-04-29

These are blades, so there is probably some kind of container chassis required to run them.

Using them as desktop PCs would likely be a challenge.

fnord77 on 2024-04-29

I don't think there's that big of a market for obsolete server pieces as nostalgia...

But you could probably make a decent profit on just the CPUs alone parted out, even with the moving/handling costs.

RajT88 on 2024-04-30

Going off one listing for an E5-2697v4, $50 with free shipping, 386 already sold.

If you figure after the double-dipping of eBay/Paypal and then shipping fees, that's ~$30 profit per CPU.

8,064 x 30 = 241,920 USD. Not too shabby for what's got to be some weeks/months of work. You could probably assume that they can sell or scrap the rest of it for a bit more as well, minus the fees for storage and moving company.
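
Spelled out (the $50 listing price is the one observed above; the ~$30 net after fees and shipping is an estimate, not a quote):

    cpus = 8_064
    gross_each = 50              # observed eBay price per E5-2697 v4
    net_each = 30                # estimated net after eBay/PayPal fees and shipping
    print(cpus * gross_each)     # 403,200 gross
    print(cpus * net_each)       # 241,920 net, before storage and moving costs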

atlgator on 2024-04-30

I was thinking the same thing. Worst case just sell off the CPUs and RAM.

slavik81 on 2024-04-30

I have a couple servers with this exact CPU that I run for a mixture of practical and sentimental reasons. I bought them off eBay and only after purchase discovered they were a piece of history. They have a second life testing GPU libraries for Debian from a rack in my basement.

For privacy reasons, I won't say who originally owned the servers, but they had a cool custom paint job and were labelled YETI1 and YETI2. If the original owner is on HN, perhaps they will recognize the machines and provide more information.

https://slerp.xyz/img/misc/argo-lyra.jpg https://slerp.xyz/img/misc/argo-lyra-open.jpg

CalRobert on 2024-04-29

Is it not "Personal" protective equipment?

https://en.wikipedia.org/wiki/Personal_protective_equipment

op00to on 2024-04-29

You need PPE to protect your profession of moving stuff.

chasil on 2024-04-29

I can find a bunch of the E5-2697v4 CPUs on eBay in the $30-40 range.

I wonder if there is a market for the SGI hardware.

michaelt on 2024-04-29

So getting 8,064 of them for $3,085 - 38 cents per CPU - is great value for money!

jazzyjackson on 2024-04-29

this is basically "free grand piano" - not so free once you hire the movers and tuners

dylan604 on 2024-04-29

At least a piano doesn't require power and cooling to operate.

zdragnar on 2024-04-30

Not with that attitude...

kube-system on 2024-04-29

Dump 8,064 old processors on eBay and you'll probably introduce some downwards price pressure.

jeremyjh on 2024-04-29

That’s just the current bid and it hasn’t met the reserve.

slavik81 on 2024-04-30

The reserve price for the auction is $100k.

Locutus_ on 2024-04-29

There is, but really only for the MIPS hardware.

araes on 2024-04-30

If somebody has the money, and the resources required to house the system, it seems like decent value for the money at the reserve (apparently $100k). It's in the range of: I'd buy it if I had the money and a valid use case, even for partial resale like suggested. There's an argument about being behind the "most" advanced, yet it's a petascale server from 2016 that was good enough for the NSF. It's not exactly an old dog.

The stuff I could quickly find was $1M+ in individual sales (not sure about auction sites and used).

DDR4 RAM $734,500 @ $75 / 32 GB for 313,344 GB

CPUs $484,000 @ $60 / (1) E5-2697v4 for 8,064 units

Figure the hard drives must also be enormous, and probably a huge amount of storage, just not sure on the quantity, and they may not be sold with the system.

They're not kidding on the moving company; that's tens of tons of computer to haul somewhere. 26 1U Servers @ 2500 lbs each. A couple of semi-trucks, since the most they can usually do is 44,000 lbs on US highways. Not sure if the E-Cell weight was 14 @ 1500 lbs in addition to the 26 @ 2500 lbs?
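
The same back-of-envelope numbers, using the listing's counts (the per-unit prices are rough guesses, and the weight uses the listing's 28 E-Racks rather than 26):

    ram_gb = 313_344
    print(ram_gb // 32 * 75)         # 9,792 x 32 GB sticks at ~$75 -> ~$734,400

    cpus = 8_064
    print(cpus * 60)                 # at ~$60 each -> ~$483,840

    # Weight: 28 E-Racks at ~2,500 lbs plus 14 E-Cells at 1,500 lbs.
    lbs = 28 * 2_500 + 14 * 1_500
    print(lbs, lbs / 2_000)          # 91,000 lbs, ~45.5 US tons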

araes on 2024-05-13

Apparently it ended up selling for the cost of the CPUs at $480,000. Also, guess my math was low, and it was a 47 ton computer.

https://cowboystatedaily.com/2024/05/07/cheyenne-supercomput...

unnouinceput on 2024-04-29

>...totaling 313,344 GB

Can you imagine the RAMDisk? Yes, you can. Especially in 20 years when it will be the norm. And also the Windows version that will require half of it in order to run /s

christkv on 2024-04-29

Does it come with a portable nuclear reactor to power it?

Workaccount2 on 2024-04-29

For those curious, Cheyenne is a supercomputer from 2016/2017 that launched in the 20th spot on the TOP500 list of supercomputers. It was decommissioned in 2023 after the pandemic led to a two-year operational extension.

It has a peak compute of 5.34 petaflops, 313TB of memory, and gobbles 1.7MW.
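
From those two figures the efficiency works out to roughly 3 GFLOPS per watt, which lines up with the "3 billion calculations per second for every watt" quoted from Wikipedia elsewhere in the thread:

    peak_flops = 5.34e15                # 5.34 petaflops, FP64 peak
    power_w = 1.7e6                     # 1.7 MW
    print(peak_flops / power_w / 1e9)   # ~3.1 GFLOPS per watt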

observationist on 2024-04-29

In comparison, 18 A100 GPUs would have 5.6 petaflops and 1.4 TB vram, consuming 5.6 kw.

The speed of processing and interconnect is orders of magnitude faster for an A100 cluster - one 8-GPU pod server will cost around $200k, so around $600k more or less beats the supercomputer performance (the prices I'm seeing seem wildly variable, please correct me if I'm wrong).

mk_stjames on 2024-04-29

The Cheyenne numbers are 5.34 petaflops of *FP64*.

The 5.6PF you quote for 18 A100's would be in BF16. Not comparable.

The A100 can only do 9.746 TFLOPS in FP64.

So you would need 548 A100's to match the FP64 performance of the Cheyenne.
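
Working that correction through (the 9.746 and 19.5 TFLOPS figures are the A100's FP64 vector and FP64 tensor-core peaks):

    import math

    cheyenne_tflops = 5_340             # 5.34 PFLOPS FP64 peak
    a100_fp64 = 9.746                   # A100 FP64, non-tensor
    a100_fp64_tensor = 19.5             # A100 FP64 via tensor cores

    print(math.ceil(cheyenne_tflops / a100_fp64))          # 548
    print(math.ceil(cheyenne_tflops / a100_fp64_tensor))   # 274, near the ~267 estimate below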

observationist on 2024-04-29

Thanks, glad you guys caught that - could be generous and allow the tensor core TFLOPS, since you'd more than likely be using A100 pods for something CUDA-optimized, in which case at 19.5 TFLOPS FP64 peak per GPU, roughly 267 would be needed, or 34 pods, at $6.8 million, with 21.76 TB of VRAM and 81 kW power consumption.

Double those for raw fp64.

latchkey on 2024-04-29

AMD MI300x is 163.4 TFLOPS in FP64.

33 of them, which would also have 6,336 GB of memory.

I'll have way more than that in my next purchase order.

It is really fun to build a super computer.
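
Checking that with the 163.4 TFLOPS FP64 figure above and 192 GB of HBM3 per MI300X (so the memory total is in GB, not TB):

    import math

    cheyenne_tflops = 5_340
    mi300x_fp64 = 163.4                 # FP64 matrix peak quoted above
    mi300x_hbm_gb = 192                 # HBM3 per card

    n = math.ceil(cheyenne_tflops / mi300x_fp64)
    print(n)                            # 33
    print(n * mi300x_hbm_gb)            # 6,336 GB across the 33 cards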

mk_stjames on 2024-04-29

I'm an amateur, but I have code that I think could probably dispatch threads pretty efficiently on Cheyenne through its management system, simply because it's all distributed Xeons. If I can run it on my personal 80-core cluster, I could have gotten it to run on Cheyenne back then.

But hitting the roofline on those AMD GPGPU's? I'd probably get nowhere fucking close.

That is the thing that Cheyenne was built for. People doing CFD research with x86 code that was already nicely parallelized via OpenMPI or whathaveyou.

latchkey on 2024-04-29

It is wild how much compute has grown.

I put dual Epyc 9754 into my first box of MI300x.

That's 256 cores + 8x MI300x, in a single box.

Agreed, it is a great solution for CFD, which is definitely one workload I'd love to host.

dekhn on 2024-04-29

I used to build small clusters and use supercomputers and I can't imagine it's fun to build a super computer. It requires a massive infrastructure and significant employee base, and individual component failures can take down entire jobs. Finding enough jobs to keep the system loaded 24/7 while also keeping the interconnect (which was 15-20% of the total system cost) busy, and finding the folks who can write such jobs, is not easy. Even then, other systems will be constantly nipping at your heels with newer/cheaper/smaller/faster/cooler hardware.

latchkey on 2024-04-29

Thanks for the feedback. You make a lot of good points. I've built a 150,000 GPU system previously, but it was lower end hardware. It was a lot of fun to make it run smoothly with its own challenges.

It doesn't take a lot of employees; we did the above with essentially two technical people. Those same two are working on this business.

Finding workloads/jobs is definitely going to be an interesting adventure, that said, the need for compute isn't going away. By offering hard to get hardware at reasonable rates and contract lengths, I believe we are in a good position on that front, but time will tell.

We are only buying the best of the best that we can get today. The plan is to continuously cycle out older hardware as well as not pick sides on one over another. This should help us keep pace with other systems.

dekhn on 2024-04-30

150K GPU with two people... presumably, 8 GPU/host, you had close to 20K servers.

I can't really see how that's achievable with only two people, given the time to install hardware, maintain it, deal with outages and planned maintenance and testing, etc. Note: I worked at Google and interfaced with hwops so I have some real-world experience to compare to.

Building a 150K GPU system without a well-understood customer base seems a bit crazy to me. You will either become a hyperscaler, serve a niche, or go out of business, I fear.

latchkey on 2024-04-30

7 separate data centers all around the US.

12 GPU/host. 130,000 of that kind. ~10,833 hosts.

The ASRock BC-250's we deployed were 12 individual blades and those were all PXE booted. We deployed 20,000 of those blades across 2 data centers. This was a massive feat of engineering, especially during covid where I couldn't even access the machine directly. Built a whole dashboard to monitor it all too.

I know, I can't believe we did it either, but we did. Software automation was king. I built a single binary that ran on each individual host and knew how to self configure / optimize everything. Idempotently. Even distributing upgrades to the binary was a neat challenge that I solved perfectly, in very creative ways.

Today, we are starting much smaller. Literally from zero/scratch. Given the cost of MI300x, I doubt we will ever get to 150k GPUs, that's an absurd amount of money, but who knows.

dekhn on 2024-04-30

But who did the wiring? Even with blades which consolidate much of the cabling, there's still a tremendous amount of work to build the interconnect. On typical large systems I've seen a small team of 3-5 guys working weeks+ to wire a modest DC.

latchkey on 2024-04-30

We'd hire the initial deployment out to temporary contractors. It just took a few weeks to get a large deployment out. The hard part was the 12 GPUs needed to be inserted at the DC, which took a bunch of effort. Once it was done we generally had 1-2 people on the ground in the data centers to deal with breakfixes. Either contractors or supplied by the DC.

For this venture, again, we are starting small, so we are just flying to the DC and doing it ourselves. There are also staff there that are technical enough to swap stuff out when we need it. The plan will be to just hire one of their staff as our own.

I don't think we will make it for this next deployment due to time constraints, but ideally in our near future, we will go full L11. Assemble and ship out full racks at the manufacturer/VAR, bolt em down, wire them up and ready to go. That is my dream... we will see if we get there. L11 is hard cause a single missing cable can hold up an entire shipment.

dekhn on 2024-05-02

I just realized we had this same conversation on HN before. IIRC I said last time and I'll repeat: if you say that you set up 15K GPUs with 2 people, and I ask who did the wiring, and you say an external company came in and spent a few weeks wiring the network for you, then you can't say that 2 people set up 15K GPUs. You're trying to externalize real costs (both time and money).

I understand your dream (having pursued similar ideas) but I think you have to be realistic about the effort required, especially when you add picky customers to the mix.

latchkey on 2024-05-02

Now you're nit picking two days later, which is fine. Sure, you got me there!

Two people hired some temporary workers and asked them to perform a task to get us up and running, which lasted a few days, out of years of operation.

¯\_(ツ)_/¯

nickpsecurity on 2024-04-29

Also, supercomputers usually use general-purpose nodes supported by many standard tools, multiple methods of parallelization, and (for open standards) maybe multi-vendor. I imagine this one is much more flexible than A100’s.

jjtheblunt on 2024-04-29

also, comparing SIMD with cheyenne is misleading

martinpw on 2024-04-29

The supercomputer flops are FP64. The A100 stats you are using are FP16.

jeffbee on 2024-04-29

It's fine. We will simply run weather forecast in BF16 mode and hallucinate the weather.

dgacmu on 2024-04-29

Introducing our next supercomputer, Peyote.

adgjlsfhk1 on 2024-04-29

weather forecasting is actually moving to reduced precision. none of the input data is known to more than a few digits, and it's a chaotic system so the numerical error is usually dominated by the modeling and spatial discretization error
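
For a sense of scale, the relative rounding error of the common formats versus input data known to only a few digits (a minimal sketch; bfloat16 isn't native to NumPy, so its epsilon is just stated):

    import numpy as np

    # Machine epsilon (relative rounding error) of common float formats.
    for dtype in (np.float16, np.float32, np.float64):
        print(dtype.__name__, np.finfo(dtype).eps)
    # float16 ~9.8e-4, float32 ~1.2e-7, float64 ~2.2e-16
    print(2**-7)    # bfloat16 epsilon, ~7.8e-3

    # An observation known to ~3 significant digits carries ~1e-3 relative
    # uncertainty, already comparable to float16/bfloat16 rounding error.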

CamperBob2 on 2024-04-30

A black Sharpie marker is even cheaper...

Netcob on 2024-04-29

Aw man... I was going to use it for my homelab but that's 1696320W more than I can supply. Well... maybe if I use two plugs instead of one...

DonHopkins on 2024-04-29

Bet it runs warm. The cat will love sitting on it.

buescher on 2024-04-29

It was at #160 in 2023 when it was decommissioned.

Animats on 2024-04-29

It's really hard to find a home for large, old, high-maintenance technology. What do you do with a locomotive, or a Linotype? They need a support facility and staff to be more than scrap. So they're really cheap when available.

The Pacific Locomotive Association is an organization with that problem. About 20 locomotives, stored at Brightside near Sunol. They've been able to get about half of them working. It's all volunteer. Jobs that took days in major shops take years in a volunteer operation.

jasonwatkinspdx on 2024-04-29

At the ill fated Portland TechShop I took woodworking classes from a retired gentleman, who professionally was a pattern maker for molding cast metal parts. This made his approach to woodworking really interesting. He had a huge array of freestanding sander machines, including a disc sander with more than a yard diameter.

For anyone unfamiliar, pattern makers would make wooden model versions of parts that were to be cast in metal. The pattern would be used to make the mold. He could use these various sanding machines to get 1/64" precision for complex geometries. It was fascinating to watch how he approached things, especially in comparison to modern CNC.

His major project outside of teaching the classes? Making patterns for a local steam locomotive restoration project. He had all these wooden versions of various parts of a locomotive sitting around.

dekhn on 2024-04-29

Does 1/64" precision really mean anything in wood, where small fluctuations in air moisture can cause > 1/64" distortion? I guess it's OK if you stay within a climate controlled area.

jasonwatkinspdx on 2024-04-29

So he would build parts by first making an oversize rough blank of bonded layers of marine grade plywood in a big press. Then he'd rough cut it various ways on a big band saw. Then he'd work his way through using all the sanders to slowly approach the net shape. He used precision squares to measure bigger stuff and calipers for smaller stuff.

I can't tell you the exact stability of marine grade plywood, but I know it's about as good as you can get for a wooden material, and I doubt he'd go to the effort of such precise measurements if it didn't matter.

dekhn on 2024-04-30

Plywood is good for dimensional stability, but I'm pretty sure all this work must have been done in and around a toolroom with stable moisture content or the part was used immediately and then consumed/destroyed before it "moves" too much. However, he sounds like he's pretty knowledgeable so I'm going to guess this isn't just garage woodworking where 1/32" doesn't really matter when the wood is going to shrink/expand by 5-10% over the course of a year.

jasonwatkinspdx on 2024-04-30

So the patterns once done would be taken to a foundry where they'd be used to make molds. I'm not totally up to speed on that process but I know it involves surrounding the pattern with a combination of refractory sand and binder. Where it gets tricky is complex parts that have multiple cores and so on.

And yeah, this guy was retired at the time but he'd been doing it for like 50 years. I'm very sure he knew what did and did not matter.

guenthert on 2024-05-01

Funny, just yesterday I saw here in Germany a train where the locomotive was labeled 'rent me'. Apparently there's an organization which bought old locomotives (this particular one looked like 60s, maybe early 70s vintage) from the state-run former monopoly to rent them out (large industrial customers I'd think).

I was surprised to see such an old locomotive in operation, but apparently it's still good (i.e. economic) enough to shuttle cars from factory to port. Guess the air pollution restrictions aren't all that tough for diesel trains.

EDIT: didn't realize this is big business in Germany: https://railmarket.com/eu/germany/locomotive-rental-leasing

082349872349872 on 2024-04-29

Hence my simple theory of aesthetics:

— if the best stuff from then is still better than good stuff from now, it's art

— if the best stuff from then is worse than bad stuff from now, it's technology

kibwen on 2024-04-29

May I present my postmodern theory of aesthetics:

- If it's useful as a medium for money laundering, it's art.

- If it's useful as a facilitator for money laundering, it's technology.

h2odragon on 2024-04-29

> the system is currently experiencing maintenance limitations due to faulty quick disconnects causing water spray. Given the expense and downtime associated with rectifying this issue in the last six months of operation, it's deemed more detrimental than the anticipated failure rate of compute nodes.

Even the RAM has aged out...

Very hard to justify running any of this; newer kit quickly pays for itself in reduced power and maintenance by comparison.

Tempest1981 on 2024-04-29

Are you trying to discourage others from bidding, so you can swoop in and win the auction?

h2odragon on 2024-04-29

nah, i already have far more junk computers than i need.

I lusted after a Cray T3E once that I coulda had for $1k and trucking it across TN and NC; but even then I couldn't have run it. I'm two miles away from 3 phase power and even then couldn't have justified the power budget. At the time a slightly less scrap UltraSPARC 6k beat it up on running costs even with higher initial costs so i went with that instead. I did find a bunch of Alphas to do the byte swizzling tho. Ex "Titanic" render farm nodes.

I've been away from needing low budget big compute for a while, but having spent a few years watching the space i still can't help but go "ooo neat" and wonder what i could do with it.

https://en.wikipedia.org/wiki/Cray_T3E

haunter on 2024-04-29

Never understood why you can bid below the reserve price, or rather why the reserve price is hidden, since the whole point is that they (the seller) have a price in mind they are not willing to go below.

freetime2 on 2024-04-29

> a price in mind they are not willing to go below

I worked for an auction, and sellers accepted bids below the reserve price all the time. They just want to avoid a situation where an item sells at a “below market” price due to not having enough bidders in attendance - e.g. a single bidder is able to win the auction with a single lowball bid. If they see healthy bidding activity that’s often sufficient to convince them to part with the item below reserve.

Reserve prices are annoying for buyers, but below-reserve bids can provide really useful feedback for sellers.

We even had full-time staff whose job was to contact sellers after the auction ended and try to convince them to accept a below-reserve bid, or try to get the buyer and seller to meet somewhere in the middle. This worked frequently enough to make this the highest ROI group in our call center.

ansible on 2024-04-29

It is playing on the psychology of the bidders. You want the bidders to be invested, to want to win the auction. To compete to win the prize.

Also, consider this: if the reserve is too high, and no one bids on it, then everyone looking at it is going to wonder what it is really worth. If there are several other bidders, then that gives reassurance to the rest for the price they each are bidding.

jeremyjh on 2024-04-29

Also it would give feedback to the seller that the reserve may not be feasible.

bombcar on 2024-04-29

It's entirely because of human nature - you want people to get invested in it, which having them bid any amount does.

It's the same reason an auction can go above the price/value of the thing, because you get invested in your $x bid, so $x+5 doesn't seem like paying $x+5, but instead "only $5 more to preserve your win" type of thing.

See penny auction scams - https://utahjustice.com/penny-auction-scams - for an extreme example.

tombert on 2024-04-29

Man, if I had the space, the money, and the means of powering it I would bid on this immediately. It's so damn cool and will likely end up selling for a lot less than it's worth due to its size.

I've always been fascinated by the supercomputer space, in no small part because I've been sadly somewhat removed from it; the SGI and Cray machines are a bit before my time, but I've always looked back in wonder, thinking of how cool they might have been to play with back in the 80s and 90s.

The closest I get to that now is occasionally getting to spin up some kind of HPC cluster on a cloud provider, which is fun in its own right, but I don't know, there's just something insanely cool about the giant racks of servers whose sole purpose is to crunch numbers [1].

[1] To the pedants: I know all computers' job is to crunch numbers in some capacity, but a lot of computers and their respective operating systems like to pretend that they don't.

neilv on 2024-04-29

https://en.wikipedia.org/wiki/Cheyenne_(supercomputer)

> The Cheyenne supercomputer at the NCAR-Wyoming Supercomputing Center (NWSC) in Cheyenne, Wyoming began operation as one of the world’s most powerful and energy-efficient computers. Ranked in November 2016 as the 20th most powerful computer in the world[1] by Top500, the 5.34-petaflops system[2] is capable of more than triple the amount of scientific computing[3] performed by NCAR’s previous supercomputer, Yellowstone. It also is three times more energy efficient[4] than Yellowstone, with a peak computation rate of more than 3 billion calculations per second for every watt of energy consumed.[5]

nickpsecurity on 2024-04-29

My favorite part of SGI computers, like Altix and UV lines, was the NUMA memory with flexible interconnect. NUMA let you program a pile of CPU’s more like a single-node, multithreaded system. Then, the flexibility let you plug in CPU’s, graphics cards, or FPGA’s. That’s right into the low-latency, high-speed, memory bus.

There was a company that made a card that connected AMD servers like that. I don't know if such tech ever got down to commodity price points. If you had Infiniband, there were also Distributed Shared Memory (DSM) libraries that simulated such machines on clusters. Data locality was even more important then, though.

formerly_proven on 2024-04-29

Cray XT/SeaStar? iirc the interconnect ASIC pretends to be another peer CPU connected via HyperTransport. HPE Flex is similar, but works via QPI/UPI for Intel CPUs.

nickpsecurity on 2024-04-29

It was NUMAscale. It’s mentioned in this article with some others for comparison:

https://www.nextplatform.com/2015/07/16/what-if-numa-scaling...

jonhohle on 2024-04-30

Despite being SGI branded hardware, this was after the name was bought by Rackable and IIRC, SGI hardware became essentially rebadged Rackable hardware. Still cool, but not as cool as some custom MIPS hardware from old SGI.

fancyfredbot on 2024-04-29

It's just not economical to run these given how power inefficient they are in comparison to modern processors.

This uses 1.7MW, or $6k per day of electricity. It would take only about four months of powering this thing to pay for 2000 5950X processors. Those would have similar compute power to the 8000 Xeons in Cheyenne but would use about 1/4 the power.
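
Roughly, assuming an industrial electricity rate of ~$0.15/kWh and a ~$360 street price per 5950X (both assumptions, not figures from the listing):

    power_kw = 1_700
    rate = 0.15                              # assumed $/kWh
    daily_cost = power_kw * 24 * rate
    print(round(daily_cost))                 # ~$6,120/day

    replacement = 2_000 * 360                # 2,000 Ryzen 5950X at ~$360 each
    print(round(replacement / daily_cost))   # ~118 days, i.e. about four months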

nerpderp82 on 2024-04-29

If you can get 1.7MW service, then you are paying utility rates, around $100-150 per MWh, or as you quoted $4-6k per day. In Seattle, running this off-peak would cost about $104/hr before the other fees.

It would be neat if a subset of this could be made operational and booted once in a while at a computer history museum. I agree that it doesn't make sense to actually run it.

https://seattle.gov/city-light/business-solutions/business-b...

voytec on 2024-04-29

> It took us fifteen years and three supercomputers to MacGyver a system for the gate on Earth

    Samantha Carter

techplex on 2024-04-29

isodev on 2024-04-29

I was just thinking how much of deep space radio telemetry this super computer must have seen.

brianhorakh on 2024-04-29

This is WOPR! How about a nice game of chess?

bibliotekka on 2024-04-29

the only winning move is not to play

queuebert on 2024-04-29

I wonder who buys these. Crypto miners? My institution would make it nearly impossible to buy a secondhand supercomputer.

bragr on 2024-04-29

There's a whole sub-industry of people bidding on government auctions in order to part out the stuff. I'd be pretty surprised if the whole cluster got reassembled. But people on a budget will buy those compute nodes, someone trying to keep their legacy IB network will snap up those switches, the racks, etc.

gabrielhidasy on 2024-04-29

r/homelab will have a field-day getting those nodes up, some people will want just one for practicality, some people will want at least a couple and an IB switch just for the novelty of it.

jeffbee on 2024-04-29

I can't imagine anyone from r/homelab has an SGI 8600 E-cell laying around that they could slap these blades into.

gh02t on 2024-04-29

I can imagine it, some people on there are ridiculous, but yeah in my experience these supercomputer nodes are a lot more integrated/proprietary than most standard server hardware. It's not straightforward to just boot one up without all the support infrastructure. I'd assume they'd mostly be torn down and parted out.

bombcar on 2024-04-29

You might be surprised - because they're pretty custom they are often "more open" than you might expect; as long as you have the connectors you can often get them running something. Sometimes they have bog-standard features present on the boards, just not enabled, etc.

It's the commoditized blade servers, etc that are stripped down to what they need to run and nothing more.

gh02t on 2024-04-29

Oh I'm speaking from experience with the SGI supercomputer blades. They're pretty wacky, 4x independent, dual cpu boards per blade and all sorts of weird connectors and cooling and management interfaces. Custom, centralized liquid cooling that requires a separate dedicated cooling rack unit and heat exchanger, funky power delivery with 3 phase, odd networking topologies, highly integrated cluster management software to run them etc. I'm not sure if they have any sort of software locks on top of that, but I would bet they do and presumably NCAR wipes all of them so you likely won't have the software/licenses.

I dug up a link to some of the technical documentation https://irix7.com/techpubs/007-6399-001.pdf . Probably someone can get it working, but I imagine whoever is going to go through the hassle of buying this whole many-ton supercomputer is planning to just strip it down and sell the parts.

bombcar on 2024-04-29

Yeah the licensing is often the stumbling block, unless you can just run some bog-standard linux on it. It sounds like this might be custom enough that it would be difficult (but I daresay we'll see a post in 5 years from someone getting part of it running after finding it on the side of the road).

gh02t on 2024-04-29

Ultimately SGI was running Linux and AFAIK the actual hardware isn't using any secret sauce driver code, so yeah if you can get it powered on without it bursting in flames and get past the management locks you can probably get it working. It's definitely not impossible if you can somehow assemble the pieces.

lawlessone on 2024-04-29

>Crypto miners?

I think mining crypto with these would burn far too much energy compared to the ASICS in use.

bufferoverflow on 2024-04-30

Not every crypto can be mined with ASICs. For example, Monero is ASIC and GPU resistant.

latchkey on 2024-04-29

Crypto is no longer mined commercially with GPU type compute. When ETH switched to PoS, it decimated the entire GPU mining industry. It is no longer profitable. The only people doing it now are hobbyists.

bufferoverflow on 2024-04-30

Some crypto is ASIC- and GPU-resistant.

Monero is one of them.

ceinewydd on 2024-04-30

Sure, but you can get (much) better price-performance-power out of a CPU which isn't approaching a decade old, when mining ASIC- and GPU-resistant cryptocurrencies like Monero. I don't know that it'd be worth the effort to buy E5-2697-v4 CPUs which are running in such a specialized configuration over and above AMD Ryzen or EPYC CPUs in commodity, inexpensive mainboards.

bufferoverflow on 2024-05-01

Depends on the price this supercomputer sells at.

latchkey on 2024-04-30

Monero is pegged to CPU.

I don't know why you downvoted me.

https://www.getmonero.org/resources/moneropedia/randomx.html

h2odragon on 2024-04-29

Recyclers.

sambull on 2024-04-29

any I've dealt with definitely wouldn't touch the 'you need to hire professional movers costing you $10k's of dollars to get it out of the facility' stipulation - they seem to prefer the 'where's the location of the storage shed' situation.

vel0city on 2024-04-29

There's 8,064 E5-2697v4's in this. Those go on ebay for ~$50/ea. That's $400,000 of just CPUs to sell.

If the winning bid is $100k, you spend $40k to move it out of there, another $10k warehousing it while selling everything on ebay, and you're still up $250k on the processors alone.
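
The same math spelled out (the bid, moving and warehousing figures are the guesses above, not auction terms):

    cpus = 8_064
    gross = cpus * 50                           # ~$403,200 at the observed eBay price
    bid, moving, warehousing = 100_000, 40_000, 10_000
    print(gross - bid - moving - warehousing)   # ~$253,200, before eBay fees and labor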

EvanAnderson on 2024-04-29

> That's $400,000 of just CPUs to sell.

But do you crater the market for those CPUs? What's the demand for 2016-era Xeons and how much of their price comes down to supply?

ansible on 2024-04-29

I presume no one is building new motherboards for those processors either. While there is old stock laying around, you really need to run those systems close to as-is for them to be useful.

pantalaimon on 2024-04-29

> I presume no one is building new motherboards for those processors either

That's actually far from the truth, LGA2011 is quite popular as a budget gaming platform precisely because CPUs are so cheap on the second-hand market.

https://aliexpress.com/w/wholesale-X99.html

toast0 on 2024-04-29

These are high spec cpus for the socket though. Lots of room for people with compatible boards that want to upgrade.

There's a lot of low budget hosting with old Xeon systems (I'm paying $30/month for a dual westmere system; but I've seen plenty of offers on newer gear); you can still do a lot with an 18 core Broadwell, if the power is cheap.

alchemist1e9 on 2024-04-29

And how much labor costs to earn that $250K? Once that is factored in, I’m guessing fair price is zero or negative.

Plus knowing a bit about warehouse costs … your $10K is a bit on the low side don’t you think?

bombcar on 2024-04-29

It's for the processors alone - a scrapping company dedicated to this stuff would be able to realize more from the other components - and they often have warehouse space available that they already own.

Let's come back and see if the auction failed; I doubt it will.

jfkfif on 2024-04-29

Academic departments with low budgets and cheap electricity who can make do with old CPUs

hggh on 2024-04-29

blakespot on 2024-05-02

The comments in some of the news posts covering the auction popping up are curious / amusing. Lots of supercomputer hate - "let's use a few PCs," etc.

ex: https://wccftech.com/iconic-5-34-pflops-cheyenne-supercomput...

fnord77 on 2024-04-29

> 8,064 units of E5-2697v4

Those alone are selling for about $40 on ebay.

Let's say you sold them for $15 each.

$120,000. Let's say the auction and the moving and the break down costs were $20,000.

maybe worth it?

bufferoverflow on 2024-04-30

You won't find 8064 buyers of these CPUs any time soon. It will take many years, maybe a decade to offload.

For example, in April ebay sold 79 of them. You're looking at ~100 months. And that's assuming constant demand.
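
That estimate, spelled out (79/month is the April eBay sell-through cited above):

    cpus = 8_064
    sold_per_month = 79
    months = cpus / sold_per_month
    print(round(months), round(months / 12, 1))   # ~102 months, ~8.5 years at constant demand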

ceinewydd on 2024-04-30

I ended up reading all the documentation provided alongside the auction because I was debating making a bid. I decided not to in the end.

In total, Cheyenne weighs in at around 95,000 lbs; they require it be broken down in <= 3 business days on site and only offer 8AM - 4PM access. Therefore you'll need to support unbolting and palletizing all the cabinets for loading into a dry trailer, get a truck to show up, load, then FTL to the storage (or reinstall) location of your choosing for whatever's next.

Requirements for getting facility access seem to mean you bring both labor and any/all specialized equipment including protective equipment to not damage their floors while hauling away those 2400lb racks. Liability insurance of $1-2M across a wide range of categories (this isn't particularly unusual, or expensive, but it does mean you can't just show up with some mates and a truck you rented at Home Depot).

I'd guess you're looking at more like $25k just to move Cheyenne via truck, plus whatever it sells for at auction, unless you are located almost next door to them in Wyoming. Going from WY to WA the FTL costs alone would be $6-7k USD just for the freight cost (driver, dry trailer and fuel surcharges), add a bit if your destination doesn't have a loading dock and needs transloading for the last mile to >1 truck (probably) equipped with a tail lift. All the rest is finding a local contractor with semi-skilled labor and equipment to break it down and get it ready to load on a truck and covering the "job insurance".

Warehouse costs if you're breaking it for parts won't be trivial either and you'd be flooding the used market (unfavorably to you) were you to list 8000 CPUs or 300TB DDR4; it could take months or even years to clear the parts without selling at a substantially depressed price.

It will take probably several thousand hours in labor to break this for parts assuming you need to sell them "Tested and Working" to achieve the $50/CPU unit pricing another commenter noticed on eBay, and "Untested; Sold as Seen" won't get anywhere near the same $/unit (for CPUs, DRAM, or anything else) and so even assuming $25/hr for fully burdened relatively unskilled labor, you could well be talking up to $100k in labor to break, test and list Cheyenne's guts on eBay or even sell onward in lots to other recyclers / used parts resellers. I don't think I could even find $25/hr labor in WA capable and willing of undertaking this work, I fear it'd be more like $45-60/hr in this market in 2024 (and this alone makes the idea of bidding unviable).

A lot of the components like the Cooling Distribution Units are, in my opinion, worth little more than scrap metal unless you're really going to try and make Cheyenne live again (which makes no sense, it's a 1.7MW supercomputer costing thousands of dollars per day to power, and which has similar TFLOPs to something needing 100x less power if you'd bought 2024-era hardware instead).

Anything you ultimately can't sell as parts or shift as scrap metal you are potentially going to have to junk (at cost) or send to specialist electronics recycling (at cost); your obligations here are going to vary by location, and dealing with the leftovers could also be expensive.

If anyone does take the plunge, please do write up your experiences!
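
Putting the figures from this comment into one rough model (every number here is an estimate from the text above, not something stated in the auction documents):

    freight = 25_000                     # WY -> WA trucking, transloading, etc.
    labor_hours = 4_000                  # "several thousand hours" to break, test and list
    labor_low = labor_hours * 25         # $100,000
    labor_high = labor_hours * 60        # $240,000

    cpu_gross = 8_064 * 50               # tested-and-working eBay pricing
    print(cpu_gross)                     # $403,200
    print(cpu_gross - freight - labor_high)   # $138,200 left before the winning bid,
                                              # marketplace fees, storage and disposal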

jeffbee on 2024-04-29

I wonder what the point of liquid cooling such a system was. Were they pressed for space?

michaelt on 2024-04-29

The heat's going to be leaving the building in liquid-filled pipes, however you architect it. And with 1.7MW of peak power consumption, a nontrivial amount of liquid.

It's just a question of whether you want to add air and refrigerant into the mix.

It seems they're decommissioning it partly due to "faulty quick disconnects causing water spray" though, so an air cooling stage would have had its benefits...

toast0 on 2024-04-29

> The heat's going to be leaving the building in liquid-filled pipes

In the right climate, and the right power density, you can use outside air for cooling, at least part of the time. Unlikely at this scale of machine, but there was a lot of work towards datacenter siting in the 2010s to find places where ambient cooling would significantly reduce the power needed to operate.

calaphos on 2024-04-29

It has been really common in HPC for quite a while. I presume the denser interconnect/networking of HPC favours the higher density of liquid cooling. Hardware utilization is also higher compared to normal datacenters, so the additional efficiency vs. air cooling is more useful.

convolvatron on 2024-04-29

for large machines the air setup is really less efficient and takes up a lot of space. you end up building a big room with a pressurized floor which is completely ringed by large AC units. you have to move a lot of air through the floor, bringing it up through the cabinets and back through to the ACs. it's also a big control systems problem: you need to get the air through the cabinets evenly, so you need variable speed fans or controlled ducts... and those need to be adaptive but not oscillate.

with a water cooled setup you can move a lot more heat through your pipes just by increasing flow rate. so you need pumps instead of fans. and now your machine room isn't a mini-hurricane, and you can more flexibly deal with the waste heat.

Galatians4_16 on 2024-04-29

Built in a bunker under a mountain, so reduced airflow, plus need to hide heat signatures from outside surveillance?

Also, likely they had infrastructure available, from the nuclear power they use.

kimmeld on 2024-04-29

Cheyenne, Wyoming, not Cheyenne Mountain.

eptcyka on 2024-04-29

Doesn't matter what conductor you use to move heat, the same amount of energy will have to be dispersed. And watercooling just implies more intermediate steps between the heat spreader of the die and the air. So I don't believe the heat signatures can really be helped.

jeffbee on 2024-04-29

This is a weather supercomputer, not a defense one.

saalweachter on 2024-04-29

So, people with experience moving this sort of hardware--

Let's say you just wanted to have this transported to a warehouse. How much are we talking, between the transportation cost and the space to store it?

gautamcgoel on 2024-04-29

The listing says that 1% of the nodes have RAM with memory errors. I assume this means hard errors since soft errors would just be corrected. Is this typical? Does RAM deteriorate over time?

dunham on 2024-04-29

Reading the whole paragraph, it sounds to me like they were accepting 1% errors rather than fixing the leaking cooling system.

bastardoperator on 2024-04-29

Donate it to https://computerhistory.org/, assuming they want it.

monocasa on 2024-04-29

It'd be great if this could end up in the hands of some group like the Living Computers Museum.

pnw on 2024-04-29

Unfortunately LCM closed during the pandemic and laid off all their staff, with no sign of reopening.

seaourfreed on 2024-04-29

The only problem is that the super computer keeps outputting "WANT TO PLAY A GAME?"

simonerlic on 2024-04-29

A strange game; the only winning move is not to play

NickC25 on 2024-04-29

What could someone possibly do with this? It's cool as hell but 8 years old.

layoric on 2024-04-30

I dunno, I still think the 2011-v3 platform that these Xeons run in is a great setup for a homelab. A bit power hungry, but if you can build a dual socket workstation with 36 cores and 256GB of RAM for <$1000, that is a solid server, and it would still make for a hell of an app/db server. That's basically an m4.16xlarge, same CPU generation, platform etc. (yes, without all the surrounding infra), that will cost you something like $0.10 per hour worth of energy to run.

Take the Dell T7910 for example (I use one of these for my homelab), you can pick up a basic one with low end CPU/RAM for sometimes as little as $300. Dumping all these 18 core E5s and DDR4 ECC on the market should make it even cheaper to spec out. Currently they go for about $100-150 each on the CPUs, and ~$150-200 for the RAM. Not bad IMO.
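
As a rough bill of materials along those lines (mid-points of the ranges above, and reading the RAM figure as the total for 256GB):

    base_t7910 = 300                 # barebones Dell T7910 with low-end CPU/RAM
    cpus = 2 * 125                   # two E5-2697 v4 at ~$100-150 each
    ram = 175                        # 256GB DDR4 ECC at ~$150-200
    print(base_t7910 + cpus + ram)   # ~$725, comfortably under $1,000

    # Energy: a loaded dual-socket box at ~500 W and an assumed $0.20/kWh.
    print(0.5 * 0.20)                # ~$0.10 per hour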

neilv on 2024-04-29

Hard to see in the low-res photos, but is that storage from Supermicro?

humansareok1 on 2024-04-29

What's this thing actually worth? Current bid is ~3k.

freedomben on 2024-04-29

Remember that the buyer has to move it. If that costs $50K (no idea, totally guessing) then it's currently "worth" $53K

HeyLaughingBoy on 2024-04-29

Hang around for the winning bid and you'll see what it's worth then.

humansareok1 on 2024-04-29

Auctions by default almost always undercut the actual market value so no not really?

HeyLaughingBoy on 2024-04-29

My answer was a bit tongue-in-cheek, but the reality is that it depends on what you mean by "market value."

For the market of the auction, the selling price is the actual market value. Likewise, it's typically not too far off the value of the item in the wider market, assuming you are comparing it to a similar item in similar condition. The problem is that for most items purchased at auction, there's no similar item, readily available, to compare it to.

I've won multiple items at machine-shop auctions for a small fraction of their "new" price. The problem with the comparison is that e.g., the Starrett dial test indicator that I got for $10, and the new one that retails for around $200 are hard to compare because there's no liquid market for 30-year-old measuring equipment. While it's adequate for my hobby machinist use, it wouldn't be acceptable in a precision shop since it has no calibration history.

If you find an item where you can reasonably compare apples to apples, e.g., a car, you see that the final price of a car at auction is usually pretty close to the price of the same make/model being sold on the open used market. The slightly lower price of the auction car reflects the risk of the repairs that might be needed.

bombcar on 2024-04-29

It's exactly in this "now vs later" that resellers and other brokers sit. If they know that X will sell for $Y "eventually" and how long that eventually is, they can work out how much they can pay for it now and still come out ahead.

Cars are very liquid and move quickly, so the now vs later price is close; weird things that nobody has heard of (but when they need it, they need it NOW) will have a much wider variance.

organsnyder on 2024-04-29

Isn't the winning bid the actual market value, by definition?

toast0 on 2024-04-29

Depends on the terms of the auction. If we take the California legal definition of Fair Market Value for real estate:

> The fair market value of the property taken is the highest price on the date of valuation that would be agreed to by a seller, being willing to sell but under no particular or urgent necessity for so doing, nor obliged to sell, and a buyer, being ready, willing, and able to buy but under no particular necessity for so doing, each dealing with the other with full knowledge of all the uses and purposes for which the property is reasonably adaptable and available.

A 7 day auction on a complex product like this may be a little short to qualify with the necessity clauses, IMHO; there's a bit too much time pressure, and not enough time for a buyer to inspect and research.

humansareok1 on 2024-04-29

I think auctions exist explicitly to potentially buy or sell an item with a delta on its market value? I.e. buyers want the chance to buy below and sellers to sell above. Neither really wants to engage in the transaction at all in the reverse situation or even in the "market value" case. You would just make a direct sale and avoid the hassle of an auction.

jtriangle on 2024-04-29

The market value of something is what someone is willing to pay for it.

Always has been, always will be.

9front on 2024-05-07

Sold for $480,085.

bketelsen on 2024-04-29

Imagine a Beowulf cluster of these.

Sorry I couldn't resist.

Isamu on 2024-04-29

You forgot to say First Post!

mindcrime on 2024-04-29

[flagged]

whalesalad on 2024-04-30

Will it run Crysis?

RobotToaster on 2024-04-29

Is this what they used to run the stargate?

monocasa on 2024-04-29

This is from Cheyenne, Wyoming, not Cheyenne Mountain (in Colorado Springs).