Memory problems galore…
SPRAGGETT ON CHESS
My PC has a hard disk of 450 gigabytes, and most of that is just empty, unused space. My chess databases fill approximately 2.0 gigabytes. My favourite music files and downloads fill another 36. My operating system and its continual updates take another couple of gigabytes…that still leaves me with more space than I know what to do with! I get a headache just thinking about what 450 gigabytes really is…
But there are a number of projects and systems in the world with real memory problems: they crunch so much information every day that the best minds on the planet have had their own headaches figuring out how to get a grip on it. Technology has erased the physical obstacles to building huge memory systems, but the programs that manage all of that memory need to be almost super-human…
How to Handle the World’s Largest Digital Images
A new telescope will use the world’s largest digital camera to capture 20 terabytes of image data every day.
Much has been made of the ‘unprecedented’ scale of the IT infrastructure required to store all the data coming from the world’s largest physics experiment – the Large Hadron Collider (LHC) near Geneva, Switzerland.
A robot manages some of CERN’s 50PB of tape storage.
LHC computing grid pushes petabytes of data, beats expectations
http://arstechnica.com/science/news/2010/08/lhc-computing-grid-pushes-petabytes-of-data-beats-expectations.ars
By John Timmer
The LHC isn’t simply the most powerful particle accelerator ever created. Handling the huge amounts of data it produces has required the creation of one of the biggest computer grids on the planet. The planning and testing of the compute facilities has been taking place for years, but it’s only recently that the grid has had to deal with the output from actual collisions. How did it do? “From the IT perspective, we didn’t notice when the beams came on,” said CERN’s Wolfgang von Rueden. “We had tested it with much higher throughput conditions.”
Still, not everything is working quite according to plan. Von Rueden said that the initial expectations for the LHC’s computing grid had anticipated lower network performance and a reliance on tape; instead, the network has made it easier to shuffle large data sets between compute centers, and the price and performance of hard drives have turned out better than expected. Von Rueden gave us a brief overview of the computing setup at CERN, what they’ve learned from putting everything in place for the LHC, and how some major companies are relying on CERN’s experience to improve their products.
Although CERN has some significant computing resources, they contribute only about 10-15 percent of the total CPU power dedicated to the data coming out of the LHC, and most of the work on-site is dedicated to cutting down on the data that has to be stored. The LHC produces far more data than we can possibly store, and most of the collisions produce a spray of mundane particles we’re already aware of. A significant amount of computing power—about 2,000 CPUs at each of the four detectors—is dedicated to filtering the interesting collisions out of the background.
(One physicist compared this process to identifying traffic accidents when given footage of an intersection where the traffic light changes colors millions of times between incidents.)
The filtering reduces the flow of data from a petabyte (PB) a second to a gigabyte per second, which is then transferred from the detectors to the main compute facility via a dedicated 10Gbps connection. Once there, it needs to be stored, and von Rueden told us that the initial plan had been to use tape for that; right now, they have about 50PB of tape storage, handled by a set of robotic storage hardware. Still, they’ve been finding that disk storage is working well, and have scaled that up to 20PB worth of storage.
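The scale of that reduction is easy to miss in prose. A rough back-of-envelope sketch, using only the figures quoted above (the variable names and the calculations themselves are illustrative, not anything CERN publishes):

```python
# Back-of-envelope check of the data rates described above, using
# decimal (SI) units. All figures come from the article itself.

PB = 10**15  # bytes in a petabyte
GB = 10**9   # bytes in a gigabyte

raw_rate = 1 * PB      # bytes/second coming off the detectors
stored_rate = 1 * GB   # bytes/second surviving the filter farms

# Reduction factor achieved by the on-site filtering
reduction = raw_rate // stored_rate
print(f"filter reduction: {reduction:,}x")  # 1,000,000x

# A dedicated 10 Gbps link carries 1.25 GB/s, so the filtered
# stream fits with a little headroom to spare
link_capacity = 10e9 / 8  # bytes/second
print(f"link headroom: {link_capacity / stored_rate:.2f}x")  # 1.25x

# Time to fill the 20 PB disk pool at the filtered rate
days = 20 * PB / stored_rate / 86400
print(f"days to fill 20 PB: {days:.0f}")  # 231
```

In other words, the filter farms discard about 999,999 of every million bytes, and even the surviving trickle would fill the entire disk pool in well under a year.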
The disks are managed as a cloud service, with arrays of drives hooked up to Linux boxes in JBOD (Just a Bunch Of Disks) mode. Each of these boxes gets a 1Gbps connection, and any CPU in the storage cluster can read data from any of the disks.
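The per-box 1Gbps link explains why the design lets any CPU read from any disk: no single node could keep up with the incoming stream on its own. A minimal sketch of that constraint, assuming the rates quoted above (the node-count arithmetic is my illustration, not a description of CERN's actual cluster size):

```python
# Why the storage cluster fans data out across many JBOD nodes:
# each Linux box has a 1 Gbps link, but the filtered stream
# arrives at roughly 1 GB/s.

node_link_Bps = 1e9 / 8  # 1 Gbps per box -> 125 MB/s
ingest_Bps = 1e9         # ~1 GB/s filtered stream

min_nodes = ingest_Bps / node_link_Bps
print(f"nodes needed just to absorb the stream: {min_nodes:.0f}")  # 8
```

Eight nodes is only the floor for ingest; serving analysis reads on top of that is what makes the any-CPU-to-any-disk layout pay off.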
One of the reasons for the increased reliance on disks is the network that connects the global grid to CERN. “Because the networking is going so well, filling the pipes can outrun tapes,” von Rueden told Ars. Right now, that network is operating at 10 times its planned capacity, with 11 dedicated connections operating at 10Gbps, and another two held in reserve. Each connection goes to one of a series of what are called Tier 1 sites, where the data is replicated and distributed to Tier 2 sites for analysis. Von Rueden said that the fiber that powers this setup has been “faster, cheaper, and more reliable than in planning.”
The original plan had been that each of the Tier 2 sites would keep a specific subset of the LHC data, and analysis jobs (which, being code, should be relatively compact) would be sent across the network to wherever the data resides. Instead, it’s turned out that the network performs so well that the data can be streamed anywhere on the grid in real time, which has made things significantly more flexible.
Supporting users and companies
Although the LHC’s computing needs are unusual, CERN’s IT staff faces issues that would sound familiar to anyone in corporate IT. Its datacenter is 35 years old, and is now hosting clusters in a space that was intended to support supercomputers. Von Rueden said that density is becoming a problem, and the building isn’t designed to take advantage of environmental services or do anything with its waste heat. Although there are some fixes that can be applied—several racks of critical equipment in the datacenter are now water cooled—most of the focus for CERN has been on getting the most computing performance per watt.
In recent years, improvements in that area have come fast enough that CERN no longer bothers with hardware support contracts longer than three years. Machines just get run until they’re dead; the cost of the replacement, and the power savings replacement hardware brings, means it doesn’t make economic sense to keep the hardware going.
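The run-to-failure logic above can be made concrete with a toy model. All the numbers here are hypothetical illustrations (the annual efficiency gain in particular is an assumption of mine, not a figure from CERN):

```python
# Toy model of the run-to-failure policy described above.
# The 40% annual performance-per-watt gain is a hypothetical
# assumption, not CERN's actual figure.

def perf_per_watt(years, base=1.0, annual_gain=0.4):
    """Assumed compounding efficiency improvement over time."""
    return base * (1 + annual_gain) ** years

old = perf_per_watt(0)  # machine bought today
new = perf_per_watt(3)  # replacement available in three years

print(f"new hardware does {new / old:.2f}x the work per watt")  # 2.74x
```

Under an assumption like this, a three-year-old machine does well under half the work per watt of its replacement, so paying for an extended support contract to keep it alive is money better spent on new hardware.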
Beyond the hardware, CERN has to provide support to a very diverse user community, as physicists generally come to the site with their own laptop, and many of them run what von Rueden called “non-mainstream platforms.” It runs a huge wireless network, and has to provide services for visitors, from members of the press to dignitaries. Right now, CERN servers support over 20,000 e-mail users, and provide remote access to files and the internal network for many users.
Combined with the LHC’s unique computing needs, all of this makes CERN an excellent testbed for many technology companies. CERN runs a number of informal collaborations with industry but, for its IT problems, companies are encouraged to take part in the openlab program, which von Rueden runs. Right now, HP, Intel, Oracle, and Siemens all have openlab projects, in which they pay for dedicated staff at CERN and coordinate their work with their own employees.