This repository was archived by the owner on Apr 16, 2020. It is now read-only.

CERN #15

@ghost

Description

http://opendata.cern.ch

Since the end of 2014, CERN has been serving a fraction of the colossal amount of data captured from particle collisions in the LHC (with detectors like CMS, ATLAS, ALICE), which sums up to 60,000,000 GB.

With the help of a small Python crawler, I've compiled an index of all CMS-detector primary datasets (all .root files, totaling approx. 27.4 TB), plus an index of indexes. Indexes for the other detectors' datasets and for derivative datasets are to come :)

  • CMS
    • Primary datasets from 2010 runs (27.4 TB)
      • Scrape data from CERN's OpenData (via cmspull.py)
      • Compile index of all primary dataset files (.root)
      • Somehow get those 28 TB into IPFS (maybe in cooperation w/ CERN? Then all of these steps are unnecessary)
  • ATLAS
  • ALICE
  • LHCb
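
For illustration, the "compile index" step could look roughly like the sketch below. This is not the actual cmspull.py; the record format (URL plus size in bytes), the grouping by parent directory, and the names `build_root_index` and `total_terabytes` are all assumptions made up for this example.

```python
import posixpath
from urllib.parse import urlparse


def build_root_index(records):
    """Group .root files by dataset and sum their sizes.

    `records` is an iterable of (url, size_in_bytes) pairs. This input
    format is hypothetical; the dataset name is assumed to be the name
    of the directory that directly contains the file.
    """
    index = {}
    for url, size in records:
        path = urlparse(url).path
        if not path.endswith(".root"):
            continue  # skip anything that is not a ROOT file
        dataset = posixpath.basename(posixpath.dirname(path))
        entry = index.setdefault(dataset, {"files": [], "bytes": 0})
        entry["files"].append(url)
        entry["bytes"] += size
    return index


def total_terabytes(index):
    """Total size of all indexed files in decimal terabytes (1 TB = 1e12 bytes)."""
    return sum(entry["bytes"] for entry in index.values()) / 1e12


# Placeholder records; these URLs are not real OpenData paths.
records = [
    ("http://example.org/data/Mu/file1.root", 2_000_000_000),
    ("http://example.org/data/Mu/file2.root", 3_000_000_000),
    ("http://example.org/data/Mu/README.txt", 100),
    ("http://example.org/data/Electron/a.root", 5_000_000_000),
]
index = build_root_index(records)
total_tb = total_terabytes(index)
```

Keeping a per-dataset byte count makes it easy to check the overall total against the figure quoted above before trying to move anything into IPFS.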

Using all that data requires a special environment. CERN's OpenData normally recommends their CernVM, which is basically Scientific Linux + ROOT, a data analysis framework (hence the .root files). Without ROOT, these historical milestones cannot be used directly as computable data, so the tool must be preserved/archived along with the collision data. There's also a mirror right here on GitHub.

Oh, and thanks for the amazing project!
