| 1.1 | Michigan ATLAS Group Computing |
The University of Michigan ATLAS Group is one of the largest groups in the ATLAS Collaboration and, as such, has made major contributions to the construction and commissioning of the ATLAS detector. We expect to play a leading role in calibrating and validating the performance of the ATLAS muon system during the era of LHC operations. Our ultimate goal, of course, is to participate fully in the ATLAS physics analysis with particular emphasis on W/Z/Higgs events with final-state muons.
The basic ATLAS computing philosophy is based on a grid-linked system of tiered computing centers. This philosophy has emphasized the leveraged acquisition of computing hardware and operational personnel with little regard for mission-oriented computing activities. While our group, along with the ATLAS group at Michigan State, has submitted a strong Tier-2 proposal to US ATLAS, there is no guarantee that we will be selected. As responsible and dedicated scientists, we must plan to implement significant ATLAS computing resources, independent of the outcome of future Tier-2 competitions. We count on our funding agencies to ensure that a major group such as our Michigan ATLAS group will be properly supported in acquiring and operating the necessary computational facilities. The idea that a group would make major detector contributions and be denied the opportunity to participate fully in the physics analysis of the experiment is fundamentally unacceptable.
We have continued to use the HYPNOS cluster acquired last year (now called UMROCKS) for significant simulation activities. UMROCKS is described in detail in the section below. During the last year we have had to deal with the age of UMROCKS: a significant amount of effort has gone into repairing and replacing failing components. Our plan for replacing this equipment, along with our prototyping of new equipment to support our efforts, is also described below.
| 1.1.1 | Description of UMROCKS Cluster and Prototype Computing Facility | |
The UMROCKS computing cluster, formerly an NPACI resource, has been transferred to the Physics department and installed in an appropriately modified laboratory space (2268 Randall). The cluster provides ~100 nodes, each with dual AMD Athlon MP 2000 (or 2400) processors, 2 GB of RAM and two 100 GB hard disks, along with a 1 TB RAID array and a FastEthernet switch for interconnections. While this cluster is over three and a half years old, it is still a powerful computing resource capable of running ATLAS simulation, reconstruction and analysis jobs.
UMROCKS has given us extensive experience in system and cluster management. We have learned a number of lessons about reliability and robustness that we have integrated into our operations and future equipment planning. Understanding how best to organize and manage storage, networks and computing resources is critical to delivering a successful infrastructure to support our ATLAS work.
Because of the age of the UMROCKS nodes and the ever-increasing demand for the computational power and storage required to meet our ATLAS computing needs, we plan to acquire and deploy equipment at a constant dollar amount per year. The University has been very supportive of our needs and is doubling our DOE computer equipment funding to help us acquire these resources. This model provides newer, more capable equipment each year, both to meet expanding demands and to replace outdated or failing equipment. By spreading out the purchases we are able to reap the benefits implicit in Moore’s law: increasing capability for a constant dollar amount.
The following table shows the capability we will acquire by spending $150K/year. In the first year this provides 16 dual-processor nodes (current equivalent is dual dual-core Opteron 285, 1U rack mount, 8 GB RAM), four 5U "storage" nodes (disk servers, 11 TB/server in FY2006), and funds for parts (materials and supplies).
For storage, the amount acquired each year is listed in parentheses after the integrated total.
| Year | Nodes | Node(TB) | Si2K | Disk(TB) | StorNodes |
| 2006-7 | 16 | 12 | 100 | 44 | 4 |
| 2007-8 | 32 | 25 | 226 | 106.2(62.2) | 8(4) |
| 2008-9 | 48 | 43 | 391 | 194.2(88) | 12(4) |
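The relationship between the yearly acquisitions and the integrated totals in the table can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the variable and function names are hypothetical, the input numbers are copied from the table, and the per-server storage figure is simply the yearly disk acquisition divided by the number of storage nodes bought that year.

```python
from itertools import accumulate

years = ["2006-7", "2007-8", "2008-9"]

# Amounts acquired each year, read from the table (the values in
# parentheses; the first year's acquisition equals its total).
disk_added_tb = [44.0, 62.2, 88.0]
stor_nodes_added = [4, 4, 4]

# Integrated totals as listed in the table.
nodes_total = [16, 32, 48]
si2k_total = [100, 226, 391]


def yearly_increments(totals):
    """Recover per-year additions from a list of running totals."""
    return [totals[0]] + [b - a for a, b in zip(totals, totals[1:])]


# Running totals rebuilt from the yearly acquisitions.
disk_total_tb = [round(t, 1) for t in accumulate(disk_added_tb)]
stor_nodes_total = list(accumulate(stor_nodes_added))

# Yearly additions rebuilt from the running totals.
nodes_added = yearly_increments(nodes_total)
si2k_added = yearly_increments(si2k_total)

# Storage acquired per disk server in each purchase year.
tb_per_server = [round(d / n, 2) for d, n in zip(disk_added_tb, stor_nodes_added)]

for row in zip(years, nodes_added, si2k_added, disk_total_tb,
               stor_nodes_total, tb_per_server):
    print("{}: +{} nodes, +{} Si2K, {} TB disk total, "
          "{} storage nodes, {} TB/server".format(*row))
```

The first-year per-server value can be compared with the 11 TB/server quoted in the text, and the growth in the later years reflects the assumed increase in capability per dollar.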
| 1.1.2 | Michigan-MSU Tier-2 proposal | |
As mentioned above, our Michigan ATLAS group along with the Michigan State ATLAS group has proposed to be one of the final two US ATLAS Tier-2 sites. We have leveraged over $4M in institutional support over 5 years to deliver a strong proposal to US ATLAS.
If we are selected, Shawn McKee would be Co-Director (at 50% time) and Bob Ball would be the joint Tier-2 Manager (at 100% time).
Our Tier-2 would be based upon the same type of systems outlined above and would benefit from excellent networking, strong institutional support and significant local computational, grid, simulation and analysis expertise. Because of MiLR (Michigan Lambda Rail), our Tier-2 site would be unique in having redundant 10 Gigabit Ethernet links between our distributed Michigan-MSU sites and Chicago.
| 1.1.3 | Grid and Network Research and Development at Michigan | |
Michigan has played a significant role in networking and grids for ATLAS. Shawn McKee is the US ATLAS networking manager, co-chairs the Internet2 High-Energy and Nuclear Physics (HENP) Working Group and is a founding co-chair of a new Open Science Grid Networking Technical Group. From 2002 to 2005 he was the technical lead on the NSF Middleware Initiative (NMI) testbed at Michigan, and he continues to participate in SURAGrid. He is chairing the LHC-OPN monitoring effort, which is focused on monitoring and measuring the network connecting the worldwide LHC Tier-1 centers with CERN (Tier-0). He is also part of the Michigan MGRID project, which is developing grid middleware and techniques for easy access to distributed grid resources. Perhaps the greatest impact for ATLAS in the grid and network area will come from three newly funded projects on which McKee is Co-PI:
| · | UltraLight: A $2 million NSF ITR project, funded by MPS (Mathematical and Physical Sciences), which is exploring advanced networking infrastructure in support of LHC-scale physics. In addition to a number of CMS collaborators, Michigan, Brookhaven and SLAC are participants. The final two years of UltraLight are focused on integrating new capabilities from this project into the computing models of ATLAS and CMS. | |
| · | TeraPaths: A DOE/MICS-funded project at BNL, with participation of Michigan, concentrating on developing MPLS/QoS capabilities for the Tier-1, Tier-2 and eventually Tier-3 computing centers for LHC. This work is being directly integrated into the Tier-1 efforts, and we are closely coupled to ESnet and other HEP sites. | |
| · | GridNFS: A $1.2 million NSF NMI Development project to create a “grid”-aware version of NFS (Network File System) based upon the newly developed NFS V4 standard. We have an agreement to test our project within OSG and plan to work closely with the OSG Storage working group and CERN to make GridNFS accessible to LHC projects. | |