Shawn McKee / University of Michigan
USATLAS Tier2 Meeting
August 17, 2006 - Harvard
UltraLight Status Report
Reminder: The UltraLight Project
UltraLight is:
A four-year $2M NSF ITR funded by MPS.
Application-driven network R&D.
A collaboration of BNL, Buffalo, Caltech, CERN, Florida, FIU, FNAL, Internet2, Michigan, MIT, SLAC, Vanderbilt.
Significant international participation: Brazil, Japan, Korea amongst many others.
Goal: Enable the network as a managed resource.
Meta-Goal: Enable physics analysis and discoveries which could not otherwise be achieved.
Status Update
There are three areas which I want to make note of for the Tier-2s:
1. Work on new UltraLight kernel
2. Development of VINCI/LISA/Endhost agents (US ATLAS test of this in Fall…)
3. Work on FTS (either with FTS developers or as an equivalent project)
4. …and one addendum on US LHCNet…
UltraLight Kernel Development
Having a standard tuned kernel is very important for a number of UltraLight activities:
1. Breaking the 1 GB/sec disk-to-disk barrier
2. Exploring TCP congestion control protocols
3. Optimizing our capability for demos and performance
The planned kernel incorporates the latest FAST and Web100 patches over a 2.6.17-7 kernel and includes the latest RAID and 10GE NIC drivers.
The UltraLight web page (http://www.ultralight.org) has a Kernel page which provides the details off the Workgroup->Network page.
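One reason kernel tuning matters on long, fast paths is that TCP buffers have to cover the bandwidth-delay product of the path. The sketch below is only a back-of-the-envelope illustration and is not taken from the UltraLight kernel notes: the function names are hypothetical, and the 10 Gb/s rate and 100 ms round-trip time are assumed example values.

```python
def bdp_bytes(rate_bits_per_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes: data that must be in flight to fill the path."""
    # bits/s * ms = millibits; divide by 1000 for bits, then by 8 for bytes
    return rate_bits_per_s * rtt_ms // (8 * 1000)


def gbytes_to_gbits(gbytes_per_s: float) -> float:
    """Convert a disk rate in GB/s to the equivalent network rate in Gb/s."""
    return gbytes_per_s * 8


if __name__ == "__main__":
    # Assumed example: a 10 Gb/s path with a 100 ms round-trip time
    print(bdp_bytes(10_000_000_000, 100), "bytes of TCP window needed")
    # The 1 GB/sec disk-to-disk target expressed as a network rate
    print(gbytes_to_gbits(1.0), "Gb/s on the wire, before protocol overhead")
```

Under these assumed values a single stream needs a window of roughly 125 MB, and a 1 GB/sec disk-to-disk transfer needs about 8 Gb/s of a 10GE link.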
Optical Path Plans
Emerging “light path” technologies are becoming popular in the Grid community:
They can extend and augment existing grid computing infrastructures, currently focused on CPU/storage, to include the network as an integral Grid component.
Those technologies seem to be the most effective way to offer network resource provisioning on-demand between end-systems.
A major capability we are developing in UltraLight is the ability to dynamically switch optical paths across the node, bypassing electronic equipment via a fiber cross connect.
The ability to switch dynamically provides additional functionality and also models the more abstract case where switching is done between colors (ITU grid lambdas).
VINCI: A Multi-Agent System
• VINCI and the underlying MonALISA framework use a system of autonomous agents to support a wide range of dynamic services
• Agents in the MonALISA servers self-organize and collaborate with each other to manage access to distributed resources, to make effective decisions in planning workflow, to respond to problems that affect multiple sites, or to carry out other globally-distributed tasks
• Agents running on end-users’ desktops or clusters detect and adapt to their local environment so they can function properly. They locate and receive real-time information from a variety of MonALISA services, aggregate and present results to users, or feed information to higher level services
• Agents with built-in “intelligence” are required to engage in negotiations (for network resources, for example), and to make pro-active run-time decisions, while responding to changes in the environment
The Main VINCI Services
[Diagram: the main VINCI services. Labels: Application; End User Agent; Topology Discovery; GMPLS; MPLS; OS; SNMP; Scheduling; Dynamic Path Allocation; Control Path Provisioning; Failure Detection; Authentication, Authorization, Accounting; Learning; Prediction; System Evaluation & Optimization; Monitoring]
Agents to Create on Demand an Optical Path or Tree
[Diagram: three optical switches, each with a MonALISA ML Agent. Labels: Runs a ML Demon; > ml_path IP1 IP4; “copy file IP4”; ML proxy services used in Agent Communication; ML Demon; Control & Monitor the switch; Discovery & Secure Connection; numbered steps 1 to 4]
The time to create a path on demand is less than 1s, independent of the location and the number of connections.
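To make the idea of an `ml_path IP1 IP4` request concrete, the sketch below finds a fewest-hop route between two end hosts across a set of optical switches. It is purely illustrative and does not reproduce how the MonALISA agents select or provision paths: the topology, the switch names (`SW1` to `SW3`) and the function `find_path` are all hypothetical.

```python
from collections import deque

# Hypothetical topology: two end hosts attached to three interconnected optical switches.
LINKS = {
    "IP1": ["SW1"],
    "SW1": ["IP1", "SW2", "SW3"],
    "SW2": ["SW1", "SW3", "IP4"],
    "SW3": ["SW1", "SW2"],
    "IP4": ["SW2"],
}


def find_path(links, src, dst):
    """Breadth-first search: return a fewest-hop path from src to dst, or None."""
    previous = {src: None}  # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk back through the predecessors to rebuild the path
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in links.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


if __name__ == "__main__":
    print(find_path(LINKS, "IP1", "IP4"))
```

A real agent system would also have to check that each cross-connect is free, authenticate the request, and set up the switch ports; none of that is modelled here.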
DEMO: MonALISA and path-building
An example of optical path building using MonALISA is shown at: http://ultralight.caltech.edu/web-site/gae/movies/ml_optical_path/ml_os.htm
One of the focus areas for UltraLight is being able to dynamically construct point-to-point light-paths where supported.
We still have a pending proposal (PLaNetS) focused on creating a managed dynamic network infrastructure…
LISA, EVO and Endhosts
Many of you are familiar with VRVS. Its successor is called EVO (Enabling Virtual Organizations). It improves on VRVS in a number of ways:
Support for H.263 (capture and send your desktop as another video source for a conference)
IM-like capability (presence/chat)
Better device / OS / language support
Significantly improved reliability and scalability
Related to this last point is the “merger” of MonALISA and VRVS in EVO.
Endhost agents (LISA) are now an integral part of EVO.
Endhost agents monitor the user’s hosts and react to changing conditions
Something like this is envisioned as a component of deploying a ‘managed network’
Prototype testing of network agent this fall?
FTS and UltraLight…
To date there has been little interaction between people working on the network and those working on data transport for ATLAS (or LHC in general)
There is a significant amount of work architecting, developing and hardening the data management (and transport) for ATLAS…little time (or understanding of possibilities) for the network.
A dynamic managed network introduces new possibilities. Research efforts in networking need to be fed into the data transport architecting.
UltraLight is planning to engage the FTS developers and try to determine their understanding of (and plans for) the network.
GOAL: Account for the network and improve robustness and performance of data transport and the overall infrastructure.
Aside: US LHCNet Status and Plans
The following 7 slides (from Harvey Newman) provide some details about US LHCNet and its plans to support LHC scale physics requirements.
Details are provided for reference but I won’t cover them in my limited time.
Next Generation LHCNet: Add Optical Circuit-Oriented Services
[Diagram labels: CERN-FNAL Primary EPL; CERN-FNAL Secondary EPL]
Based on CIENA “Core Director” Optical Multiplexers
Highly reliable in production environments
Robust fallback, at the optical layer
Circuit-oriented services: Guaranteed Bandwidth Ethernet Private Line (EPL)
Sophisticated standards-based software: VCAT/LCAS.
VCAT logical channels: highly granular bandwidth management
LCAS: dynamically adjust channels
Highly scalable and cost effective, especially for many OC-192 Links
This is consistent with the directions of the other major R&E networks such as Internet2/Abilene, GEANT (pan-European), ESnet SDN
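The "highly granular bandwidth management" of VCAT can be illustrated with a little arithmetic: a logical channel is built from several smaller SONET members, so bandwidth is allocated in member-sized steps. The sketch below is an illustration only. The payload capacities are the nominal SONET values rather than figures from these slides, the function name is hypothetical, and the slides do not say which member sizes US LHCNet uses.

```python
import math

# Nominal SONET payload capacities in Mb/s (standard values, not from the slides)
STS1_PAYLOAD_MBPS = 48.384
STS3C_PAYLOAD_MBPS = 149.76


def vcat_members(target_mbps: float, member_payload_mbps: float) -> int:
    """Smallest number of VCAT members whose combined payload covers the target rate."""
    return math.ceil(target_mbps / member_payload_mbps)


if __name__ == "__main__":
    # Assumed example: carrying a 1 Gb/s Ethernet private line
    print("STS-3c members:", vcat_members(1000, STS3C_PAYLOAD_MBPS))
    print("STS-1 members:", vcat_members(1000, STS1_PAYLOAD_MBPS))
```

For a 1 Gb/s circuit this gives 7 STS-3c members or 21 STS-1 members. LCAS then allows members to be added to or removed from such a group while the channel stays in service.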
Next Generation LHCNet: Add Optical Circuit-Oriented Services
[Diagram labels: CERN-FNAL Primary EPL; CERN-FNAL Secondary EPL]
Force10 switches for Layer 3 services
Topology: January 2007
BNL and Long Island MAN Ring - Feb., 2006 to 2008
[Network diagram: LI MAN - Diverse dual core connection. Labels: Brookhaven National Lab, Upton, NY; BNL IP gateway; ESnet demarc; Cisco 6509; 10GE circuit (two); 32 AoA, NYC; other NYC site; LI MAN DWDM; DWDM ring (KeySpan Communications); SDN core DWDM; ESnet SDN core switch; second MAN switch; T320; MAN LAN; ESnet IP core; ESnet SDN core; Chicago (Syracuse - Cleveland - Chicago); Chicago (ESnet IP core); Washington (ESnet IP core); GEANT; CERN; Europe; USLHCnet; Chi; year markers 2006, 2007, and "2007 or 2008"]
[Legend: 10 Gb/s circuits: International; USLHCnet circuits (proposed); production IP core; SDN/provisioned virtual circuits; 2007 circuits/facilities]
Networks listed on the diagram:
• Abilene
• NYSERNet
• SINet (Japan)
• CANARIE (Canada)
• HEANet (Ireland)
• Qatar