draft-khasnabish-vmmi-problems-00:
Requirements for Mobility and Interconnection of Virtual Machine and Virtual Network Elements
Author(s): Bin Liu, Bhumip Khasnabish
Network Working Group Bhumip Khasnabish
Internet-Draft ZTE USA, Inc.
Intended status: Informational Bin Liu
Expires: June 30, 2012 ZTE Corporation
December 28, 2011
Requirements for Mobility and Interconnection of Virtual Machine and
Virtual Network Elements
draft-khasnabish-vmmi-problems-00.txt
Abstract
In this draft, we discuss the issues and requirements related to
migration, mobility, and interconnection of Virtual Machines (VMs) and
Virtual Network Elements (VNEs). We also describe the limitations of
various types of virtual local area networking (VLAN) and virtual
private networking (VPN) techniques that are traditionally expected
to support such migration, mobility, and interconnection.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on June 30, 2012.
Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
Bhumip Khasnabish & Bin Liu Expires June 30, 2012 [Page 1]
Internet-Draft Mobility and Interconnection of VM & VNE December 2011
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1. Conventions used in this document . . . . . . . . . . . . 4
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. General Requirements . . . . . . . . . . . . . . . . . . . . . 5
3.1. Data Center Maintenance . . . . . . . . . . . . . . . . . 6
3.2. Disaster Recovery . . . . . . . . . . . . . . . . . . . . 6
3.3. Data center migration or integration . . . . . . . . . . . 6
3.4. Expansion . . . . . . . . . . . . . . . . . . . . . . . . 6
3.5. Load Balancing . . . . . . . . . . . . . . . . . . . . . . 6
3.6. The safety problems in the environment of virtual
machine . . . . . . . . . . . . . . . . . . . . . . . . . 6
3.7. The efficiency of data migration and migration fault
risk processing . . . . . . . . . . . . . . . . . . . . . 7
3.8. How to migrate in the IPV4 and IPV6 mixed environment . . 7
3.8.1. The real-time perception of global network and
storage resources . . . . . . . . . . . . . . . . . . 7
3.8.2. The real-time perception of global network
resources . . . . . . . . . . . . . . . . . . . . . . 7
3.8.3. The real-time perception of global storage
resources . . . . . . . . . . . . . . . . . . . . . . 8
3.9. General requirements of migration state . . . . . . . . . 8
3.9.1. Migration schedule foundation . . . . . . . . . . . . 8
3.9.2. Migration authentication . . . . . . . . . . . . . . . 8
3.9.3. Migration ability consultation . . . . . . . . . . . . 8
3.9.4. Standardization of migration state . . . . . . . . . . 8
3.10. The selection of migration . . . . . . . . . . . . . . . . 10
3.10.1. Two requirements with different network
environment and protocol . . . . . . . . . . . . . . . 10
3.10.2. Requirements for a wider range of virtual machine
live migration . . . . . . . . . . . . . . . . . . . . 10
3.11. The part of traffic roundabout analysis on the VM WAN
migration . . . . . . . . . . . . . . . . . . . . . . . . 11
3.11.1. VM migration requirements . . . . . . . . . . . . . . 11
3.11.2. Scene analysis . . . . . . . . . . . . . . . . . . . . 11
3.12. Robustness . . . . . . . . . . . . . . . . . . . . . . . . 11
3.12.1. Robustness of VM . . . . . . . . . . . . . . . . . . . 11
3.12.2. Robustness of VNE . . . . . . . . . . . . . . . . . . 12
3.13. Requirement on DCI interconnection fabric . . . . . . . . 12
3.14. Requirement of Cloud service Virtualization . . . . . . . 13
3.14.1. Requirement of logical element . . . . . . . . . . . . 13
3.14.2. Requirement of Resource Allocation Gateway function . 13
3.14.3. Requirement of specifications and performance . . . . 14
3.14.4. Requirement of fault tolerance capability . . . . . . 14
3.14.5. Network model . . . . . . . . . . . . . . . . . . . . 15
3.14.6. Types & Applications of VPNs interconnection
between DCs which provide cloud services . . . . . . . 15
3.14.6.1. Types of VPNs . . . . . . . . . . . . . . . . . . 16
3.14.6.2. Applications of L2VPN in DC . . . . . . . . . . . 16
3.14.6.3. Applications of L3VPN in DC . . . . . . . . . . . 17
3.14.7. VN Requirement . . . . . . . . . . . . . . . . . . . . 17
3.14.8. Requirement of Mobility . . . . . . . . . . . . . . . 17
3.14.9. Mobility Requirement . . . . . . . . . . . . . . . . . 18
3.14.9.1. Summarization of Mobility . . . . . . . . . . . . 18
3.14.9.2. Problem Statement . . . . . . . . . . . . . . . . 18
3.15. MAC, IP, ARP Explosion . . . . . . . . . . . . . . . . . . 19
3.16. Suppressing the flooding within VLAN . . . . . . . . . . . 19
3.17. Convergence and multipath support . . . . . . . . . . . . 19
3.18. Multicast processing . . . . . . . . . . . . . . . . . . . 19
3.19. Requirement of others . . . . . . . . . . . . . . . . . . 20
4. References . . . . . . . . . . . . . . . . . . . . . . . . . . 20
5. Security Considerations . . . . . . . . . . . . . . . . . . . 21
6. IANA Consideration . . . . . . . . . . . . . . . . . . . . . . 21
7. Normative References . . . . . . . . . . . . . . . . . . . . . 21
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 21
1. Introduction
This document summarizes the requirements for migrating virtual
machines in terms of the necessary conditions for migration, business
needs, state classification, security, efficiency, and the migration
scheme. It then lists the requirements for VM migration in the
current mixed IPv4 and IPv6 environment. On the choice of migration
solution, it discusses the requirements of techniques that run either
on a large-scale Layer-2 network or on a segmented IP network. The
former is often used for frequent data migration with high
requirements on data security, such as data migration and backup in
banks; the latter is used for data migration for personal or mobile
users, or between different cloud service providers. Because the
need to build distributed PaaS across different cloud service
providers is becoming more and more pressing, a unified management
platform will offer very high efficiency for users as well as cloud
service providers. The document then summarizes the requirements of
virtual networks, such as VM migration, virtual networks, and DCI
modes. Finally, it briefly describes common ways of interconnection
in IDCs.
1.1. Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
2. Terminology
o DCB: Data Center Border Routers
o DC GW: Data Center Gateway
o DCS: Data Center Switch
o IP VPN: Layer 3 VPN, defined in L3VPN working group
o OTV: Overlay Transport Virtualization
o RA GW: Resource Allocation Gateway
o ToR: Top of the Rack
o VLAN: Virtual Local Area Networking
o VM: Virtual Machine
o VMMI: Virtual Machine Mobility and Interconnection
o VN: Virtual Network
o VNE: Virtual Network Element. For example, a virtual switch on a
server virtualization platform works as a virtual network element.
o VPN: Virtual Private Network
o VPLS: Virtual Private LAN Service
3. General Requirements
A data center architecture consists of the following components:
gateways (Data Center Gateway, Resource Allocation Gateway), core
routers/switches, (aggregation layer switches), access layer ToR
switches, virtual switches, the interconnecting network between DCs,
servers, firewall systems, etc.
Why do VMs and VNEs need to be migrated?
Managing VMs through migration can save users money, including
maintenance and upgrade costs. Earlier servers were relatively
large, whereas present servers are relatively small. Migration
technology allows users to replace several earlier servers with a
single server, which saves a lot of room space. In addition, a
virtual machine server has unified "virtual hardware", unlike earlier
servers, which had a number of different hardware resources. After
migration, the servers can be managed through a unified interface.
With virtual machine software such as the high availability tools
provided by VMware, when a server shuts down due to a failure, its
workload can automatically switch to another virtual server in the
network for non-disruptive operation. In short, migration has the
advantages of lowering operations costs, simplifying maintenance,
improving system load balancing, enhancing system error tolerance,
and optimizing system power management.
Overall, VM migration faces the following questions:
How to accommodate a large number of tenants in each isolated network
in the data center;
From one DC to another within one Admin domain, how to ensure the
necessary conditions in migration, how to ensure a successful
migration without disruption to services; how to ensure the success
of rollback when problems occur;
From one domain to another domain, how to solve the problem of
communication between domains;
In an L2 network, VXLAN is used to resolve the VLAN expansion
problem. IP-based technology is used to handle migration in an L3
network, and VPN technology is used to carry L2 and L3 traffic across
the IP/MPLS core network.
3.1. Data Center Maintenance
The servers and applications in the data center architecture should
maintain uninterrupted service during migration.
3.2. Disaster Recovery
In the face of some major natural disasters, business-critical data
center applications can be migrated to other data centers in advance.
3.3. Data center migration or integration
To meet data center migration or integration needs, some of the
applications can be migrated without interruption from one data
center to another.
3.4. Expansion
Because of constraints on address resources, cooling, and physical
space in the primary data center, some of the virtual machines can be
migrated to the backup data center.
3.5. Load Balancing
Migrating virtual machines between data centers lets users be served
from the nearest computing location, following the "follow the sun"
principle, or meets a multi-site load balancing requirement. In
addition, to reduce energy and cooling costs, among other
considerations, virtual machines can be consolidated into fewer data
centers, which is the future trend of the "green" data center.
3.6. The safety problems in the environment of virtual machine
Some problems, such as securing the operating environment of the
virtual machine, and how to prevent and detect intrusions during
normal operation or VM migration, are not addressed at present.
3.7. The efficiency of data migration and migration fault risk
processing
A lot of invalid data does not need to be migrated. Consider
streamlining the data before the migration, for example:
Incremental migration:
Only migrate the data that differs between the two parties.
Fault risk processing:
When migrating between different, heterogeneous database systems,
such as transferring data from an Oracle database on a Linux system
to a SQL Server database on a Windows system, it is necessary to
define the security measures and the policy that apply when a fault
happens: processing becomes very slow when a database migration
operation fails and all the database processing needs to be rolled
back. It is necessary to define the database rollback policy, such
as the maximum waiting time between the different types of database.
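As an illustration of incremental migration, the following minimal
sketch compares per-block digests of the source and target copies and
reports only the blocks that would need to be transferred. The
function names, the fixed block size, and the use of SHA-256 are
assumptions made for this example; the draft does not prescribe any
particular comparison method.

```python
import hashlib


def block_digests(data: bytes, block_size: int = 4096) -> list[str]:
    """Split data into fixed-size blocks and return one SHA-256 digest per block."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def blocks_to_migrate(source: bytes, target: bytes, block_size: int = 4096) -> list[int]:
    """Return the indexes of source blocks that are missing or different at the target."""
    src = block_digests(source, block_size)
    dst = block_digests(target, block_size)
    return [i for i, digest in enumerate(src) if i >= len(dst) or dst[i] != digest]


if __name__ == "__main__":
    # Hypothetical data: the second block differs and the third is missing.
    source = b"A" * 8 + b"B" * 8 + b"C" * 8
    target = b"A" * 8 + b"X" * 8
    print(blocks_to_migrate(source, target, block_size=8))
```

Only the listed blocks would be sent; identical blocks stay in place,
which is the saving that incremental migration aims for.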
3.8. How to migrate in the IPV4 and IPV6 mixed environment
Many issues need to be considered for migration in this mixed
environment, since IPv4 and IPv6 will coexist for a long time.
3.8.1. The real-time perception of global network and storage resources
In current systems there is a mismatch between network resources and
storage resources: the availability of network system resources and
the real-time requirements of virtual machine / storage system
resources in the data center are often inconsistent. On a global
scale, the storage resources in the distributed data center system
cannot be used efficiently. In turn, network resources cannot be
used efficiently either. Therefore, a management model needs to be
established that is informed of system-wide network resources and
storage resources and can dispatch them. The management model can be
integrated into the framework of virtual machine migration under DMTF
standardization.
But how do we learn the system-wide network resources and storage
resources? We need means and protocols to solve these problems.
3.8.2. The real-time perception of global network resources
One important reason that prevents us from being informed of the
system-wide availability of network resources is that, during the
transition from IPv4 to IPv6 networks, different types of traditional
tunneling gateways cannot communicate with each other. In the transition
network using the tunneling transition technique, the connections
between the subnets and the backbone network are achieved through the
tunneling gateway. How a tunnel is established varies between
gateway types. Tunnels can be established between tunneling gateways
of the same type, but not between tunneling gateways of different
types. This document proposes a multi-tunneling VPN gateway solution
to resolve the problem that tunnels cannot be established between
heterogeneous tunneling gateways.
3.8.3. The real-time perception of global storage resources
Access to data center virtual machine / storage system resources can
work well, as long as the relevant APIs, formats, and communication
protocols are defined.
These virtual machine / storage system resources are registered
globally and reported to the resource management system in the cloud
system.
Eventually, the resource management system in the cloud system is
informed of the resources of the whole network, which lays the
foundation for the subsequent capability matching.
3.9. General requirements of migration state
3.9.1. Migration schedule foundation
The manager should be able to determine which data centers in the
whole network currently need to interconnect and migrate data.
3.9.2. Migration authentication
Authentication is required for the network parties involved in the
migration, including storage resource authentication and firewall
authentication.
3.9.3. Migration ability consultation
After authentication succeeds, the parties involved need to determine
whether they have the capacity to migrate, in terms of network
bandwidth resources, storage resources, resource pool scheduling, and
so on.
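The capability check described above can be sketched as a comparison
between what a migration requires and what the destination offers.
This is a minimal illustration under stated assumptions: the
Resources record, its three fields, and the negotiate() function are
hypothetical names chosen for this example and are not defined by the
draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical summary of the resources relevant to one migration."""
    bandwidth_mbps: float
    storage_gb: float
    pool_slots: int


def negotiate(required: Resources, offered: Resources) -> list[str]:
    """Return the names of unmet requirements; an empty list means migration can proceed."""
    checks = {
        "bandwidth": offered.bandwidth_mbps >= required.bandwidth_mbps,
        "storage": offered.storage_gb >= required.storage_gb,
        "resource pool": offered.pool_slots >= required.pool_slots,
    }
    return [name for name, ok in checks.items() if not ok]


if __name__ == "__main__":
    required = Resources(bandwidth_mbps=100, storage_gb=500, pool_slots=2)
    offered = Resources(bandwidth_mbps=1000, storage_gb=200, pool_slots=4)
    print(negotiate(required, offered))
```

Returning the list of shortfalls, rather than a plain yes/no, lets the
requesting party decide whether to retry with another destination.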
3.9.4. Standardization of migration state
Related parties should be able to understand the state of each other:
Global detection -> authentication processing -> capability
negotiation -> session establishment -> initialization instance ->
establish the beginning stage of life -> begin migration -> migration
& migration exception handling -> finish migration -> the end stage
of life -> destruction of instances -> Global detection
+------------------------+
|                       \|/
|               +------------------+
|               | Global detection |
|               +------------------+
|                        |
|                       \|/
|               +------------------+
|               |  authentication  |
|               |    processing    |
|               +------------------+
|                        |
|                       \|/
|               +------------------+
|               |    capability    |
|               |   negotiation    |
|               +------------------+
|                        |
|                       \|/
|               +------------------+
|               |     session      |
|               |  establishment   |
|               +------------------+
|                        |
|                       \|/
|               +------------------+
|               |  initialization  |establish the beginning stage of life
|               |     instance     |                  |
|               +------------------+                  |
|                       \|                            |
|        +---------------|                            |
|        |              /|                            |
|        |               |                            |
|        |              \|/                           |
|        |      +------------------+                  |
|        |      | begin migration  |                  |
|        |      +------------------+                  |
|        |               |                            |
|        |               |                            |
|        |              \|/                           |
|  +------------+                                     |
|  | exception  |/ Y  migration                       |
|  | processing |---  exception?                      |
|  +------------+\                                    |
|                        |                            |
|                        |N                           |
|                        |                            |
|                       \|/                           |
|               +------------------+                  |
|               | finish migration |                  |
|               +------------------+                  |
|                        |                            |
|                       \|/                           |
|               +------------------+                 \|/
|               |   destruction    |        the end stage of life
|               |   of instances   |
|               +------------------+
|                        |
+------------------------+
Figure 1: migration state
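The state sequence of Figure 1 can be expressed as a small transition
table. The sketch below is one possible reading of the figure, not a
normative definition: the State names and the advance() helper are
hypothetical, and the assumption that exception processing returns to
"begin migration" is taken from the loop drawn on the left side of the
figure.

```python
from enum import Enum, auto


class State(Enum):
    GLOBAL_DETECTION = auto()
    AUTHENTICATION = auto()
    CAPABILITY_NEGOTIATION = auto()
    SESSION_ESTABLISHMENT = auto()
    INSTANCE_INITIALIZATION = auto()
    BEGIN_MIGRATION = auto()
    EXCEPTION_PROCESSING = auto()
    FINISH_MIGRATION = auto()
    INSTANCE_DESTRUCTION = auto()


# Allowed transitions, as read off Figure 1 (an interpretation, see lead-in).
TRANSITIONS = {
    State.GLOBAL_DETECTION: {State.AUTHENTICATION},
    State.AUTHENTICATION: {State.CAPABILITY_NEGOTIATION},
    State.CAPABILITY_NEGOTIATION: {State.SESSION_ESTABLISHMENT},
    State.SESSION_ESTABLISHMENT: {State.INSTANCE_INITIALIZATION},
    State.INSTANCE_INITIALIZATION: {State.BEGIN_MIGRATION},
    State.BEGIN_MIGRATION: {State.EXCEPTION_PROCESSING, State.FINISH_MIGRATION},
    State.EXCEPTION_PROCESSING: {State.BEGIN_MIGRATION},
    State.FINISH_MIGRATION: {State.INSTANCE_DESTRUCTION},
    State.INSTANCE_DESTRUCTION: {State.GLOBAL_DETECTION},
}


def advance(current: State, nxt: State) -> State:
    """Move to the next state, rejecting transitions that Figure 1 does not show."""
    if nxt not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {nxt.name}")
    return nxt


if __name__ == "__main__":
    state = State.GLOBAL_DETECTION
    for nxt in (State.AUTHENTICATION, State.CAPABILITY_NEGOTIATION):
        state = advance(state, nxt)
    print(state.name)
```

Because every party validates transitions against the same table,
each side can tell which stage its peer has reached.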
3.10. The selection of migration
3.10.1. Two requirements with different network environment and
protocol
Currently, the large-scale Layer-2 interconnect technique is mainly
used for the migration of virtual machines, but there is also a need
for Layer-3 migration. These two technologies apply to different
implementation scenarios. The former is often used for frequent data
migration with high requirements on data security, such as data
migration and backup in banks; the latter is used for data migration
between data centers without a VPN network, or between different
service providers. This demand really exists. Because users require
a unified management platform, it will become more and more important
to build distributed PaaS across different cloud providers. No user
is willing to maintain too many independent platforms. At the same
time, sharing resources across data centers will become a trend. As
a result, it will become very cumbersome for data center managers to
build a large number of VPN connections for all data centers. So
there will be a portal operator, who takes care of all the internal
VPN connections between the clouds and unifies the scheduling of data
migration, or uses the partitioning scheme.
3.10.2. Requirements for a wider range of virtual machine live
migration
a) Virtual machine live migration across IPv4 / IPv6 network; b)
Solutions applying to the evolution from IPv4 to IPv6; c) Virtual
machine live migration based on Mobile IP. Virtual machines can also
be mobile. Transparent live migration of virtual machines across
IPv4 / IPv6 networks needs to be achieved, and it must be applicable
to a variety of application environments in the early, middle, and
late stages of the evolution from IPv4 to IPv6. Scenarios: This
application is suitable for the mobile applications of small users or
home users. The complexity of the network is fully shielded from the
user, as long as the source and destination have an IPv6 address. It
can be the application scenario of virtual machine migration in the
Layer-3 network environment. It is suitable for individuals who are
skeptical about the security of the cloud.
3.11. The part of traffic roundabout analysis on the VM WAN migration
3.11.1. VM migration requirements
The VM's MAC and IP addresses remain the same, and the migration takes
place within the L2 network.
3.11.2. Scene analysis
The VM migrates from the IDC in metro A to the IDC in metro B. There
is almost no traffic roundabout for users within the metro (such as
user a). For clients that access the IDC services over the WAN, such
as from metro C, the client traffic after VM migration must first
reach the VM-A gateway, and is then sent to the migrated VM through
the Layer-2 tunnel.
3.12. Robustness
3.12.1. Robustness of VM
When a VM is running, it will inevitably encounter all kinds of
problems: the CPU is overloaded, memory is insufficient, disk space
is insufficient, a program is not responding, a database write fails,
the file system fails, etc. If any of these problems is not
resolved, it will lead to the collapse of the VM. The VM system
should be able to start emergency treatment: taking a snapshot of all
data in the VM and copying it to a blank VM on another server, or to
another blank VM on the current server, so that the services provided
by the original VM can continue running without any disruption. The
snapshot can be either stateful or stateless, depending on the
status, nature, and function of the owner of each piece of data in
the VM, and on the replication strategy. For example, a stateful
snapshot is taken of the data in a database, because the database
itself has the ability to record its running state.
3.12.2. Robustness of VNE
When a VNE is running, it will also inevitably encounter all kinds of
problems: the CPU is overloaded, memory is insufficient, the MAC
table is full, the forwarding table is full, a program is not
responding, etc. If any of these problems is not resolved, it will
lead to the collapse of the VNE. The VNE system should be able to
start emergency treatment: taking a snapshot of all data in the VNE
and copying it to another blank VNE on the current server, so that
the services provided by the original VNE can continue running
without any disruption. The snapshot can be either stateful or
stateless, depending on the status, nature, and function of the owner
of each piece of data in the VNE, and on the replication strategy.
For example, a stateless snapshot is taken of the protocol state, and
a stateful snapshot is taken of protocol-independent forwarding
tables. Because a protocol runs time-dependently, its state differs
at different times, so the protocol state does not all have to be
copied.
3.13. Requirement on DCI interconnection fabric
This fabric should be open and transparent, which can be achieved by
simple extensions to some of the existing technologies. The solution
should be open, compatible, and easy to deploy.
The negative influence of ARP, MAC and IP entry explosion on the
individual network which contains a large number of tenants should be
minimized by DC and DC-interconnect technologies.
The link capacity of DC and DC-interconnect should be effectively
utilized.
Traffic should be forwarded on the shortest path between two VMs
within the DC or across DCs.
Efficient traffic forwarding requires effective utilization of the
link capacity of DC and DC-interconnect, and traffic forwarding on
the shortest path between two VMs within the DC or across DCs.
Inter-DC connectivity must be supported because of these factors:
Support for east-west traffic between customers' applications located
in different DCs
Management across the DC
Mobility of VM migration across the DCs
Many mature VPN technologies can be used to provide connectivity
between DCs, or the extension of VLAN and virtual domain between DCs.
3.14. Requirement of Cloud service Virtualization
3.14.1. Requirement of logical element
Resource Allocation Gateway
Network service providers provide virtualized basic network resources
for tenants between the tenants' sites and the tenants' data center.
In the data center, data center providers provide virtualized
computing and virtualized storage resources. The gateway's role is
to allocate virtualized resources. These resources are divided into
three categories: network resources, computing resources, and storage
resources. The gateway compares network resources with storage
resources, finds the correspondence between them, and achieves
globally reasonable resource scheduling. The DC GW's function is a
subset of the RA GW's functions.
Data Center Gateway
The gateway provides access to the data center for different users in
the outside world, including Internet and VPN connections. In the
current DC network model, the DC GW may be a router with virtual
routing capabilities, or may be a PE of an IP VPN / L2VPN. Core
nodes that perform the DC GW role also provide Internet connectivity,
inter-DC connectivity, and VPN support.
Core Router / Switch
High-end core nodes or switches with routing capabilities, located in
the core layer, that connect the aggregation layer switches.
Aggregation layer switches
The switch aggregates traffic from the ToR switches and forwards
downstream traffic. The switch can be a normal aggregation switch,
or multiple switches virtualized into a single stack switch.
Access layer ToR switch
ToRs are usually dual-homed to the parent node switch.
Virtual switch
A virtual software switch which runs on blade servers.
3.14.2. Requirement of Resource Allocation Gateway function
Data center and network providers provide virtualized computing,
storage, and network resource services. Tenants may use overlapping
addresses, and share a pool of storage and network resources.
Therefore, a virtual platform is needed, including virtual machines,
virtual services, virtual storage and virtual
networks. What tenants see is a subset of the above four. The
virtual platform is built on the framework of the physical network,
physical servers, physical switches and routers, and physical storage
devices. Through the virtual platform, users obtain global resource
scheduling across the whole system.
Based on the collected network-wide computing, storage, and network
resource information, the resource allocation gateway allocates
appropriate computing, storage, and network resources to tenants
according to defined rules.
The resource allocation gateway needs a redundant backup in order to
prevent a single point of failure. The global resource information
and scheduling information need to be backed up in real time between
the resource allocation gateway and the backup resource allocation
gateway.
It can provide automatic matching and scheduling of the virtualized
resources, which are dynamic and are adjusted according to operating
conditions. It can optimize utilization of the computing resources,
network resources (including IDC interconnection resources and IDC
internal routing & switching resources), and storage resources. It
should consider the optimization of network path routing in network
matching. Route selection is based on: the degree of matching
between the required bandwidth and the bandwidth that can be
provided, the shortest path, the service level, and the user level.
These factors need to be assigned priorities in decision-making.
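One way to apply prioritized factors in route selection is to compare
candidate paths factor by factor, in priority order. The following
sketch is illustrative only. The Path record, its fields, the numeric
scales, and the select_path() function are assumptions made for this
example; in particular, "degree of matching" is interpreted here as
the smallest bandwidth surplus among paths that can carry the
required bandwidth, which the draft does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """Hypothetical description of one candidate path."""
    name: str
    available_mbps: float
    hops: int
    service_level: int  # assumed scale: higher is better
    user_level: int     # assumed scale: higher is better


def select_path(paths, required_mbps,
                priority=("bandwidth", "hops", "service", "user")):
    """Pick the best feasible path by comparing factors in priority order."""
    scores = {
        # Smaller surplus means a closer match to the required bandwidth.
        "bandwidth": lambda p: p.available_mbps - required_mbps,
        "hops": lambda p: p.hops,
        "service": lambda p: -p.service_level,
        "user": lambda p: -p.user_level,
    }
    feasible = [p for p in paths if p.available_mbps >= required_mbps]
    if not feasible:
        return None
    # Tuples compare element by element, so earlier factors dominate.
    return min(feasible, key=lambda p: tuple(scores[f](p) for f in priority))


if __name__ == "__main__":
    candidates = [
        Path("a", available_mbps=1000, hops=2, service_level=1, user_level=1),
        Path("b", available_mbps=200, hops=3, service_level=1, user_level=1),
    ]
    print(select_path(candidates, required_mbps=100).name)
```

Changing the order of the priority tuple changes the outcome, which
is why the priorities need to be fixed before decisions are made.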
3.14.3. Requirement of specifications and performance
It should be able to support a large number of tenants sharing data
center resources, and needs to support far more than 4K VLANs. For
example, if a number of VPN applications (VPLS or IP VPN) serve more
than 10K customers, each requiring multiple VLANs, then 4K VLANs are
far from enough.
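The arithmetic behind the 4K limit can be checked directly. The
IEEE 802.1Q VLAN ID field is 12 bits wide, and IDs 0 and 4095 are
reserved, leaving 4094 usable VLANs. The helper names and the example
figure of 2 VLANs per customer below are assumptions for illustration.

```python
VLAN_ID_BITS = 12                      # width of the IEEE 802.1Q VLAN ID field
USABLE_VLANS = 2 ** VLAN_ID_BITS - 2   # IDs 0 and 4095 are reserved


def vlans_needed(customers: int, vlans_per_customer: int) -> int:
    """Total VLANs required if every customer needs its own set of VLANs."""
    return customers * vlans_per_customer


def fits_in_vlan_space(customers: int, vlans_per_customer: int) -> bool:
    """True if the demand fits into a single 802.1Q VLAN ID space."""
    return vlans_needed(customers, vlans_per_customer) <= USABLE_VLANS


if __name__ == "__main__":
    # Hypothetical demand: 10K customers with 2 VLANs each.
    print(vlans_needed(10_000, 2), USABLE_VLANS, fits_in_vlan_space(10_000, 2))
```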
It should be able to guarantee high quality of service, and ensure a
large number of network connections are not interrupted. The
connectivity should meet carrier-level requirements.
3.14.4. Requirement of fault tolerance capability
In the event of an error, the system should be able to recover
quickly from the error condition. Error recovery includes network
fault recovery, computing power recovery, VM migration recovery, and
storage recovery. Among them, network fault recovery and computing
power recovery are the foundation of VM migration recovery and
storage recovery.
Network fault recovery: Once the virtual network connectivity sends
an error, it can automatically trigger alarms and handling, and can
rapidly enable a backup virtual network for recovery;
Computing ability recovery: Once computing fails, a detection
mechanism is needed to find the problem, and services can be
scheduled to backup virtual machines and virtual services;
VM migration recovery: In the event of migration failure, the virtual
machine needs to be automatically restored to its original state, and
users must not be affected;
Storage recovery: In the event of storage failure, a backup virtual
storage resource needs to be found automatically and enabled
immediately.
The above fault tolerance mechanisms should have sufficiently short
response and recovery times, and should minimize service delays and
malfunctions.
After the VM migration, the impact on the switching network needs to
be considered, such as whether the new network environment will have
the problem of insufficient bandwidth. Although there is an initial
judgment at the consultation phase before the migration, it cannot
guarantee that no problem arises after the migration. In addition,
if the destination data center needs to activate standby server and
network resources, extending the server resources and network
resources should be considered.
In some cases, some routing policies need to be applied on the
network and servers after migration.
3.14.5. Network model
A DC has its own private network for the interconnection between
data centers, or uses another network service provider's network
resources at the same time.
DC service and WAN network services are both provided by the same
company or organization;
DC service and WAN network services are provided by two different
companies or organizations.
3.14.6. Types & Applications of VPNs interconnection between DCs which
provide cloud services
3.14.6.1. Types of VPNs
Layer3 VPN
BGP / MPLS IP Virtual Private Networks (VPNs)
RFC 4364
Layer2 VPN
PBB + L2VPN
TRILL + L2VPN
VLAN + L2VPN
NVGRE (draft-sridharan-virtualization-nvgre)
PBB VPLS
E-VPN
PBB-EVPN
VPLS
VPWS
3.14.6.2. Applications of L2VPN in DC
L2 interconnection across physical regions has become a standard way
of DC interconnection. VPN technology is used to carry L2 and L3
traffic across the IP/MPLS core network. This technology can also be
used within the same DC to provide scale expansion or interconnection
across L3 domains. VPLS has played an important role in IP / MPLS
WANs and supplied transparent LAN services. IP VPN, including BGP /
MPLS IP VPN and IPsec VPN, has been used in a common IP / MPLS core
network to provide virtual IP routing instances.
An implementation of PBB + L2VPN can take advantage of existing
technology. It is flexible for VPN networks in cloud computing and
can provide a sufficient number of VPN instances, far more than the
4K available in the VLAN mode of L2VPN; the effect is similar to that
of VXLAN. PBB not only gives access to more than 16M virtual LAN
instances, but also separates the customer and provider domains
through isolated MAC address spaces.
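The scale difference between these schemes follows from the width of
their identifier fields. The following is a minimal sketch, assuming
the standard field sizes (12-bit IEEE 802.1Q VLAN ID, 24-bit IEEE
802.1ah I-SID, 24-bit VXLAN network identifier); the constant and
function names are illustrative and do not come from this draft.

```python
# Identifier widths, in bits, for the segmentation schemes compared in
# the text. The names below are illustrative only.
VLAN_ID_BITS = 12   # IEEE 802.1Q VLAN ID
ISID_BITS = 24      # IEEE 802.1ah (PBB) service instance identifier
VNI_BITS = 24       # VXLAN network identifier


def id_space(bits: int) -> int:
    """Return the number of distinct values a field of `bits` bits holds."""
    return 2 ** bits


vlan_ids = id_space(VLAN_ID_BITS)
isid_ids = id_space(ISID_BITS)
vni_ids = id_space(VNI_BITS)

# How many times larger the PBB service space is than the VLAN space.
isid_to_vlan_ratio = isid_ids // vlan_ids

print(f"VLAN IDs: {vlan_ids}")
print(f"PBB I-SIDs: {isid_ids}")
print(f"VXLAN VNIs: {vni_ids}")
print(f"I-SID / VLAN ratio: {isid_to_vlan_ratio}")
```

The I-SID and the VXLAN identifier have the same width, which is why
the two approaches offer a similar number of virtual LAN instances.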
PBB encapsulation has a major advantage: the VMs' MAC addresses are
not processed by ToRs and core switches, so the MAC table size of
ToRs and core switches may be reduced by two orders of magnitude. The
specific reduction depends on the number of virtual machines on each
server and the number of virtual interfaces per VM.
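As a rough illustration of this reduction, the sketch below compares
the number of MAC addresses a ToR or core switch would have to learn
with and without PBB. It assumes that PBB encapsulation is applied at
the server edge, so that the switches learn one backbone MAC address
per server; that assumption, the function name, and the server and VM
counts are hypothetical and are not taken from this draft.

```python
def core_mac_entries(servers: int, vms_per_server: int,
                     vnics_per_vm: int, use_pbb: bool) -> int:
    """Estimate the MAC addresses a ToR/core switch must learn."""
    if use_pbb:
        # Assumption: one backbone MAC address per server-side PBB edge,
        # so customer (VM) MAC addresses stay hidden from the core.
        return servers
    # Without PBB, every virtual interface of every VM is visible.
    return servers * vms_per_server * vnics_per_vm


# Hypothetical data center: 1000 servers, 40 VMs each, 3 vNICs per VM.
without_pbb = core_mac_entries(1000, 40, 3, use_pbb=False)
with_pbb = core_mac_entries(1000, 40, 3, use_pbb=True)
reduction_factor = without_pbb // with_pbb

print(without_pbb, with_pbb, reduction_factor)
```

With these example numbers the reduction factor equals the product of
VMs per server and virtual interfaces per VM, which is the dependency
the text describes.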
One way to solve the problems in the DC is to deploy other
technologies in the existing DC network. A service provider can
separate its VLAN domains into different VLAN islands, so that each
island can support up to 4K VLANs. The VLAN domains can be
interconnected via VPLS, with the DC GW acting as a VPLS PE.
If the existing VLAN-based solutions are retained only in the VSw
while the number of tenants in some VLAN islands exceeds 4K, the
service provider needs to deploy VPLS deeper in the DC network, that
is, to provide L2VPN support starting from the ToRs, and to use the
existing VPLS solutions to enable MPLS on the ToR and DC core
elements.
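The capacity check implied by the VLAN island approach can be
sketched as follows. The sketch assumes one VLAN per tenant and 4094
usable VLAN IDs per island (the 12-bit space minus the reserved values
0 and 4095); the island names and tenant counts are hypothetical.

```python
from typing import Dict, List

# 12-bit VLAN ID space minus the reserved values 0 and 4095.
USABLE_VLANS_PER_ISLAND = 4094


def islands_over_capacity(tenants_per_island: Dict[str, int]) -> List[str]:
    """Return the islands whose tenant count exceeds the VLAN space.

    Assumption: each tenant consumes exactly one VLAN in its island.
    """
    return sorted(
        name
        for name, tenants in tenants_per_island.items()
        if tenants > USABLE_VLANS_PER_ISLAND
    )


# Hypothetical tenant counts per VLAN island.
islands = {"island-a": 3000, "island-b": 5200, "island-c": 4094}
needs_deeper_vpls = islands_over_capacity(islands)
print(needs_deeper_vpls)
```

Islands returned by the check are the ones where, following the text,
L2VPN support would have to start from the ToRs.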
3.14.6.3. Applications of L3VPN in DC
IP VPN technology can also be used for data center virtualization. If
each tenant requiring L3 virtualization is assigned a different IP
VPN instance, the DC network can provide multi-tenant L3
virtualization support.
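The effect of giving each tenant its own IP VPN instance can be
sketched with per-tenant routing tables. This is a simplified model,
not a BGP/MPLS implementation: the class name, tenant names, and
next-hop labels are hypothetical. It shows that two tenants can use
the same prefix without interfering with each other.

```python
import ipaddress
from typing import Dict, List, Optional, Tuple

Route = Tuple[ipaddress.IPv4Network, str]


class TenantVrfs:
    """One independent routing table (VRF) per tenant."""

    def __init__(self) -> None:
        self._vrfs: Dict[str, List[Route]] = {}

    def add_route(self, tenant: str, prefix: str, next_hop: str) -> None:
        network = ipaddress.ip_network(prefix)
        self._vrfs.setdefault(tenant, []).append((network, next_hop))

    def lookup(self, tenant: str, address: str) -> Optional[str]:
        """Longest-prefix match restricted to the tenant's own VRF."""
        addr = ipaddress.ip_address(address)
        matches = [r for r in self._vrfs.get(tenant, []) if addr in r[0]]
        if not matches:
            return None
        return max(matches, key=lambda r: r[0].prefixlen)[1]


vrfs = TenantVrfs()
# Both tenants use the same private prefix; the VRFs keep them apart.
vrfs.add_route("tenant-a", "10.0.0.0/24", "pe1")
vrfs.add_route("tenant-a", "10.0.0.0/16", "pe3")
vrfs.add_route("tenant-b", "10.0.0.0/24", "pe2")

print(vrfs.lookup("tenant-a", "10.0.0.5"))
print(vrfs.lookup("tenant-b", "10.0.0.5"))
```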
Using IP VPN as an L3 virtualization solution within the DC has many
advantages over existing virtual routing DC technology, for example:
It supports many VRF-to-VRF tunneling options covering different
operational models: BGP/MPLS IP VPN, IP or L3 VPN over GRE, etc.
The IP VPN instances used for cloud services below the WAN can
connect directly to the IP VPNs in the WAN, and so on.
3.14.7. VN Requirement
A VN is composed of the virtual IDC network and the virtual DC
internal switching network. These virtual networks are built on top
of the physical networks. VM migration is not affected by the
physical network: as long as it stays within the scope of the VN, a
VM is free to migrate, provided the necessary conditions are met. In
addition, the network architecture and the forwarding and switching
capacity should match between the source network and the destination
network, without concern for the physical network. The physical
characteristics of the network, such as VLAN, IP subnet, L2 protocol,
QoS, and so on, are abstracted into the logical concept of a VN.
Because the VN is the operating environment of the VM, just as a VM
has associated logical concepts such as CPU, I/O, memory, and disk, a
VN also has a corresponding set of logical concepts.
VNs are isolated from each other. The VMs inside a VN communicate
using their own internal addresses and send and receive Ethernet
frames. A VN is not tied to a specific implementation; the
implementation can be the Internet, L2VPN, L3VPN, GRE, etc. At the VN
layer, IP can be used to make that distinction. A firewall is needed
at the entry to the VN, and ACLs and other security policies are
needed at the access layer.
3.14.8. Requirement of Mobility
Support for mobility needs to be considered; for example, a VM should
be able to migrate easily and repeatedly among several DCs (more than
two). A free migration scheme for IPv4 and IPv6 VPN environments also
needs to be considered. These requirements should be addressed in the
service processing of the DC GW.
VMs in the resource pool should have good mobility. They can be moved
between different servers within the same DC or from one DC to a
remote DC. This has many advantages. For example, in the event of a
natural disaster, a VM can be migrated to a DC in a safe location in
order to reduce downtime. A VM can also be migrated to sites where
rent or electricity is cheap in order to reduce costs. When a VM is
migrated to a new location, it should maintain its existing client
sessions. The VM's MAC and IP addresses should be preserved, and the
state of the VM's sessions should be copied to the new location.
Some widely used virtual machine migration tools, such as VMware's
VMotion, require that the management programs on the source server
and the destination server be directly L2 connected. The purpose is
to simplify the implementation of VM migration.
First, a VMotion ELAN may need to provide protection and load
balancing across multiple DC networks. Second, in the current VMotion
procedure, the new location of the VM must be part of the tenant ELAN
domain. When the VM is activated at the new location, a Gratuitous
ARP is sent, and the MAC FIB entries in the tenant ELAN are updated
to direct the traffic for that VM to the new location. Third, if the
path requires IP forwarding, the reachability information for the VM
must be updated so that traffic follows the shortest path to the VM.
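The MAC FIB update triggered by the Gratuitous ARP can be sketched as
a simple table rewrite. This is an illustrative model of the behavior
described above, not VMotion's or any vendor's implementation; the
class name, MAC address, and location labels are hypothetical.

```python
from typing import Dict, Optional


class TenantElanFib:
    """Toy MAC FIB for one tenant ELAN: MAC address -> attachment point."""

    def __init__(self) -> None:
        self._fib: Dict[str, str] = {}

    def learn(self, mac: str, location: str) -> None:
        self._fib[mac] = location

    def on_gratuitous_arp(self, mac: str, new_location: str) -> Optional[str]:
        """Re-point the entry for `mac`; return the previous location."""
        previous = self._fib.get(mac)
        self._fib[mac] = new_location
        return previous

    def forward(self, dst_mac: str) -> str:
        # Unknown unicast is flooded within the ELAN.
        return self._fib.get(dst_mac, "flood")


fib = TenantElanFib()
vm_mac = "00:50:56:aa:bb:cc"
fib.learn(vm_mac, "dc1-pe")            # VM initially lives in DC 1
before = fib.forward(vm_mac)
old = fib.on_gratuitous_arp(vm_mac, "dc2-pe")  # VM activated in DC 2
after = fib.forward(vm_mac)
print(before, old, after)
```

The VM keeps its MAC address across the move; only the location that
the address maps to changes.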
3.14.9. Mobility Requirement
3.14.9.1. Summary of Mobility
Mobility means moving a VM from one server to another server within
the same DC or in a different DC while keeping the VM's original IP
and MAC addresses unchanged. VM mobility does not change the
VLAN/subnet to which the VM is connected, and requires the VLAN to be
extended to the new location of the VM.
In summary, the seamless mobility solution in the DC is based on IP
routing, BGP/MPLS MAC-VPN, BGP/MPLS IP VPNs, and NHRP [VM-Mobility].
3.14.9.2. Problem Statement
This section discusses the problems faced in seamless VM mobility.
The first problem is that the source server and the destination
server of a VM migration may be located in different data centers,
which requires extending the layer-2 network. Islands of the same
VLAN are then formed in different data centers.
The second problem is that optimal forwarding within a VLAN for VM
mobility may involve multiple data centers.
The third problem is determining the optimal intra-VLAN forwarding
mode.
The fourth problem is determining the optimal routing mode.
3.15. MAC, IP, ARP Explosion
Each blade server supports 16 to 40 VMs, or even more, and each VM
has its own MAC and IP addresses. With their conventional
architecture and communication environment, network devices in the
data center will encounter many problems in accommodating such a
large number of IP addresses, MAC addresses, and ARP entries. Disk,
memory, FDB table, and MAC table usage, as well as convergence time,
will increase. Accommodating this mass of servers may also affect the
choice of network topology, for example a fat-tree topology versus a
conventional network topology.
The number of ARP packets grows not only with the number of virtual
L2 domains or ELANs instantiated on a server but also with the number
of VMs in each domain. The resulting problems include overload of ARP
entries on the server/hypervisor, exhaustion of ARP entries on the
routers/PEs, and processing overload on any other L3 service
appliances.
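The growth described above can be illustrated with a rough upper
bound. The sketch assumes that every VM in a domain may appear in the
ARP cache and that every ARP request is broadcast to all servers
hosting the domain; the function name and all input values are
hypothetical, and real traffic depends on cache timeouts and
communication patterns.

```python
from typing import Tuple


def arp_load(l2_domains_on_server: int, vms_per_domain: int,
             arp_requests_per_vm_per_min: int) -> Tuple[int, int]:
    """Return (worst-case ARP entries, broadcast ARP requests per minute).

    Both values scale with the product of the number of L2 domains on
    the server and the number of VMs in each domain.
    """
    vms_visible = l2_domains_on_server * vms_per_domain
    entries = vms_visible
    requests_per_min = vms_visible * arp_requests_per_vm_per_min
    return entries, requests_per_min


# Hypothetical server: 20 L2 domains, 500 VMs per domain,
# 6 ARP requests per VM per minute.
entries, requests_per_min = arp_load(20, 500, 6)
print(entries, requests_per_min)
```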
These problems will lead to a flooding explosion throughout the
layer-2 switching network.
3.16. Suppressing the flooding within VLAN
From the business perspective, DC operators should try to reduce the
flooding of broadcast, multicast, and unknown unicast frames within a
VLAN caused by improper configuration.
3.17. Convergence and multipath support
STP is used to prevent the broadcast storms caused by loops, but it
also brings inefficient use of resources and network oscillation.
Possible solutions include switch virtualization, or TRILL and SPB,
which are still imperfect.
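The inefficiency of STP can be quantified: a spanning tree over N
connected switches keeps exactly N - 1 links forwarding, so every
additional link is blocked. The following minimal sketch applies this
to a hypothetical topology; the function name and the switch and link
counts are illustrative assumptions.

```python
from typing import Tuple


def stp_link_usage(switches: int, links: int) -> Tuple[int, int]:
    """Return (forwarding links, blocked links) for one spanning tree."""
    if switches < 1 or links < switches - 1:
        raise ValueError("too few links for a connected topology")
    forwarding = switches - 1      # a tree over N nodes has N - 1 edges
    blocked = links - forwarding   # all redundant links are blocked
    return forwarding, blocked


# Hypothetical topology: 2 core switches joined by 1 link, and 4 ToRs
# each dual-homed to both cores (8 links), for 6 switches and 9 links.
forwarding, blocked = stp_link_usage(switches=6, links=9)
print(forwarding, blocked)
```

In this example 4 of the 9 links carry no traffic, which is the
resource waste that multipath-capable alternatives aim to avoid.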
3.18. Multicast processing
STP bridges often perform IGMP and/or PIM snooping to optimize
multicast data delivery. However, this snooping is performed on the
local STP topology, and for each bridge all traffic goes through the
root bridge. This may lead to sub-optimal multicast traffic delivery.
In addition, each customer multicast group is associated with a
forwarding tree throughout the Ethernet switching network. The
solution must provide efficient Layer 2 multicast.
3.19. Other Requirements
Packet encapsulation
Something similar to an overlay address is needed to implement a VN.
The overlay address can be realized by the VXLAN identifier [VXLAN]
or by the I-SID of PBB + L2VPN. The overlay address serves as an
identifier that corresponds to a VN instance. The implementation
model requires an edge switch or router acting as the DC GW to
encapsulate and decapsulate tunnel packets. The VNs within the DC are
distinguished and separated from each other by their overlay
addresses. Each VN can also contain 4K VLANs for its internal use.
Data entering the DC interconnection pass through the DC GW, where
they are encapsulated for transmission.
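The encapsulation and decapsulation step at the DC GW can be sketched
for the VXLAN case. The sketch assumes the 8-byte VXLAN header layout
(8 flag bits with the VNI-valid bit 0x08, 24 reserved bits, a 24-bit
network identifier, 8 reserved bits) and omits the outer Ethernet, IP,
and UDP headers. The function names are illustrative, and the PBB
I-SID alternative is not shown.

```python
import struct
from typing import Tuple

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid
VXLAN_HEADER_LEN = 8


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header carrying a 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vxlan_header(header: bytes) -> int:
    """Return the VNI carried in a VXLAN header."""
    if len(header) < VXLAN_HEADER_LEN:
        raise ValueError("truncated VXLAN header")
    word0, word1 = struct.unpack("!II", header[:VXLAN_HEADER_LEN])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag not set")
    return word1 >> 8


def encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prefix the tenant's Ethernet frame with its overlay identifier."""
    return pack_vxlan_header(vni) + inner_frame


def decapsulate(packet: bytes) -> Tuple[int, bytes]:
    """Recover the overlay identifier and the original tenant frame."""
    return unpack_vxlan_header(packet), packet[VXLAN_HEADER_LEN:]


packet = encapsulate(5000, b"tenant-frame")
vni, frame = decapsulate(packet)
print(vni, frame)
```

The identifier recovered on decapsulation tells the DC GW which VN
the inner frame belongs to, independently of any VLAN tag inside it.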
Routing control
The processing mechanisms for the different types of packets
(unicast, multicast, and broadcast), the processing of ARP packets,
and the load-sharing mechanisms need to be further clarified.
Security and authentication
During the migration process, security issues need to be considered,
such as how to resolve the traffic roundabout issue and how to ensure
that firewall functionality is not weakened during the migration.
Connectivity
Consider whether to join a VRF in real time per host or per VM, and
which elements are needed to constitute this connectivity, such as
bandwidth, quality of service, and so on.
Computing resources
Consider what format should be used to represent the statistical unit
model of computing resources in the VM and VN environment.
Storage resources
Consider what format should be used to represent the statistical unit
model of storage resources in the VM and VN environment.
4. References
[PBB-VPLS] Balus, F., et al., "Extensions to VPLS PE model for
Provider Backbone Bridging",
draft-ietf-l2vpn-pbb-vpls-pe-model-04.txt (work in progress),
October 2011.
[VM-Mobility] Aggarwal, R., et al., "Data Center Mobility based on
BGP/MPLS, IP Routing and NHRP",
draft-raggarwa-data-center-mobility-01.txt (work in progress),
September 2011.
[VPN Applicability] Bitar, N., "Cloud Networking: Framework and VPN
Applicability", draft-bitar-datacenter-vpn-applicability-01.txt
(work in progress), October 2011.
[VXLAN] Mahalingam, M., "VXLAN: A Framework for Overlaying
Virtualized Layer 2 Networks over Layer 3 Networks",
draft-mahalingam-dutt-dcops-vxlan-00.txt (work in progress),
August 26, 2011.
[NVGRE] Sridharan, M., "NVGRE: Network Virtualization using Generic
Routing Encapsulation",
draft-sridharan-virtualization-nvgre-00.txt (work in progress),
September 2011.
[NVO3] Narten, T., "NVO3: Network Virtualization", l2vpn-9.pdf,
November 2011.
5. Security Considerations
--[Note] Will be added in future.
6. IANA Considerations
The extensions in this draft are based on the DC environment.
7. Normative References
[RFC 4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
Networks (VPNs)", RFC 4364, February 2006.
Authors' Addresses
Bhumip Khasnabish
ZTE USA, Inc.
55 Madison Avenue, Suite 160 Morristown, NJ 07960
USA
Phone: +001-781-752-8003
Email: vumip1@gmail.com
Bin Liu
ZTE Corporation
15F, ZTE Plaza, No.19 East Huayuan Road,Haidian District
Beijing 100191
P.R.China
Email: liu.bin21@zte.com.cn