DECADE Architecture and Protocol
Author(s): Haibin Song, Dirk Kutscher, Richard Alimi, Kostas Pentikousis, Yang Yang, Akbar Rahman
Content Distribution Applications (e.g., P2P applications) are widely used on the Internet and make up a large portion of the traffic in many networks. One technique to improve the network efficiency of these applications is to introduce storage...
APPSAWG R. Alimi Internet-Draft Google Intended status: Informational A. Rahman Expires: August 18, 2013 InterDigital Communications, LLC D. Kutscher NEC Y. Yang Yale University H. Song K. Pentikousis Huawei February 14, 2013 DECADE Architecture and Protocol draft-alimi-protocol-00 Abstract Content Distribution Applications (e.g., P2P applications) are widely used on the Internet and make up a large portion of the traffic in many networks. One technique to improve the network efficiency of these applications is to introduce storage capabilities within the networks; this is the capability provided by a DECADE (DECoupled Application Data Enroute) compatible system. This document presents an architecture, discusses the underlying principles, and identifies key functionalities in the architecture for introducing a DECADE in- network storage system. In addition, some examples are given to illustrate these concepts. Requirements Language The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC 2119]. Status of this Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any Alimi, et al. Expires August 18, 2013 [Page 1] Internet-Draft DECADE Architecture February 2013 time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on August 18, 2013. Copyright Notice Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Alimi, et al. Expires August 18, 2013 [Page 2] Internet-Draft DECADE Architecture February 2013 Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 3. Protocol Flow . . . . . . . . . . . . . . . . . . . . . . . . 4 3.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . . 4 3.2. An Example . . . . . . . . . . . . . . . . . . . . . . . . 5 4. Architectural Principles . . . . . . . . . . . . . . . . . . . 6 4.1. Decoupled Control/Metadata and Data Planes . . . . . . . . 6 4.2. Immutable Data Objects . . . . . . . . . . . . . . . . . . 7 4.3. Data Objects With Identifiers . . . . . . . . . . . . . . 8 4.4. Explicit Control . . . . . . . . . . . . . . . . . . . . . 9 4.5. Resource and Data Access Control through Delegation . . . 10 5. System Components . . . . . . . . . . . . . . . . . . . . . . 11 5.1. Content Distribution Application . . . . . . . . . . . . . 11 5.2. DECADE Server . . . . . . . . . . . . . . . . . . . . . . 13 5.3. Data Sequencing and Naming . . . . . . . . . . . . . . . . 15 5.4. Token-based Authorization and Resource Control . . . . . . 16 5.5. Discovery . . . . . . . . . . . . . . . . . . . . . . . . 17 6. DECADE Protocols . . . . . . . . . . . . . . . . . . . . . . . 18 6.1. DECADE Naming . . . . . . . . . . . . . . . . . . . . . . 18 6.2. DECADE Resource Protocol (DRP) . . . . . . . . . . . . . . 19 6.3. Standard Data Transfer (SDT) Protocol . . . . . . . . . . 23 6.4. Server-to-Server Protocols . . . . . . . . . . . . . . . . 24 7. Security Considerations . . . . . . . . . . . . . . . . . . . 25 7.1. Threat: System Denial of Service Attacks . . . . . . . . . 25 7.2. Threat: Protocol Security . . . . . . . . . . . . . . . . 26 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 27 9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 27 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 28 10.1. Normative References . . . . . . . . . . . . . . . . . . . 28 10.2. Informative References . . . . . . . . . . . . . . . . . . 28 Appendix A. In-Network Storage Components Mapped to DECADE Architecture . . . . . . . . . . . . . . . . . . . . 29 A.1. Data Access Interface . . . . . . . . . . . . . . . . . . 29 A.2. Data Management Operations . . . . . . . . . . . . . . . . 29 A.3. Data Search Capability . . . . . . . . . . . . . . . . . . 29 A.4. Access Control Authorization . . . . . . . . . . . . . . . 29 A.5. Resource Control Interface . . . . . . . . . . . . . . . . 30 A.6. Discovery Mechanism . . . . . . . . . . . . . . . . . . . 30 A.7. Storage Mode . . . . . . . . . . . . . . . . . . . . . . . 30 Appendix B. Hisotry . . . . . . . . . . . . . . . . . . . . . . . 30 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 30 Alimi, et al. Expires August 18, 2013 [Page 3] Internet-Draft DECADE Architecture February 2013 1. Introduction Content Distribution Applications, such as Peer-to-Peer (P2P) applications, are widely used on the Internet to distribute data, and they contribute a large portion of the traffic in many networks. The architecture described in this document enables such applications to leverage in-network storage to achieve more efficient content distribution (i.e. DECADE system). Specifically, in many subscriber networks, it can be expensive to upgrade network equipment in the "last-mile", because it can involve replacing equipment and upgrading wiring at individual homes, businesses, and devices such as DSLAMs (Digital Subscriber Line Access Multiplexers) and CMTSs (Cable Modem Termination Systems) in remote locations. Therefore, it may be cheaper to upgrade the core infrastructure, which involves fewer components that are shared by many subscribers. See [RFC6646] for a more complete discussion of the problem domain and general discussions of the capabilities to be provided by a DECADE system. This document presents an architecture for providing in-network storage that can be integrated into Content Distribution Applications. The primary focus is P2P-based content distribution, but the architecture may be useful to other applications with similar characteristics and requirements. See [I-D.ietf-decade-reqs] for a definition of the target applications as well as the requirements for a DECADE system. The approach of this document is to define the core functionalities and protocol functions that are needed to support a DECADE system. The specific protocols are not selected or designed in this document. Some illustrative examples are given to help the reader understand certain concepts. These examples are purely informational and are not meant to constrain future protocol design or selection. 2. Terminology This document assumes readers are familiar with the terms and concepts that are used in [RFC6646] and [I-D.ietf-decade-reqs]. 3. Protocol Flow 3.1. Overview Following[I-D.ietf-decade-reqs], the architecture consists of two protocols: the DECADE Resource Protocol (DRP) that is responsible for communication of access control and resource scheduling policies from a client to a server, as well as between servers; and Standard Data Alimi, et al. Expires August 18, 2013 [Page 4] Internet-Draft DECADE Architecture February 2013 Transfer (SDT) protocol(s) that will be used to transfer data objects to and from a server. We show the protocol components figure below: Native Application .-------------. Protocol(s) .-------------. | Application | <------------------> | Application | | End-Point | | End-Point | | | | | | .--------. | | .--------. | | | DECADE | | | | DECADE | | | | Client | | | | Client | | | `--------' | | `--------' | `-------------' `-------------' | ^ | ^ DECADE | | Standard | | Resource | | Data DRP | | SDT Protocol | | Transfer | | (DRP) | | (SDT) | | | | | | | | | | | | | | | | | | | | | | | | | | v V v V .=============. DRP .=============. | DECADE | <------------------> | DECADE | | Server | <------------------> | Server | `=============' SDT `=============' Figure 1: Generic Protocol Flow 3.2. An Example This section provides an example showing the steps in the architecture for a data transfer scenario involving an in-network storage system. We assume that Application End-Point B (the receiver) is requesting a data object from Application End-Point A (the sender). Let S(A) denote the DECADE storage server to which A has access. There are multiple usage scenarios (by choice of the Content Distribution Application). For simplicity of introduction, we design this example to use only a single DECADE server. The steps of the example are illustrated in Figure 2. First, B requests a data object from A using their native application protocol (see Section 5.1.2). Next, A uses the DRP to obtain a token. There are multiple ways for A to obtain the token: compute locally, or request from its DECADE storage server, S(A). See Section 6.2.2 for Alimi, et al. Expires August 18, 2013 [Page 5] Internet-Draft DECADE Architecture February 2013 details. A then provides the token to B (again, using their native application protocol). Finally, B provides the token to S(A) via DRP, and requests and downloads the data object via a SDT. .----------. 2. Obtain --------> | S(A) | <------ Token / `----------' \ 4. Request and (DRP) / \ Download Data Locally / \ Object or From / \ (DRP + SDT) S(A) v 1. App Request v .-------------. <--------------------------- .-------------. | Application | | Application | | End-Point A | | End-Point B | `-------------' ---------------------------> `-------------' 3. App Response (token) Figure 2: Download from Storage Server 4. Architectural Principles We identify the following key principles that will be followed in any DECADE system: 4.1. Decoupled Control/Metadata and Data Planes A DECADE system SHOULD be able to support multiple Content Distribution Applications. A complete Content Distribution Application implements a set of "control plane" functions including content search, indexing and collection, access control, replication, request routing, and QoS scheduling. Different Content Distribution Applications will have unique considerations designing the control plane functions: o Metadata Management Scheme: Traditional file systems provide a standard metadata abstraction: a recursive structure of directories to offer namespace management; each file is an opaque byte stream. Content Distribution Applications may use different metadata management schemes. For example, one application might use a sequence of blocks (e.g., for file sharing), while another application might use a sequence of frames (with different sizes) indexed by time. o Resource Scheduling Algorithms: A major advantage of many successful P2P systems is their substantial expertise in achieving highly efficient utilization of peer and infrastructural Alimi, et al. Expires August 18, 2013 [Page 6] Internet-Draft DECADE Architecture February 2013 resources. For instance, many streaming P2P systems have their specific algorithms in constructing topologies to achieve low- latency, high-bandwidth streaming. They continuously fine-tune such algorithms. Given the diversity of control plane functions, a DECADE system SHOULD allow as much flexibility as possible to the control plane to implement specific policies. This conforms to the end-to-end systems principle and allows innovation and satisfaction of specific performance goals. Decoupling control plane and data plane is not new. For example, OpenFlow [OpenFlow] is an implementation of this principle for Internet routing, where the computation of the forwarding table and the application of the forwarding table are separated. Google File System [GoogleFileSystem] applies the principle to file system design, by utilizing the Master to handle the meta-data management, and the Chunk servers to handle the data plane functions (i.e., read and write of chunks of data). NFSv4.1's pNFS extension [RFC5661] also implements this principle. 4.2. Immutable Data Objects A property of bulk content to be broadly distributed is that they typically are immutable -- once content is generated, it is typically not modified. It is not common that bulk content such as video frames and images need to be modified after distribution. Focusing on immutable data in the data plane can substantially simplify the data plane design, since consistency requirements can be relaxed. It also simplifies reuse of data and implementation of de- duplication. Depending on its specific requirements, an application may store immutable data objects in DECADE servers such that each data object is completely self-contained (e.g., a complete, independently decodable video segment). An application may also divide data into data objects that require application level assembly. Many Content Distribution Applications divide bulk content into data objects for multiple reasons, including (1) fetching different data objects from different sources in parallel; and (2) faster recovery and verification: individual data objects might be recovered and verified. Typically, applications use a data object size larger than a single packet in order to reduce control overhead. A DECADE system SHOULD be agnostic to the nature of the data objects and SHOULD NOT specify a fixed size for them. A protocol specification based on this architecture MAY prescribe requirements Alimi, et al. Expires August 18, 2013 [Page 7] Internet-Draft DECADE Architecture February 2013 on minimum and maximum sizes by compliant implementations. Immutable data objects can still be deleted. Applications may support modification of existing data stored at a DECADE server through a combination of storing new data objects and deleting existing data objects. For example, a meta-data management function of the control plane might associate a name with a sequence of immutable data objects. If one of the data objects is modified, the meta-data management function changes the mapping of the name to a new sequence of immutable data objects. Throughout this document, all data objects are assumed to be immutable. 4.3. Data Objects With Identifiers An object that is stored in a DECADE storage server SHALL be accessed by Content Consumers via a data object identifier. A Content Consumer may be able to access more than one storage server. A data object that is replicated across different storage servers managed by a DECADE Storage Provider MAY still be accessed by a single identifier. Since data objects are immutable, it SHALL be possible to support persistent identifiers for data objects. Data object identifiers for data objects SHOULD be created by Content Providers that upload the objects to servers. We refer to a scheme for the assignment/derivation of the data object identifier to a data object depends as the data object naming scheme. The details of data naming schemes will be provided by future DECADE protocol/naming specifications. This document describes naming schemes on a semantic level and specific SDTs and DRPs SHOULD use specific representations. In particular, for some applications it is important that clients and servers SHOULD be able to validate the name-object binding for a data object, i.e., by verifying that a received object really corresponds to the name (identifier) that was used for requesting it (or that was provided by a sender). Data object identifiers can support name- object binding validation by providing message digests or so-called self-certifying naming information -- if a specific application has this requirement. A DECADE naming scheme follows the following general requirements: o Different name-object binding validation mechanisms MAY be supported; Alimi, et al. Expires August 18, 2013 [Page 8] Internet-Draft DECADE Architecture February 2013 o Content Distribution Applications will decide what mechanism to use, or to not provide name-object validation (e.g., if authenticity and integrity can by ascertained by alternative means); o Applications MAY be able to construct unique names (with high probability) without requiring a registry or other forms of coordination; and o Names MAY be self-describing so that a receiving entity (Content Consumer) knows what hash function (for example) to use for validating name-object binding. Some Content Distribution Applications will derive the name of a data object from the hash over the data object, which is made possible by the fact that DECADE objects are immutable. But there may be other applications such as live streaming where object names will not based on hashes but rather on an enumeration scheme. The naming scheme will also enable those applications to construct unique names. In order to enable the uniqueness, flexibility and self-describing properties, the naming scheme SHOULD provide the following name elements: o A "type" field that indicates the name-object validation function type (for example, "sha-256"); o Cryptographic data (such as an object hash) that corresponds to the type information; and The naming scheme MAY additionally provide the following name elements: o Application or publisher information. The specific format of the name (e.g., encoding, hash algorithms, etc) is out of scope of this document, and is left for protocol specification. 4.4. Explicit Control To support the functions of an application's control plane, applications SHOULD be able to know and coordinate which data is stored at particular servers. Thus, in contrast with traditional caches, applications are given explicit control over the placement (selection of a DECADE server), deletion (or expiration policy), and access control for stored data. Alimi, et al. Expires August 18, 2013 [Page 9] Internet-Draft DECADE Architecture February 2013 Consider deletion/expiration policy as a simple example. An application might require that a server stores data objects for a relatively short period of time (e.g., for live-streaming data). Another application might need to store data objects for a longer duration (e.g., for video-on-demand). 4.5. Resource and Data Access Control through Delegation A DECADE system will provide a shared infrastructure to be used by multiple Content Consumers and Content Providers spanning multiple Content Distribution Applications. Thus, it needs to provide both resource and data access control. 4.5.1. Resource Allocation There are two primary interacting entities in a DECADE system. First, Storage Providers SHOULD coordinate where storage servers are provisioned and their total available resources Section 6.2.1. Second, Applications will coordinate data transfers amongst available servers and between servers and clients. A form of isolation is required to enable concurrently-running Applications to each explicitly manage its own data objects and share of resources at the available servers. The Storage Provider SHOULD delegate the management of the resources on a server to Content Providers. This means that Content Providers are able to explicitly and independently manage their own shares of resources on a server. 4.5.2. User Delegations Storage Providers will have the ability to explicitly manage the entities allowed to utilize the resources at a DECADE server. This capability is needed for reasons such as capacity-planning and legal considerations in certain deployment scenarios. The server SHOULD grant a share of the resources to a Content Provider or Content Consumer. The client can in turn share the granted resources amongst its multiple applications. The share of resources granted by a server is called a User Delegation. As a simple example, a DECADE server operated by an ISP might be configured to grant each ISP Subscriber 1.5 Mbit/s of bandwidth. The ISP Subscriber might in turn divide this share of resources amongst a video streaming application and file-sharing application which are running concurrently. Alimi, et al. Expires August 18, 2013 [Page 10] Internet-Draft DECADE Architecture February 2013 5. System Components The primary focus of this document is the architectural principles and the system components that implement them. While certain system components might differ amongst implementations, the document details the major components and their overall roles in the architecture. To keep the scope narrow, we only discuss the primary components related to protocol development. Particular deployments will require additional components (e.g., monitoring and accounting at a server), but they are intentionally omitted from this document. 5.1. Content Distribution Application Content Distribution Applications have many functional components. For example, many P2P applications have components and algorithms to manage overlay topology management, rate allocation, piece selection, etc. In this document, we focus on the components directly employed to support a DECADE system. Figure 3 illustrates the components discussed in this section from the perspective of a single Application End-Point. Alimi, et al. Expires August 18, 2013 [Page 11] Internet-Draft DECADE Architecture February 2013 Native Protocol(s) (with other Application End-Points) .---------------------> | | .----------------------------------------------------------. | Application End-Point | | .------------. .-------------------. | | | App-Layer | ... | App Data Assembly | | | | Algorithms | | Sequencing | | | `------------' `-------------------' | | | | .------------------------------------------------------. | | | DECADE Client | | | | | | | | .-------------------------. .----------------------. | | | | | Resource Controller | | Data Controller | | | | | | .--------. .----------. | | .--------. .-------. | | | | | | | Data | | Resource | | | | Data | | Data | | | | | | | | Access | | Sharing | | | | Sched. | | Index | | | | | | | | Policy | | Policy | | | | | | | | | | | | | '--------' `----------' | | `--------' `-------' | | | | | `-------------------------' `----------------------' | | | | | ^ | | | `------------ | ----------------- | -------------------' | `-------------- | ----------------- | ---------------------' | | | DECADE | Standard | Resource | Data | Protocol | Transfer | (DRP) | (SDT) v V Figure 3: Application Components 5.1.1. Data Assembly A DECADE system is geared towards supporting applications that can distribute content using data objects. To accomplish this, applications can include a component responsible for creating the individual data objects before distribution and then re-assembling data objects at the Content Consumer. We call this component the Application Data Assembly. In producing and assembling the data objects, two important considerations are sequencing and naming. A DECADE system assumes that applications implement this functionality themselves. See Section 6.1 for further discussion. Alimi, et al. Expires August 18, 2013 [Page 12] Internet-Draft DECADE Architecture February 2013 5.1.2. Native Application Protocols In addition to the DECADE DRP/SDT, applications can also support existing native application protocols (e.g., P2P control and data transfer protocols). 5.1.3. DECADE Client The client provides the local support to an application, and can be implemented standalone, embedded into the application, or integrated in other entities such as network devices themselves. 188.8.131.52. Resource Controller Applications may have different Resource Sharing Policies and Data Access Policies to control their resource and data in DECADE servers. These policies may be existing policies of applications or custom policies. The specific implementation is decided by the application. 184.108.40.206. Data Controller A DECADE system decouples the control and the data transfer of applications. A Data Scheduling component schedules data transfers according to network conditions, available servers, and/or available server resources. The Data Index indicates data available at remote servers. The Data Index (or a subset of it) can be advertised to other clients. A common use case for this is to provide the ability to locate data amongst distributed Application End-Points (i.e., a data search mechanism such as a Distributed Hash Table). 5.2. DECADE Server Figure 4 illustrates the components discussed in a DECADE server. A server is not necessarily a single physical machine, it can also be implemented as a cluster of machines. Alimi, et al. Expires August 18, 2013 [Page 13] Internet-Draft DECADE Architecture February 2013 | | | DECADE | Standard | Resource | Data | Protocol | Transfer | (DRP) | (SDT) | | .= | ================= | ======================. | | v | | | .----------------. | | |----> | Access Control | <--------. | | | `----------------' | | | | ^ | | | | | | | | | v | | | | .---------------------. | | | `-> | Resource Scheduling | <------| | | `---------------------' | | | ^ | | | | | | | v .------------. | | .-----------------. | User | | | | Data Store | | Delegation | | | `-----------------' | Management | | | DECADE Server `------------' | `==============================================' Figure 4: DECADE Server Components 5.2.1. Access Control A client SHALL be able to access its own data or other client's data (provided sufficient authorization) in DECADE servers. Clients MAY also authorize other clients to store data. If an access is authorized by a client, the server SHOULD provide access. Even if a request is authorized, it MAY still fail to complete due to insufficient resources at the server. 5.2.2. Resource Scheduling Applications will apply resource sharing policies or use a custom policy. Servers perform resource scheduling according to the resource sharing policies indicated by clients as well as configured User Delegations. 5.2.3. Data Store Data from applications will be stored at a DECADE server. Data may be deleted from storage either explicitly or automatically (e.g., Alimi, et al. Expires August 18, 2013 [Page 14] Internet-Draft DECADE Architecture February 2013 after a TTL expiration). 5.3. Data Sequencing and Naming The DECADE naming scheme implies no sequencing or grouping of objects, even if this is done at the application layer. 5.3.1. Application Usage Example To illustrate these properties, this section presents multiple examples. 220.127.116.11. Application with Fixed-Size Chunks Similar to the example in Section 5.1.1, consider an Application in which each individual application-layer segment of data is called a "chunk" and has a name of the form: "CONTENT_ID:SEQUENCE_NUMBER". Furthermore, assume that the application's native protocol uses chunks of size 16 KiB. Now, assume that this application wishes to store data in DECADE servers in data objects of size 64 KiB. To accomplish this, it can map a sequence of 4 chunks into a single data object, as shown in Figure 5. Application Chunks .---------.---------.---------.---------.---------.---------.-------- | | | | | | | | Chunk_0 | Chunk_1 | Chunk_2 | Chunk_3 | Chunk_4 | Chunk_5 | Chunk_6 | | | | | | | `---------`---------`---------`---------`---------`---------`-------- DECADE Data Objects .---------------------------------------.---------------------------- | | | Object_0 | Object_1 | | `---------------------------------------`---------------------------- Figure 5: Mapping Application Chunks to DECADE Data Objects In this example, the Application maintains a logical mapping that is able to determine the name of a DECADE data object given the chunks contained within that data object. The name may be learned from either the original Content Provider, another End-Point with which the Application is communicating, etc. As long as the data contained within each sequence of chunks is globally unique, the corresponding Alimi, et al. Expires August 18, 2013 [Page 15] Internet-Draft DECADE Architecture February 2013 data objects have globally unique names. 18.104.22.168. Application with Continuous Streaming Data Consider an Application whose native protocol retrieves a continuous data stream (e.g., an MPEG2 stream) instead of downloading and redistributing chunks of data. Such an application could segment the continuous data stream to produce either fixed-sized or variable- sized data objects. Figure 6 shows how a video streaming application might produce variable-sized data objects such that each data object contains 10 seconds of video data. Application's Video Stream .-------------------------------------------------------------------- | | | `-------------------------------------------------------------------- ^ ^ ^ ^ ^ | | | | | 0 Seconds 10 Seconds 20 Seconds 30 Seconds 40 Seconds 0 B 400 KiB 900 KiB 1200 KiB 1500 KiB DECADE Data Objects .--------------.--------------.--------------.--------------.-------- | | | | | | Object_0 | Object_1 | Object_2 | Object_3 | | (400 KiB) | (500 KiB) | (300 KiB) | (300 KiB) | `--------------`--------------`--------------`--------------`-------- Figure 6: Mapping a Continuous Data Stream to DECADE Data Objects Similar to the previous example, the Application might maintain a mapping that is able to determine the name of a data object given the time offset of the video chunk. 5.4. Token-based Authorization and Resource Control A key feature of a DECADE system is that an application endpoint can authorize other application endpoint to store or retrieve data objects from the in-network storage. An OAuth version 2 [RFC6749]based authorization scheme is used to accomplish this. A separate OAuth flow is used for this purpose, Alimi, et al. Expires August 18, 2013 [Page 16] Internet-Draft DECADE Architecture February 2013 a client authenticates (optional and out of the scope of this document) with the application server or the P2P application peer, and request the trusted by the client, and the token contains particular self contained properties (see Section 6.2.2 for details). The client then use the token when sending requests to the DECADE server. Upon receiving a token, the server validates the signature and the operation being performed. This is a simple scheme, but has some important advantages over an alternative approach in which a client explicitly manipulates an Access Control List (ACL) associated with each data object. In particular, it has the following advantages when applied to DECADE target applications: o Authorization policies are implemented within the Application; an Application explicitly controls when tokens are generated and to whom they are distributed and for how long they will be valid. o Fine-grained access and resource control can be applied to data objects; see Section 6.2.2 for the list of restrictions that can be enforced with a token. o There is no messaging between a client and server to manipulate data object permissions. This can simplify, in particular, Applications which share data objects with many dynamic peers and need to frequently adjust access control policies attached to data objects. o Tokens can provide anonymous access, in which a server does not need to know the identity of each client that accesses it. This enables a client to send tokens to clients belonging to other Storage Providers, and allow them to read or write data objects from the storage of its own Storage Provider. In addition to clients applying access control policies to data objects, the server MAY be configured to apply additional policies based on user, object, geographic location, etc. A client might thus be denied access even though it possesses a valid token. There are existing protocols (e.g., OAuth [RFC5849]) that implement similar referral mechanisms using tokens. A protocol specification of this architecture SHOULD endeavor to use existing mechanisms wherever possible. 5.5. Discovery A DECADE system SHOULD include a discovery mechanism through which clients locate an appropriate server. [I-D.ietf-decade-reqs] details Alimi, et al. Expires August 18, 2013 [Page 17] Internet-Draft DECADE Architecture February 2013 specific requirements of the discovery mechanism; this section discusses how they relate to other principles outlined in this document. A discovery mechanism SHOULD allow a client to determine an IP address or some other identifier that can be resolved to locate the server for which the client will be authorized to generate tokens (via DRP). (The discovery mechanism might also result in an error if no such servers can be located.) After discovering one or more servers, a client can distribute load and requests across them (subject to resource limitations and policies of the servers themselves) according to the policies of the Application End-Point in which it is embedded. The particular protocol used for discovery is out of scope of this document, but any specification SHOULD re-use standard protocols wherever possible. The discovery mechanism outlined here does not provide the ability to locate arbitrary DECADE servers to which a client might obtain tokens from others. To do so will require application-level knowledge, and it is assumed that this functionality is implemented in the Content Distribution Application. 6. DECADE Protocols This section presents the DRP and the SDT protocol in terms of abstract protocol interactions that are intended to be mapped to specific protocols. In general, the DRP/SDT functionality between a DECADE client-server are very similar to the DRP/SDT functionality between server-server. Any differences are highlighted below. DRP will be the protocol used by a DECADE client to configure the resources and authorization used to satisfy requests (reading, writing, and management operations concerning data objects) at a server. SDT will be used to transport data between a client and a server. 6.1. DECADE Naming A DECADE system SHOULD use the [I-D.farrell-decade-ni] as the recommended and default naming scheme. Other naming schemes that meet the guidelines in Section 4.3 may alternatively be used. In order to provide a simple and generic interface, the DECADE server will be responsible only for storing and retrieving individual data objects. Alimi, et al. Expires August 18, 2013 [Page 18] Internet-Draft DECADE Architecture February 2013 The DECADE naming format SHOULD NOT attempt to replace any naming or sequencing of data objects already performed by an Application; instead, the naming is intended to apply only to data objects referenced by DECADE-specific purposes. An Application using a DECADE client may use a naming and sequencing scheme independent of DECADE names. The DECADE client SHOULD maintain a mapping from its own data objects and their names to the DECADE-specific data objects and names. Furthermore, the DECADE naming scheme implies no sequencing or grouping of objects, even if this is done at the application layer. 6.2. DECADE Resource Protocol (DRP) DRP will provide configuration of access control and resource sharing policies on DECADE servers. A Content Distribution Application, e.g., a live P2P streaming session, can have permission to manage data at several servers, for instance, servers belonging to different Storage Providers, and DRP allows one instance of such an application, e.g., an Application End-Point, to apply access control and resource sharing policies on each of them. 6.2.1. Controlled Resources On a single DECADE server, the following resources SHOULD be managed: o Communication resources in terms of bandwidth (upload/download) and also in terms of number of active clients (simultaneous connections). o Storage resources. 6.2.2. Access and Resource Control Token As in DECADE system, the resource owner agent is always the same entity or colocated with the authorization server, so we use a separate OAuth 2.0 request and response flow for the access and resource control token. An OAuth request to access the data objects MUST include the following fields (encoding format is TBD, HTML?): response_type: REQUIRED. Value MUST be set to "token". client_id: the client_id indicates either the application that is using the DECADE service or the end user who is using the DECADE service from a DECADE storage service provider. DECADE storage service providers MUST provide the ID distribution and management Alimi, et al. Expires August 18, 2013 [Page 19] Internet-Draft DECADE Architecture February 2013 function, which is out of the scope of this document. scope: data object names that are requested. An OAuth response includes the following information (encoding is TBD, HTML is preferred, are we going to use OAuth Bearer token type as defined in RFC 6750? The concern for bearer token is that it does not associate the token with any client, so that any client can use this token to access the resources. Do we worry about it? The current draft seems explicitly support this behavior.): o token_type: "Bearer"? o expires_in: The lifetime in seconds of the access token. o access_token: a token denotes the following information. o service URI: the server address or URI which is providing the service; o Permitted operations (e.g., read, write); o Permitted objects (e.g., names of data objects that might be read or written); o Priority: optional. If it is presented, value MUST be set to be either "Urgent", "High", "Normal" or "Low". o Bandwidth: bandwidth that is given to requested operation, a weight value used in a weighted bandwidth sharing scheme, or a integer in number of bps; o Amount: data size in number of bytes that might be read or written. o token_signature: the signature of the access token. The tokens SHOULD be generated by an entity trusted by both the DECADE client and server at the request of a DECADE client. For example this entity could be the client, a server trusted by the client, or another server managed by a Storage Provider and trusted by the client. It is important for a server to trust the entity generating the tokens since each token may incur a resource cost on the server when used. Likewise, it is important for a client to trust the entity generating the tokens since the tokens grant access to the data stored at the server. Upon generating a token, a client MAY distribute it to another client Alimi, et al. Expires August 18, 2013 [Page 20] Internet-Draft DECADE Architecture February 2013 (e.g., via their native application protocol). The receiving client MAY then connect to the server specified in the token and perform any operation permitted by the token. The token SHOULD be sent along with the operation. The server SHOULD validate the token to identify the client that issued it and whether the requested operation is permitted by the contents of the token. If the token is successfully validated, the server SHOULD apply the resource control policies indicated in the token while performing the operation. Tokens SHOULD include a unique identifier to allow a server to detect when a token is used multiple times and reject the additional usage attempts. Since usage of a token incurs resource costs to a server (e.g., bandwidth and storage) and a Content Provider may have a limited budget (see Section 4.5), the Content Provider should be able to indicate if a token may be used multiple times. It SHOULD be possible to revoke tokens after they are generated. This could be accomplished by supplying the server the unique identifiers of the tokens which are to be revoked. 6.2.3. Status Information DRP SHOULD provide a status request service that clients can use to request status information of a server. 22.214.171.124. Status Information on a specific server Access to such status information SHOULD require client authorization; that is, clients need to be authorized to access the requested status information. This authorization is based on the user delegation concept as described in Section 4.5. The following status information elements SHOULD be obtained: o List of associated data objects (with properties); o Resources used/available. The following information elements MAY additionally be available: o List of servers to which data objects have been distributed (in a certain time-frame); o List of clients to which data objects have been distributed (in a certain time-frame). For the list of servers/clients to which data objects have been distributed to, the server SHOULD be able to decide on time bounds for which this information is stored and specify the corresponding Alimi, et al. Expires August 18, 2013 [Page 21] Internet-Draft DECADE Architecture February 2013 time frame in the response to such requests. Some of this information may be used for accounting purposes, e.g., the list of clients to which data objects have been distributed. 126.96.36.199. Access information on a specific server Access information MAY be provided for accounting purposes, for example, when Content Providers are interested in access statistics for resources and/or to perform accounting per user. Again, access to such information requires client authorization and SHOULD based on the delegation concept as described in Section 4.5. The following type of access information elements MAY be requested: o What data objects have been accessed by whom and for how many times; o Access tokens that a server as seen for a given data object. The server SHOULD decide on time bounds for which this information is stored and specify the corresponding time frame in the response to such requests. 6.2.4. Data Object Attributes Data Objects that are stored on a DECADE server SHOULD have associated attributes (in addition to the object identifier and data object) that relate to the data storage and its management. These attributes may be used by the server (and possibly the underlying storage system) to perform specialized processing or handling for the data object, or to attach related server or storage-layer properties to the data object. These attributes have a scope local to a server. In particular, these attributes SHOULD NOT be applied to a server or client to which a data object is copied. Depending on authorization, clients SHOULD be permitted to get or set such attributes. This authorization is based on the delegation concept as described in Section 4.5. The architecture does not limit the set of permissible attributes, but rather specifies a set of baseline attributes that SHOULD be supported: Expiration Time: Time at which the data object can be deleted; Data Object size: In bytes; Media type Labelling of type as per [RFC4288]; Alimi, et al. Expires August 18, 2013 [Page 22] Internet-Draft DECADE Architecture February 2013 Access statistics: How often the data object has been accessed (and what tokens have been used). The data object attributes defined here are distinct from application metadata (see Section 4.1). Application metadata is custom information that an application might wish to associate with a data object to understand its semantic meaning (e.g., whether it is video and/or audio, its playback length in time, or its index in a stream). If an application wishes to store such metadata persistently, it can be stored within data objects themselves. 6.3. Standard Data Transfer (SDT) Protocol A DECADE server will provide a data access interface, and the SDT will be used to write data objects to a server and to read (download) data objects from a server. Semantically, SDT is a client-server protocol; that is, the server always responds to client requests. 6.3.1. Writing/Uploading Objects To write a data object, a client first generates the object's name (see Section 6.1), and then uploads the object to a server and supplies the generated name. The name can be used to access (download) the object later; for example, the client can pass the name as a reference to other client that can then refer to the object. Data objects can be self-contained objects such as multimedia resources, files etc., but also chunks, such as chunks of a P2P distribution protocol that can be part of a containing object or a stream. The application that originates the data objects generates DECADE object names according to the naming specification in Section 6.1. Clients (as parts of application entities) upload a named object to a server. If supported, a server can verify the integrity and other security properties of uploaded objects. 6.3.2. Downloading Data Objects A client can request named data objects from a server. In a corresponding request message, a client specifies the object name and a suitable access and resource control token. The server checks the validity of the received token and its associated resource usage- related properties. If the named data object exists on the server and the token can be validated, the server delivers the requested object in a response Alimi, et al. Expires August 18, 2013 [Page 23] Internet-Draft DECADE Architecture February 2013 message. If the data object cannot be delivered the server provides an corresponding status/reason information in a response message. Specifics regarding error handling, including additional error conditions (e.g., overload), precedence for returned errors and its relation with server policy, are deferred to eventual protocol specification. 6.4. Server-to-Server Protocols An important feature of a DECADE system is the capability for one server to directly download data objects from another server. This capability allows Applications to directly replicate data objects between servers without requiring end-hosts to use uplink capacity to upload data objects to a different server. DRP and SDT will support operations directly between servers. Servers are not assumed to trust each other nor are configured to do so. All data operations are performed on behalf of clients via explicit instruction. However, the objects being processed do not necessarily have to originate or terminate at the client (i.e., the data object might be limited to being exchanged between servers even if the instruction is triggered by the client). Clients thus will be able to indicate to a server the following additional parameters: o Which remote server(s) to access; o The operation to be performed; o The Content Provider at the remote server from which to retrieve the data object, or in which the object is to be stored; and o Credentials indicating access and resource control to perform the operation at the remote server. Server-to-server support is focused on reading and writing data objects between servers. The data object referred to at the remote server is the same as the original data object requested by the client. Object attributes (see Section 6.2.4) might also be specified in the request to the remote server. In this way, a server acts as a proxy for a client, and a client can instantiate requests via that proxy. The operations will be performed as if the original requester had its own client co-located with the server. Alimi, et al. Expires August 18, 2013 [Page 24] Internet-Draft DECADE Architecture February 2013 When a client sends a request to a server with these additional parameters, it is giving the server permission to act (proxy) on its behalf. Thus, it would be prudent for the supplied token to have narrow privileges (e.g., limited to only the necessary data objects) or validity time (e.g., a small expiration time). In the case of a retrieval operation, the server is to retrieve the data object from the remote server using the specified credentials, and then optionally return the object to a client. In the case of a storage operation, the server is to store the object to the remote server using the specified credentials. The object might optionally be uploaded from the client or might already exist at the proxy server. 7. Security Considerations In general, the security considerations mentioned in [RFC6646] apply to this document as well. A DECADE system provides a distributed storage service for content distribution and similar applications. The system consists of servers and clients that use these servers to upload data objects, to request distribution of data objects, and to download data objects. Such a system is employed in an overall application context -- for example in a P2P Content Distribution Application, and it is expected that DECADE clients take part in application-specific communication sessions. The security considerations here focus on threats related to the DECADE system and its communication services, i.e., the DRP/SDT protocols that have been described in an abstract fashion in this document. 7.1. Threat: System Denial of Service Attacks A DECADE network might be used to distribute data objects from one client to a set of servers using the server-to-server communication feature that a client can request when uploading an object. Multiple clients uploading many objects at different servers at the same time and requesting server-to-server distribution for them could thus mount massive distributed denial of service (DDOS) attacks, overloading a network of servers. This threat is addressed by the server's access control and resource control framework. Servers can require Application End-Points to be authorized to store and to download objects, and Application End- Points can delegate authorization to other Application End-Points Alimi, et al. Expires August 18, 2013 [Page 25] Internet-Draft DECADE Architecture February 2013 using the token mechanism. Of course the effective security of this approach depends on the strength of the token mechanism. See below for a discussion of this and related communication security threats. Denial of Service Attacks against a single server (directing many requests to that server) might still lead to considerable load for processing requests and invalidating tokens. SDT therefore MUST provide a redirection mechanism as described as a requirement in [I-D.ietf-decade-reqs]. 7.2. Threat: Protocol Security 7.2.1. Threat: Authorization Mechanisms Compromised A DECADE system does not require Application End-Points to authenticate in order to access a server for downloading objects, since authorization is not based on End-Point or user identities but on the delegation-based authorization mechanism. Hence, most protocol security threats are related to the authorization scheme. The security of the token mechanism depends on the strength of the token mechanism and on the secrecy of the tokens. A token can represent authorization to store a certain amount of data, to download certain objects, to download a certain amount of data per time etc. If it is possible for an attacker to guess, construct or simply obtain tokens, the integrity of the data maintained by the servers is compromised. This is a general security threat that applies to authorization delegation schemes. Specifications of existing delegation schemes such as OAuth [RFC5849] discuss these general threats in detail. We can say that the DRP has to specify appropriate algorithms for token generation. Moreover, authorization tokens should have a limited validity period that should be specified by the application. Token confidentiality should be provided by application protocols that carry tokens, and the SDT and DRP should provide secure (confidential) communication modes. 7.2.2. Threat: Data Object Spoofing In a DECADE system, an Application End-Point is referring other Application End-Points to servers to download a specified data objects. An attacker could "inject" a faked version of the object into this process, so that the downloading End-Point effectively receives a different object (compared to what the uploading End-Point provided). As result, the downloading End-Point believes that is has Alimi, et al. Expires August 18, 2013 [Page 26] Internet-Draft DECADE Architecture February 2013 received an object that corresponds to the name it was provided earlier, whereas in fact it is a faked object. Corresponding attacks could be mounted against the application protocol (that is used for referring other End-Points to servers), servers themselves (and their storage sub-systems), and the SDT by which the object is uploaded, distributed and downloaded. A DECADE systems fundamental mechanism against object spoofing is name-object binding validation, i.e., the ability of a receiver to check whether the name he was provided and that he used to request an object, actually corresponds to the bits he received. As described above, this allows for different forms of name-object binding, for example using hashes of data objects, with different hash functions (different algorithms, different digest lengths). For those application scenarios where hashes of data objects are not applicable (for example live-streaming) other forms of name-object binding can be used (see Section 6.1). This flexibility also addresses cryptographic algorithm evolvability: hash functions might get deprecated, better alternatives might be invented etc., so that applications can choose appropriate mechanisms meeting their security requirements. DECADE servers MAY perform name-object binding validation on stored objects, but Application End-Points MUST NOT rely on that. In other words, Application End-Points SHOULD perform name-object binding validation on received objects. 8. IANA Considerations This document does not have any IANA considerations. 9. Acknowledgments We thank the following people for their contributions to and/or detailed reviews of this document: Carsten Bormann David Bryan Dave Crocker Yingjie Gu David Harrington Alimi, et al. Expires August 18, 2013 [Page 27] Internet-Draft DECADE Architecture February 2013 Hongqiang (Harry) Liu David McDysan Borje Ohlman Konstantinos Pentikousis Martin Stiemerling Richard Woundy Ning Zong 10. References 10.1. Normative References [RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC6646] Song, H., Zong, N., Yang, Y., and R. Alimi, "DECoupled Application Data Enroute (DECADE) Problem Statement", RFC6646, July 2012. [I-D.ietf-decade-reqs] Yingjie, G., Bryan, D., Yang, Y., Zhang, P., and R. Alimi, "DECADE Requirements", draft-ietf-decade-reqs-08 (work in progress), August 2012. [I-D.farrell-decade-ni] Farrell, S., Kutscher, D., Dannewitz, C., Ohlman, B., Keraenen, A., and P. Hallam-Baker, "Naming Things with Hashes", draft-farrell-decade-ni-10 (work in progress), August 2012. 10.2. Informative References [RFC4288] Freed, N. and J. Klensin, "Media Type Specifications and Registration Procedures", RFC4288, December 2005. [RFC5661] Shepler, S., Eisler, M., and D. Noveck, "Network File System (NFS) Version 4 Minor Version 1 Protocol", RFC5661, January 2010. [RFC5849] Hammer-Lahav, E., "The OAuth 1.0 Protocol", RFC5849, April 2010. Alimi, et al. Expires August 18, 2013 [Page 28] Internet-Draft DECADE Architecture February 2013 [RFC6392] Alimi, R., Rahman, A., and Y. Yang, "A Survey of In- Network Storage Systems", RFC6392, October 2011. [RFC6749] Hardt, D., "The OAuth 2.0 Authorization Framework", RFC6749, October 2012. [OpenFlow] "OpenFlow Organization", <http://www.openflow.org/>. [GoogleFileSystem] Ghemawat, S., Gobioff, H., and S. Leung, "The Google File System", SOSP 2003, October 2003. Appendix A. In-Network Storage Components Mapped to DECADE Architecture In this section we evaluate how the basic components of an in-network storage system identified in Section 3 of [RFC6392] map into a DECADE system. A.1. Data Access Interface Clients can read and write objects of arbitrary size through the client's Data Controller, making use of a SDT. A.2. Data Management Operations Clients can move or delete previously stored objects via the client's Data Controller, making use of a SDT. A.3. Data Search Capability Clients can enumerate or search contents of servers to find objects matching desired criteria through services provided by the Content Distribution Application (e.g., buffer-map exchanges, a DHT, or peer- exchange). In doing so, Application End-Points might consult their local Data Index in the client's Data Controller. A.4. Access Control Authorization All methods of access control are supported: public-unrestricted, public-restricted and private. Access Control Policies are generated by a Content Distribution Application and provided to the client's Resource Controller. The server is responsible for implementing the access control checks. Alimi, et al. Expires August 18, 2013 [Page 29] Internet-Draft DECADE Architecture February 2013 A.5. Resource Control Interface Clients can manage the resources (e.g., bandwidth) on the DECADE server that can be used by other Application End-Points. Resource Sharing Policies are generated by a Content Distribution Application and provided to the client's Resource Controller. The server is responsible for implementing the resource sharing policies. A.6. Discovery Mechanism The particular protocol used for discovery is outside the scope of this document. However, options and considerations have been discussed in Section 5.5. A.7. Storage Mode Servers provide an object-based storage mode. Immutable data objects might be stored at a server. Applications might consider existing blocks as data objects, or they might adjust block sizes before storing in a server. Appendix B. Hisotry To RFC Editor: This section is informational for you. Please remove this section before publication. Since version 10, this document was modified based on the previous DECADE WG architecture document , and was extended to be a protocol specification. It addresses the comments from the WG and the responsible ADs (David Harrington and then Martin Stiemerling). The authors now request to publish this document through the independent stream and get the support of Martin. Authors' Addresses Richard Alimi Google Email: email@example.com Akbar Rahman InterDigital Communications, LLC Email: firstname.lastname@example.org Alimi, et al. Expires August 18, 2013 [Page 30] Internet-Draft DECADE Architecture February 2013 Dirk Kutscher NEC Email: email@example.com Y. Richard Yang Yale University Email: firstname.lastname@example.org Haibin Song Huawei Email: email@example.com Kostas Pentikousis Huawei Email: firstname.lastname@example.org Alimi, et al. Expires August 18, 2013 [Page 31]