Category Archives: Cloud Programming

Build GCP Service Client Authentication

A client application must be authenticated before it can use any Google Cloud Platform service through its REST API. This is a common and important first step for every service. This post shows how to create a Java application that encapsulates the necessary authentication logic, so you do not have to recreate it, and risk introducing mistakes, each time. For simplicity, the example shows how to authenticate command-line (that is, native) client applications and authorize their access to Google Cloud Platform services. At this time, the app creates authenticated clients for the following services: Google Storage, Google Drive, YouTube, and BigQuery.

This post also contains important background information that you need in order to use Google Cloud service APIs. We suggest you take a look at Background Information before you proceed.

Authentication App Architecture

The Authentication app is a Java application built as a Maven project. With Maven, you can define up-to-date dependencies by linking to the necessary Google libraries online. For more information, see GCP Cloud Service Client Apps – Common Tasks.

Find reference information for the Google API libraries at Supported Google APIs (Java). For the latest versions, search for the specific Google library in the Maven Repository.

The authentication application described in this post has the following architecture:

 

  1. IGoogleClientAuthentication. Defines variables and methods to authenticate clients so they can use Google service REST APIs.
  2. GoogleServiceClientAuthentication. An abstract class that contains the actual logic to obtain the credentials a client application needs to use the requested Google service REST API. The class uses the Google OAuth 2.0 authorization code flow, which manages and persists end-user credentials.
  3. AuthenticateGoogleServiceClient. This class extends GoogleServiceClientAuthentication and implements IGoogleClientAuthentication. It creates an authenticated client object that is authorized to access the selected Google service API.
    Based on the caller’s selection, it creates an authorized service to access Google service APIs such as the Google Cloud Storage API or the Google Drive API.

The class assumes that you have already created a directory to store the client secrets file, for example .googleservices/storage. The file containing the secrets is client_secrets.json.
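The relationship between the three types can be sketched as follows. This is only an illustration of the structure described above, not the post's actual code: the method names (getAuthenticatedService, authorize, secretsPath), the return types, and the choice to resolve .googleservices under the user's home directory are all assumptions, and the credential step is replaced by a string stand-in.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;

public class AuthArchitectureSketch {

    /** Hypothetical shape of the interface; the method name is an assumption. */
    interface IGoogleClientAuthentication {
        String CLIENT_SECRETS_FILE = "client_secrets.json";

        Object getAuthenticatedService(String serviceName, List<String> scopes);
    }

    /** Holds the shared credential logic; the real class runs the OAuth 2.0 code flow. */
    abstract static class GoogleServiceClientAuthentication {

        /** Resolves <home>/.googleservices/<service>/client_secrets.json (assumed layout). */
        static Path secretsPath(String home, String serviceName) {
            return Paths.get(home, ".googleservices", serviceName,
                    IGoogleClientAuthentication.CLIENT_SECRETS_FILE);
        }

        /** Placeholder for the step that obtains credentials for the given scopes. */
        protected abstract Object authorize(Path secrets, List<String> scopes);
    }

    /** Concrete class: selects the service and returns an authorized client object. */
    static class AuthenticateGoogleServiceClient extends GoogleServiceClientAuthentication
            implements IGoogleClientAuthentication {

        @Override
        protected Object authorize(Path secrets, List<String> scopes) {
            // Stand-in only: a real implementation would return a credential object.
            return "credential-for:" + secrets.getFileName() + ":" + scopes;
        }

        @Override
        public Object getAuthenticatedService(String serviceName, List<String> scopes) {
            Path secrets = secretsPath(System.getProperty("user.home"), serviceName);
            return serviceName + "-client[" + authorize(secrets, scopes) + "]";
        }
    }

    public static void main(String[] args) {
        IGoogleClientAuthentication auth = new AuthenticateGoogleServiceClient();
        System.out.println(auth.getAuthenticatedService("storage",
                Collections.singletonList("https://www.googleapis.com/auth/devstorage.read_only")));
    }
}
```

Keeping the credential logic in the abstract class means each new service only needs a small addition to the concrete class.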

Authentication App Workflow

The following figure shows the example application workflow:

The client application calls the authentication method for the service selected by the user, passing the scope information. The AuthenticateGoogleServiceClient class performs all the steps to create an authenticated client that is authorized to use the Google service REST API. In particular, it performs the following:

  • Reads the client secrets. You must store these secrets in a local file before using the application. You obtain the secrets through the Google developers console by downloading the related JSON information (for native applications) from your service project. The file name used in the example is client_secrets.json; you can use any other name as long as you use the json suffix. For details about the file and directory names, see the code comments.
  • Uses Google OAuth2 to obtain the authorized service object. The first time you run the application, a browser instance is created to ask you, as the project owner, to grant access permission to the client. From then on, the credentials are stored in a file named StoredCredential. The name of this file is predefined in the StoredCredential class. This file is stored in the same directory as client_secrets.json. See the code comments for details. If you delete the StoredCredential file, the resource owner is asked to grant access again.
  • Google OAuth2 returns the authenticated service object to the AuthenticateGoogleServiceClient which, in turn, returns it to the client application. The client can then use the authenticated object to call the Google service REST API. For example, in the case of the Google Storage service, it can list buckets in the project, create buckets, create objects in a bucket, list objects in a bucket, and so on.
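The file handling in this workflow can be illustrated with a small check that uses only the JDK. It is a sketch under assumptions: the class and method names are hypothetical, and the directory is assumed to live under the user's home directory. It does not perform any OAuth exchange; it only reports whether the secrets file is present and whether the browser consent step should be expected because no StoredCredential file exists yet.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CredentialStoreCheck {

    // File names taken from the post; the directory location is an assumption.
    static final String SECRETS_FILE = "client_secrets.json";
    static final String STORED_CREDENTIAL_FILE = "StoredCredential";

    /** The secrets file must exist before the application can start the OAuth flow. */
    static boolean hasClientSecrets(Path dir) {
        return Files.isRegularFile(dir.resolve(SECRETS_FILE));
    }

    /** True when no stored credential exists, so the browser consent step is expected. */
    static boolean needsConsent(Path dir) {
        return !Files.isRegularFile(dir.resolve(STORED_CREDENTIAL_FILE));
    }

    public static void main(String[] args) {
        Path dir = Paths.get(System.getProperty("user.home"), ".googleservices", "storage");
        System.out.println("Client secrets present: " + hasClientSecrets(dir));
        System.out.println("Browser consent expected: " + needsConsent(dir));
    }
}
```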

Background Information

Enable a Google Service API

In order to use a service API in your application, you must enable it as shown next.


GCP Cloud Service Client Apps – Common Tasks

The following are some common tasks that you must perform when using Google Cloud Service APIs such as enabling a service API, installing an API client library, performing client authentication, and so on.

Prerequisites

  1. Eclipse Version 4.xx. Before installing Eclipse, ensure that you have Java installed (at least the JRE). To download the Java development environment, go to Java SE at a Glance.
  2. Maven plugin installed. Make sure to set your Eclipse preferences as follows:
    • Within Eclipse, select Window > Preferences (or on Mac, Eclipse > Preferences).
    • Select Maven and select the following options:
      • “Download Artifact Sources”
      • “Download Artifact JavaDoc”

Create a Maven Project

  1. In Eclipse, select File->New->Project. The Select a wizard dialog window is displayed.
  2. Expand the Maven folder and select Maven Project.
  3. Click Next.
  4. In the next dialog window, check Create a simple project (skip archetype selection).
  5. Click Next. The New Maven project dialog is displayed.
  6. Enter the Group Id information, for instance com.clientauth.
  7. Enter the Artifact Id (use the name of the project) for instance ClientAuth.
  8. Accept the Version default 0.0.1-SNAPSHOT. Or assign a new version such as 1.0.
  9. Ensure that the Packaging is jar.
  10. Enter the project name, for example ClientAuthentication.
  11. Click Finish.
    This creates a default pom.xml file that you will use to define your application dependencies as shown next.

Define Dependencies in pom.xml

To the default pom.xml, you must add the dependencies specific to your application. The example shown next refers to a console application which uses Google Storage service. To obtain the latest dependencies (aka artifacts) information, perform the following steps:

OAuth2 API Dependency

  1. In your browser, navigate to https://developers.google.com/api-client-library/java/apis/.
  2. In the page, press Ctrl-F and enter oauth2 in the search box. This takes you to the row containing the OAuth2 library info.
  3. Click on the version link, for example v2. This displays the Google OAuth2 API Client Library for Java page.
  4. At the bottom, in the Add Library to Your Project section, click on the Maven tab. This displays the dependencies information similar to the following:
    <project>
      <dependencies>
        <dependency>
          <groupId>com.google.apis</groupId>
          <artifactId>google-api-services-oauth2</artifactId>
          <version>v2-rev126-1.22.0</version>
        </dependency>
      </dependencies>
    </project>
  5. Copy and paste the <dependency> section in the pom.xml file.
  6. If you want to refer to other versions of the API library, click the link at the bottom of the page: See all versions available on the Maven Central Repository.
  7. You can define the version in a parametric way as follows:
    <version>${project.oauth.version}</version>
    

    Where ${project.oauth.version} is defined in the properties section as follows:

    <properties>
     <project.http.version>1.22.0</project.http.version>
     <project.oauth.version>v2-rev126-1.22.0</project.oauth.version>
     <project.storage.version>v1-rev105-1.22.0</project.storage.version>
     <project.guava.version>21.0</project.guava.version>
     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    So the new format is as follows:

    <dependency>
      <groupId>com.google.apis</groupId>
      <artifactId>google-api-services-oauth2</artifactId>
      <version>${project.oauth.version}</version>
    </dependency>

Guava Dependency

Guava is a suite of core and expanded libraries that includes utility classes, Google’s collections, I/O classes, and much more.

  1. In your browser, navigate to https://mvnrepository.com/.
  2. In the search box, enter the name of the API library, google guava.
  3. Click on the title of the library, Guava: Google Core Libraries For Java.
  4. In the displayed page, click on the required version.
  5. Click on the Maven tab.
  6. Check the Include comment …. box.
  7. Click on the box. This copies the content to the clipboard.
  8. Paste the content in the pom file.

Managing Dependencies

The Guava library version might conflict with the OAuth2 library version. To avoid the conflict, add a dependencyManagement section to the pom.xml file. Follow these steps:

  1. In Eclipse, in the pom.xml editor, click on the Dependencies tab.
  2. Click on the Manage button.
  3. In the left pane, select the Guava and OAuth libraries.
  4. Click the Add button. This creates the dependencyManagement section. The following shows an example:
    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>com.google.apis</groupId>
          <artifactId>google-api-services-oauth2</artifactId>
          <version>${project.oauth.version}</version>
        </dependency>
        <dependency>
          <groupId>com.google.guava</groupId>
          <artifactId>guava</artifactId>
          <version>${project.guava.version}</version>
        </dependency>
      </dependencies>
    </dependencyManagement>

HTTP Dependency

This library allows a Java application to make asynchronous HTTP requests over the network through the REST API of the cloud service it uses, for example Google Storage.

  1. In your browser, navigate to https://mvnrepository.com/.
  2. In the search box, enter the name of the library, google http client.
  3. Click on the title of the library, Google HTTP Client Library For Java.
  4. In the displayed page, click on the required version.
  5. Click on the Maven tab.
  6. Check the Include comment …. box.
  7. Click on the box. This copies the content to the clipboard.
  8. Paste the content in the pom file.

Jackson Extensions to HTTP Library Dependency

This library is needed to allow a Java application to perform XML and JSON parsing.

  1. In your browser, navigate to https://mvnrepository.com/.
  2. In the search box, enter the name of the library, google http client.
  3. Click on the title of the library, Jackson 2 Extensions To The Google HTTP Client Library For Java.
  4. In the displayed page, click on the required version (the same one you used for the HTTP library).
  5. Click on the Maven tab.
  6. Check the Include comment …. box.
  7. Click on the box. This copies the content to the clipboard.
  8. Paste the content in the pom file.

Google Storage API Dependency

  1. In your browser, navigate to https://developers.google.com/api-client-library/java/apis/.
  2. In the page, press Ctrl-F and enter cloud storage in the search box. This takes you to the row containing the Cloud Storage library info.
  3. Click on the version link, for example v1. This displays the Cloud Storage JSON API Client Library for Java page.
  4. At the bottom, in the Add Library to Your Project section, click on the Maven tab.
  5. Copy and paste the dependency section in the pom.xml file.

Once you have updated the pom, make sure to update the project by right-clicking the project name and then selecting Maven->Update Project…

Import a Maven Project

  1. Download the archived project from the specified location.
  2. Unzip the downloaded archive.
  3. In Eclipse, create a work space or use an existing one.
  4. Click OK.
  5. Click File->Import.
  6. In the wizard window, select Maven->Existing Maven Projects.
  7. Click Next.
  8. Navigate (click the Browse… button), to the location containing the unzipped code archive. The following is an example of a project to import:
    Import Maven Projects
  9. Click OK. You get a window similar to this:
    Imported Maven Projects
  10. Click Finish.

What Can Go Wrong?

Local JARs

You may have local JARs that must be added to the project path. If they are not included you can have errors similar to this: JAR_Error.

To solve this kind of problem, perform the following steps:

  1. In Eclipse, in the Package Explorer, right click on the project name.
  2. Navigate to Properties->Java Build Path.
  3. Click on the Libraries tab.
  4. Click the Add JARs… button.
  5. Select your local JAR, from the lib folder for example, and click OK.
    You will get a window similar to the following:
    Local Jar
  6. Click OK.
    The error should disappear from the list in the Problems window.

Execution Environment

You could get a warning about the execution environment similar to the following:

Execution Warning

To solve this kind of problem, perform the following steps:

  1. In Eclipse, in the Package Explorer, right click on the project name.
  2. Navigate to Properties->Java Build Path.
  3. Click on the Libraries tab.
  4. Select the current JRE System Library.
  5. Click the Remove button.
  6. Click the Add Library… button.
  7. Select the JRE System Library.
  8. Click Next.
  9. Click Finish. The new JRE System Library version should be listed.
  10. Click OK.
    The warning should disappear from the list in the Problems window.

Compiler Version

You could get an error about the compiler version similar to the following:

Compiler Error

To solve this kind of problem, perform the following steps:

  1. In Eclipse, in the Package Explorer, right click on the project name.
  2. Navigate to Properties->Java Compiler.
  3. In the right pane, uncheck Enable project specific settings.
  4. Click the link Configure Workspace Settings….
  5. In the next window, select version 1.8 or higher.
  6. Check Use default compliance settings.
  7. Click OK.
  8. Click OK.
  9. Click Yes, in the popup asking to recompile the project.
    The error should disappear from the list in the Problems window.

Create Runnable JAR

  1. In Eclipse, in the Package Explorer, right click on the project name.
  2. Click Export.
  3. Expand the Java folder.
  4. Click Runnable JAR file.
  5. Click Next.
  6. In the Launch configuration, select the one applicable to the project.
    This is the configuration you define to run the application in Eclipse.
  7. In the Export destination enter or browse to the location where to store the JAR and enter the name for the JAR file.
    JAR Runnable
  8. Click Finish.
  9. To execute the application, open a terminal window.
  10. Change to the directory where the JAR is located.
  11. Enter a command similar to the following:
      java -jar google-drive-client.jar
    


OAuth in a Nutshell

To access a web service through its REST API, a client application must first be authenticated. OAuth client libraries, such as Google OAuth Java client library com.google.api.client.auth.oauth2, provide functions a client can use to be authenticated and authorized to access a Google Cloud service using the related REST API. For additional information, see:  OAuth 2.0 and the Google OAuth Client Library for Java.

Background

To understand how OAuth works, we first need to understand the following:

  • OAuth 2.0. Standard specification for allowing end users to securely authorize a client to access protected server-side resources.
  • OAuth 2.0 bearer token. Specification which explains how to access those protected resources using an access token granted during the client authorization process.

OAuth 2.0

The OAuth 2.0 authorization framework enables a third-party application (client) to obtain limited access to a web service in one of the following ways:

  • On behalf of a resource owner by orchestrating an approval interaction between the resource owner and the web service.
  • By allowing the application to obtain access on its own behalf.

The information in this section is adapted from the OAuth 2.0 specification.

OAuth 2.0 Roles

OAuth 2.0 defines the following roles:

  1. Client. An application making protected resource requests on behalf of the resource owner and with its authorization.  The term client does not imply any particular implementation characteristics (e.g. whether the application executes on a server, a desktop, or other devices).
  2. Resource owner. An entity capable of granting access to a protected resource. When the resource owner is a person, it is referred to as an end-user.
  3. Authorization server. The server issuing access tokens to the client after successfully authenticating the resource owner and obtaining authorization.
  4. Resource server. The server hosting the protected resources, capable of accepting and responding to protected resource requests using access tokens.

The authorization server may be the same server as the resource server or a separate entity. A single authorization server may issue access tokens accepted by multiple resource servers.

OAuth 2.0 Protocol Flow

The following diagram shows the interactions between the various roles (actors) involved in the authentication and authorization process.

OAuth Flow

Fig 1  OAuth 2.0 Protocol Flow

  1. The client requests authorization from the resource owner.  The authorization request can be made directly to the resource owner (as shown), or preferably indirectly via the authorization server as an intermediary.
  2. The client receives an authorization grant, which is a credential representing the resource owner’s authorization, expressed using one of four grant types defined in the OAuth 2.0 specification or using an extension grant type.  The authorization grant type depends on the method used by the client to request authorization and the types supported by the authorization server.
  3. The client requests an access token by authenticating with the authorization server and presenting the authorization grant.
  4. The authorization server authenticates the client and validates the authorization grant, and if valid issues an access token.
  5. The client requests the protected resource from the resource server and authenticates by presenting the access token.
  6. The resource server validates the access token, and if valid, serves the request.
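As an illustration of step 1 when the request goes through the authorization server, the following sketch builds an authorization request URL for the authorization code grant. The query parameter names (response_type, client_id, redirect_uri, scope, state) come from the OAuth 2.0 specification; the class name, the endpoint URL, and the sample values are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class AuthorizationRequestSketch {

    /** Form-encodes a single query value. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Builds the URL the resource owner's browser is sent to in order to approve access. */
    static String buildAuthorizationUrl(String endpoint, String clientId,
                                        String redirectUri, String scope, String state) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("response_type", "code"); // asks for an authorization code grant
        params.put("client_id", clientId);
        params.put("redirect_uri", redirectUri);
        params.put("scope", scope);
        params.put("state", state);          // opaque value echoed back to the client

        StringBuilder url = new StringBuilder(endpoint).append('?');
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                url.append('&');
            }
            url.append(e.getKey()).append('=').append(encode(e.getValue()));
            first = false;
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // All values below are placeholders for illustration.
        System.out.println(buildAuthorizationUrl("https://auth.example.com/authorize",
                "my-client", "http://localhost:8080/callback", "read write", "xyz"));
    }
}
```

After the resource owner approves the request, the authorization server redirects the browser to the redirect URI with the authorization code, which the client then exchanges for an access token (steps 3 and 4).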

Terminology

Before moving ahead, let’s define some terms.

  • Authorization Grant. An authorization grant is a credential representing the resource owner’s authorization to access its protected resources. A client uses the grant to obtain an access token. The OAuth 2.0 specification defines the following grant types:
    • Authorization code. The authorization code is obtained by using an authorization server as an intermediary between the client and the resource owner.
    • Implicit. The implicit grant is a simplified authorization code flow optimized for clients implemented in a browser using a scripting language such as JavaScript.
    • Resource owner password credentials. The resource owner password credentials (i.e. username and password) can be used directly as an authorization grant to obtain an access token.
    • Client credentials. The client credentials (or other forms of client authentication) can be used as an authorization grant when the authorization scope is limited to the protected resources under the control of the client, or to protected resources previously arranged with the authorization server.
  • Access Token. Access tokens are credentials used to access protected resources. An access token is a string representing an authorization issued to the client. The string is usually opaque to the client. Tokens represent specific scopes and durations of access, granted by the resource owner, and enforced by the resource server and authorization server.
  • Refresh Token. Refresh tokens are credentials used to obtain access tokens. Refresh tokens are issued to the client by the authorization server and are used to obtain a new access token when the current access token becomes invalid or expires, or to obtain additional access tokens with identical or narrower scope (access tokens may have a shorter lifetime and fewer permissions than authorized by the resource owner). Issuing a refresh token is optional at the discretion of the authorization server.
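To make the idea of scopes and durations concrete, here is a minimal sketch of the information a server might associate with an access token in order to enforce it. The class and its fields are hypothetical. The client itself normally treats the token string as opaque and never inspects it this way.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class AccessTokenSketch {

    private final String value;       // the opaque string handed to the client
    private final Set<String> scopes; // what the resource owner granted
    private final Instant expiresAt;  // end of the granted duration

    AccessTokenSketch(String value, Set<String> scopes, Instant expiresAt) {
        this.value = value;
        this.scopes = Collections.unmodifiableSet(new HashSet<>(scopes));
        this.expiresAt = expiresAt;
    }

    String value() {
        return value;
    }

    /** A token is expired from its expiry instant onward. */
    boolean isExpired(Instant now) {
        return !now.isBefore(expiresAt);
    }

    /** Access is allowed only for a granted scope and only while the token is valid. */
    boolean permits(String scope, Instant now) {
        return !isExpired(now) && scopes.contains(scope);
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        AccessTokenSketch token = new AccessTokenSketch("abc123",
                new HashSet<>(Arrays.asList("read")), now.plusSeconds(3600));
        System.out.println("read allowed:  " + token.permits("read", now));
        System.out.println("write allowed: " + token.permits("write", now));
    }
}
```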

More About Refresh Token

If the authorization server issues a refresh token, it is included when issuing an access token. A refresh token is a string representing the authorization granted to the client by the resource owner. The string is usually opaque to the client. The token denotes an identifier used to retrieve the authorization information. Unlike access tokens, refresh tokens are intended for use only with authorization servers and are never sent to resource servers.

OAuth Flow Refresh Token

Fig 2  OAuth 2.0 Protocol Flow with Refresh Token

  1. The client requests an access token by authenticating with the authorization server, and presenting an authorization grant.
  2. The authorization server authenticates the client and validates the authorization grant, and if valid issues an access token and a refresh token.
  3. The client makes a protected resource request to the resource server by presenting the access token.
  4. The resource server validates the access token, and if valid, serves the request.
  5. Steps 3 and 4 repeat until the access token expires. If the client knows that the access token has expired, it skips to step 7; otherwise, it makes another protected resource request.
  6. Since the access token is invalid, the resource server returns an invalid token error.
  7. The client requests a new access token by authenticating with the authorization server and presenting the refresh token. The client authentication requirements are based on the client type and on the authorization server policies.
  8. The authorization server authenticates the client and validates the refresh token, and if valid issues a new access token (and optionally, a new refresh token).
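The client side of this flow can be sketched as follows. The two interfaces are hypothetical stand-ins for the authorization server and the resource server, so the example runs without a network; a null return from fetch plays the role of the invalid token error in step 6.

```java
public class RefreshFlowSketch {

    /** Hypothetical stand-in for the authorization server's token endpoint. */
    interface AuthorizationServer {
        String refresh(String refreshToken);
    }

    /** Hypothetical stand-in for the resource server. */
    interface ResourceServer {
        /** Returns the resource, or null to signal an invalid token error. */
        String fetch(String accessToken);
    }

    private String accessToken;
    private final String refreshToken;
    private final AuthorizationServer authServer;
    private final ResourceServer resourceServer;

    RefreshFlowSketch(String accessToken, String refreshToken,
                      AuthorizationServer authServer, ResourceServer resourceServer) {
        this.accessToken = accessToken;
        this.refreshToken = refreshToken;
        this.authServer = authServer;
        this.resourceServer = resourceServer;
    }

    String currentAccessToken() {
        return accessToken;
    }

    /** Requests the resource, refreshing the access token once if it is rejected. */
    String getResource() {
        String result = resourceServer.fetch(accessToken);   // steps 3 and 4
        if (result == null) {                                 // step 6: token rejected
            accessToken = authServer.refresh(refreshToken);   // steps 7 and 8
            result = resourceServer.fetch(accessToken);       // retry with the new token
            if (result == null) {
                throw new IllegalStateException("New access token was rejected");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        RefreshFlowSketch client = new RefreshFlowSketch("expired-token", "refresh-1",
                refreshToken -> "fresh-token",
                token -> "fresh-token".equals(token) ? "protected data" : null);
        System.out.println(client.getResource());
    }
}
```

Note that the refresh token is only ever passed to the authorization server, never to the resource server.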

Cloud Programming REST API

The cloud programming REST API is at the core of an application that interacts with any of the cloud service types (IaaS, PaaS, or SaaS). These APIs are an important component of the cloud architecture. They abstract out the infrastructure and protocol details and allow you to communicate with your selected service.

To use these APIs effectively, you must understand the principles on which they are built. To this end, when you think cloud, you must think Web and the underlying protocol used to exchange information. We are all familiar with accessing a page on the Web using a browser. Well, when you access a Web page a request is issued from your computer (client browser) to the Web site (hosting server) using the HTTP protocol.

REST

The majority of the cloud programming REST APIs are built on the HTTP protocol, where the Web is the platform. For more information, see HTTP/1.1 (RFC 2616).

More specifically, the majority of the cloud services (and related APIs) are designed using an architectural style known as Representational State Transfer (REST). This style is widely used and is a simpler alternative to SOAP and related Web Services Description Language (WSDL).

Capitalizing on the Web success and based on its semantics, REST formalizes a set of principles by which you can design cloud services to access system’s resources, including how resource states are addressed and transferred over HTTP by a wide range of clients. As first described by Roy Fielding in his seminal thesis Architectural Styles and the Design of Network-based Software Architectures, REST is a set of software architectural principles that use the Web as a platform for distributed computing. Since then, REST has emerged as the predominant Web service design model.

Representational State Transfer

The Web is composed of resources. A resource is any item worth exposing. For example, Sunshine Bakery Inc. may define a chocolate truffle croissant resource. Clients may access that resource with this URL:

http://www.sunshinebakery.com/croissants/croissant/chocolate-truffle

A representation of the resource is returned, for example croissant-chocolate-truffle.html. The representation places the client application in a state. If the client traverses a hyperlink in the page, the new representation places the client application into another state. As a result, the client application changes (transfers) state with each resource representation.

REST Design Principles

In its essence, a REST service follows these design principles:

  1. Proper use of HTTP methods.
  2. Stateless design.
  3. Directory-like URIs.
  4. XML or JavaScript Object Notation (JSON) format.

Proper Use of HTTP Methods

A key design principle of a RESTful service is the proper use of HTTP methods, following the protocol as defined in RFC 2616.

For example, HTTP GET is a data-producing method that is intended to be used by a client to perform one of the following operations:

  1. Retrieve a resource.
  2. Obtain data from a Web service.
  3. Execute a query with the expectation that the Web service will look for and respond with a set of matching resources.

REST guidelines instruct the developers to use HTTP methods explicitly and in a way that is consistent with the protocol definition. This means a one-to-one mapping between create, read, update, and delete (CRUD) operations and HTTP methods as follows:

  1. Retrieve a resource with GET.
  2. Create a resource on the service with POST.
  3. Update a resource with PUT.
  4. Remove a resource with DELETE.
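The one-to-one mapping above can be written down as a small Java enum. The type and method names are hypothetical; only the mapping itself comes from the list.

```java
public enum CrudOperation {
    CREATE("POST"),
    READ("GET"),
    UPDATE("PUT"),
    DELETE("DELETE");

    private final String httpMethod;

    CrudOperation(String httpMethod) {
        this.httpMethod = httpMethod;
    }

    /** The HTTP method that carries this operation. */
    public String httpMethod() {
        return httpMethod;
    }

    /** Reverse lookup; HTTP method names are case-sensitive, so the match is exact. */
    public static CrudOperation fromHttpMethod(String method) {
        for (CrudOperation op : values()) {
            if (op.httpMethod.equals(method)) {
                return op;
            }
        }
        throw new IllegalArgumentException("No CRUD operation mapped to " + method);
    }
}
```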

Create a Resource (POST)

The correct way to create a resource is by using the HTTP POST method. All the parameter names and values are contained in XML tags. The payload, an XML representation of the entity to create, is sent in the body of an HTTP POST whose request URI is the intended parent of the entity as shown in the next example.

POST /croissants HTTP/1.1
Host: myserver
Content-Type: application/xml

<?xml version="1.0"?>
<croissant>
  <name>chocolate-truffle</name>
</croissant>

The previous example shows a correct RESTful request which uses HTTP POST and includes the payload in the body of the request.

On the service, the request may be processed by adding the resource contained in the body as a subordinate of the resource identified in the request URI; in this case the new resource should be added as a child of /croissants. This containment relationship between the new entity and its parent, as specified in the POST request, is analogous to the way a file is subordinate to its parent directory. The client sets up the relationship between the entity and its parent and defines the new entity’s URI in the POST request.

Retrieve a Resource (GET)

A client application may get a representation of the resource (chocolate-truffle) using the URI, noting that at least logically the resource is located under /croissant, as shown in the next GET request.

GET /croissant/chocolate-truffle HTTP/1.1
Host: myserver
Accept: application/xml

This is an explicit use of GET, which is for data retrieval only. GET is an operation that must be free of side effects, a property known as safety. GET is also idempotent: repeating the same request has the same effect as issuing it once.

 Warning

Some APIs use GET to trigger transactions on the server, for example to add records to a database. In these cases, the GET request is misused, as shown next:

GET /addcroissant?name=chocolate-truffle HTTP/1.1

This is an incorrect design because the GET request has side effects. If successfully processed, the result of the request is to add a new croissant type to the data store. The following are the problems with this design:

  1. Semantic problem. Cloud services are designed to respond to HTTP GET requests by retrieving resources that match the path (or the query criteria) in the request URI and returning a representation in the response, not by adding a record to the database. Using GET this way is inconsistent with the semantics that HTTP/1.1 defines for the method.

  2. Unintentional side effects. Because the request triggers a change in server-side state, Web caching tools, crawlers, and search engines could unintentionally make server-side changes simply by following a link.

Update a Resource (PUT)

You use an HTTP PUT request to update the resource, as shown in the following example:

PUT /croissant/chocolate-truffle HTTP/1.1
Host: myserver
Content-Type: application/xml

<?xml version="1.0"?>
<croissant>
  <name>chocolate-bonbon</name>
</croissant>

The use of PUT to update the original resource provides a clean interface consistent with the definition of HTTP methods. The PUT request is proper for the following reasons:

  1. It identifies the resource to update in the URI.
  2. The client transfers a new representation of the resource in the body of the request.

Good Design Practices

  1. A well designed cloud service uses HTTP methods explicitly followed by nouns in URIs. The verbs POST, GET, PUT, and DELETE are all that is needed and are already defined by the protocol.  This allows clients to be explicit about the operations they invoke. The API should not define more verbs or remote procedures, such as /addcroissant or /updatecroissant.

  2. The body of an HTTP request transfers resource state; it does not carry the name of a remote method or remote procedure to be invoked.
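These practices can be illustrated with a small in-memory dispatcher that runs without a network. It is a sketch under assumptions: the class, the Response type, and the choice of status codes are illustrative, and plain strings stand in for the XML representations. The URIs contain only nouns (/croissants and /croissant/{name}), and the HTTP method alone decides the operation.

```java
import java.util.HashMap;
import java.util.Map;

public class CroissantResourceSketch {

    /** Minimal response holder: an HTTP status code and a body. */
    static final class Response {
        final int status;
        final String body;

        Response(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    private static final String ITEM_PREFIX = "/croissant/";
    private final Map<String, String> croissants = new HashMap<>();

    /** Dispatches on the HTTP method; the request body only carries resource state. */
    Response handle(String method, String path, String body) {
        if (path.equals("/croissants")) {
            // Only creation is sketched for the parent resource; listing is left out.
            if (!method.equals("POST")) {
                return new Response(405, "");
            }
            croissants.put(body, body);                  // the new entity is named by the body
            return new Response(201, ITEM_PREFIX + body); // URI of the created resource
        }
        if (!path.startsWith(ITEM_PREFIX)) {
            return new Response(404, "");                // verb-style paths such as /addcroissant
        }
        String name = path.substring(ITEM_PREFIX.length());
        if (!croissants.containsKey(name)) {
            return new Response(404, "");
        }
        switch (method) {
            case "GET":
                return new Response(200, croissants.get(name)); // read only, no side effects
            case "PUT":
                croissants.put(name, body);              // replace the stored representation
                return new Response(200, body);
            case "DELETE":
                croissants.remove(name);
                return new Response(204, "");
            default:
                return new Response(405, "");
        }
    }

    public static void main(String[] args) {
        CroissantResourceSketch service = new CroissantResourceSketch();
        System.out.println(service.handle("POST", "/croissants", "chocolate-truffle").status);
        System.out.println(service.handle("GET", "/croissant/chocolate-truffle", null).body);
    }
}
```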

Stateless Design

Cloud REST services need to scale to meet high performance demands. Clusters of servers with load-balancing and failover capabilities, proxies, and gateways allow requests to be forwarded from one server to the other to decrease the overall response time of a service call. Using intermediary servers to improve scale requires clients to send complete, independent requests which include all data needed by a request so that the components in the intermediary servers may forward, route, and load-balance without any state being held locally in between requests.

 A complete, independent request doesn’t require the service to retrieve any kind of application context or state. A client application includes within the HTTP headers and body of a request all the parameters, context, and data needed by the server-side component to generate a response.

A stateless service not only performs better, it shifts most of the responsibility of maintaining state to the client application. In a REST service, the server is responsible for generating responses and for providing an interface that enables the client to maintain the application state. For example, in the request for a multipage response, the client should include the actual page number to retrieve instead of simply asking for the next page as shown in the next figure.

A stateless service generates a response that links to the next page number in the set and lets the client do what it needs to keep this value around. This service design can be divided into two groups of responsibilities that clarify how a stateless service can be created:

Service Responsibilities

  • Generate responses that include links to other resources. This allows client applications to navigate between related resources. This type of response embeds links.  If the request is for a parent or container resource, a typical REST response might also include links to the parent’s children or subordinate resources so that these remain connected.

  • Generate responses that indicate whether they are cacheable. This improves performance by reducing the number of requests and by eliminating some requests entirely. The service does this by including a Cache-Control and Last-Modified (a date value) HTTP response header.
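The first responsibility can be sketched for a paged listing. The client always states the page number it wants, and the service answers with the items plus a link to the next page, so no paging state is kept on the server between requests. The class, the page query parameter, and the /croissants path are assumptions used for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PagedResponseSketch {

    /** One page of results plus a link the client can follow for the next page. */
    static final class Page {
        final List<String> items;
        final String nextLink; // null when there is no further page

        Page(List<String> items, String nextLink) {
            this.items = items;
            this.nextLink = nextLink;
        }
    }

    /** Builds the requested page from the request parameters alone (pages start at 1). */
    static Page page(List<String> all, int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be positive");
        }
        int from = (pageNumber - 1) * pageSize;
        if (from >= all.size()) {
            return new Page(Collections.<String>emptyList(), null);
        }
        int to = Math.min(from + pageSize, all.size());
        String next = to < all.size() ? "/croissants?page=" + (pageNumber + 1) : null;
        return new Page(new ArrayList<>(all.subList(from, to)), next);
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("chocolate-truffle", "raspberry-jam", "almond");
        Page first = page(all, 1, 2);
        System.out.println(first.items + " next: " + first.nextLink);
    }
}
```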

Client Responsibilities

  • Use the Cache-Control response header. This determines whether to cache the resource (make a local copy). The client also reads the Last-Modified response header and sends back the date value in an If-Modified-Since header to ask the service if the resource has changed. In this conditional GET,  the service’s response is a standard 304 code (Not Modified) which omits the actual resource requested if it has not changed since last time it was modified. The client can safely use the cached resource, bypassing subsequent GET requests.

  • Send complete requests that can be serviced independently of other requests. The client uses HTTP headers as specified by the service and sends complete representations of resources in the request body. The client sends requests that make very few assumptions about prior requests, the existence of a session on the server, the server’s ability to add context to a request, or about application state that is kept in between requests.

This collaboration between client application and service is essential for a stateless service. It improves performance by saving bandwidth and minimizing server-side application state.
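The conditional GET exchange described in the client responsibilities can be simulated without a network. The following is a minimal sketch: the resource, its dates, the max-age value, and the handle_get() and CachingClient names are all hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Hypothetical resource with a known last-modified time.
RESOURCE = {
    "body": "chocolate-truffle croissant",
    "last_modified": datetime(2014, 5, 1, 12, 0, 0, tzinfo=timezone.utc),
}


def handle_get(request_headers):
    """Service side: return (status, headers, body) for a GET on RESOURCE."""
    headers = {
        "Cache-Control": "max-age=3600",  # assumed caching policy
        "Last-Modified": format_datetime(RESOURCE["last_modified"], usegmt=True),
    }
    since = request_headers.get("If-Modified-Since")
    if since and RESOURCE["last_modified"] <= parsedate_to_datetime(since):
        return 304, headers, None  # Not Modified: the body is omitted
    return 200, headers, RESOURCE["body"]


class CachingClient:
    """Client side: keep a local copy and revalidate it with a conditional GET."""

    def __init__(self):
        self.cached_body = None
        self.last_modified = None

    def get(self):
        headers = {}
        if self.last_modified:
            headers["If-Modified-Since"] = self.last_modified
        status, response_headers, body = handle_get(headers)
        if status == 200:
            self.cached_body = body
            self.last_modified = response_headers["Last-Modified"]
        return status, self.cached_body
```

The first call to get() downloads the resource; later calls receive 304 and reuse the cached copy until the resource changes on the service side.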

Directory-like URIs

The URIs determine how intuitive the REST service is and whether the service is going to be used in ways that the designers can anticipate. URIs should be intuitive to the point where they are easy to guess. Ideally a URI should be self-documenting and should require little explanation to understand how to access resources. In other words, a URI should be straightforward, predictable, and easy to understand.

One way to achieve this level of usability is to define directory-like URIs; that is, the URI is hierarchical and rooted in a single path. Its branches are subpaths that expose the service’s main areas. A URI is a tree with subordinate and superordinate branches connected at nodes. For example, in

http://www.sunshinebakery.com/croissants/croissant/{croissant}

the root, /croissants, has a /croissant node beneath it. Underneath that there is a series of croissant names, such as chocolate-truffle, raspberry-jam, and so on, each of which points to a croissant type. Within this structure, it’s easy to get a croissant by typing the specific croissant name after /croissant/.
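To make the tree structure concrete, here is a small sketch of how a service might resolve such a URI by walking one path segment at a time. The RESOURCES tree and the resolve() function are hypothetical and exist only for illustration.

```python
from urllib.parse import urlparse

# Hypothetical resource tree mirroring the URI hierarchy of the example.
RESOURCES = {
    "croissants": {
        "croissant": {
            "chocolate-truffle": "Chocolate truffle croissant",
            "raspberry-jam": "Raspberry jam croissant",
        }
    }
}


def resolve(uri):
    """Walk the tree segment by segment; return None when a segment is unknown."""
    node = RESOURCES
    for segment in urlparse(uri).path.strip("/").split("/"):
        if not isinstance(node, dict) or segment not in node:
            return None
        node = node[segment]
    return node
```

Resolving a partial path such as /croissants/croissant returns the whole subtree, which a service could use to list the available croissants.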

Additional URIs Guidelines

  • Hide the server-side scripting technology file extensions (.jsp, .php, .asp), if any, so you can port to something else without changing the URIs.
  • Keep everything lowercase.
  • Substitute spaces with hyphens or underscores (one or the other).
  • Avoid query strings as much as possible.
  • Provide a default page or resource as a response instead of using 404 Not Found code if the request URI is for a partial path.
  • URIs should also be static so that when the resource changes or the implementation of the service changes, the link stays the same. This allows bookmarking.
  • It’s also important that the relationship between resources that’s encoded in the URIs remains independent of the way the relationships are represented where they are stored.
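A few of these guidelines (hiding script extensions, lowercase, hyphens for spaces) can be applied mechanically. The helper below is a sketch under those assumptions; the function name and the choice of hyphens over underscores are illustrative.

```python
import posixpath
import re

# Server-side scripting extensions that should not leak into URIs.
SCRIPT_EXTENSIONS = {".jsp", ".php", ".asp"}


def to_uri_segment(name):
    """Turn a display name or file name into a lowercase, hyphenated URI segment."""
    name = name.strip()
    stem, extension = posixpath.splitext(name)
    if extension.lower() in SCRIPT_EXTENSIONS:
        name = stem  # hide the implementation technology
    # Use hyphens consistently for runs of whitespace.
    return re.sub(r"\s+", "-", name.lower())
```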

XML or JSON Format

The representation of a resource reflects the current state of the resource and its attributes, at the time a client requests it. The representation is a snapshot in time.

The client and the service exchange a resource representation in the request/response payload (the HTTP body) using XML or JSON format. It is very important to keep things simple and human-readable. The following are some guidelines to keep in mind:

  • The objects in the data model are usually related and the relationships between data model objects (resources) should be reflected in the way they are represented for transfer to a client application.
  • Client applications should be able to request the specific content type that suits them. To support this, the service should honor the built-in HTTP Accept header, whose value is a MIME type. This allows the service to be used by a variety of clients written in different languages running on different platforms and devices.

Using MIME types and the HTTP Accept header is a mechanism known as content negotiation. This lets clients choose which data format is right for them and minimizes data coupling between the service and applications.
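A simplified sketch of content negotiation follows. It assumes a service that can render a resource as JSON or XML; the renderer functions, the croissant element name, and the fallback to JSON for */* are choices made for this example, and the Accept parsing ignores partial wildcards such as application/*.

```python
import json
import xml.etree.ElementTree as ET


def to_json(resource):
    return json.dumps(resource, sort_keys=True)


def to_xml(resource):
    root = ET.Element("croissant")
    for key, value in sorted(resource.items()):
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# MIME types this hypothetical service can produce.
RENDERERS = {"application/json": to_json, "application/xml": to_xml}


def parse_accept(header):
    """Return the MIME types of an Accept header, highest q-value first."""
    ranked = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        quality = 1.0
        for parameter in fields[1:]:
            if parameter.startswith("q="):
                quality = float(parameter[2:])
        ranked.append((quality, fields[0]))
    ranked.sort(key=lambda pair: pair[0], reverse=True)  # stable sort keeps ties in order
    return [mime for quality, mime in ranked if quality > 0]


def negotiate(accept_header, resource):
    """Return (status, content type, body) for the best format the client accepts."""
    for mime in parse_accept(accept_header):
        if mime == "*/*":
            mime = "application/json"  # assumed default representation
        if mime in RENDERERS:
            return 200, mime, RENDERERS[mime](resource)
    return 406, None, None  # Not Acceptable
```

The same resource is served in whichever supported format the client ranks highest, so the URI never needs a format suffix.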

Representational State Transfer (REST)

The Web architecture has influenced the REST genesis and the way other kinds of distributed systems are created. In its essence the Web is based on these fundamental principles (architectural style):

REST Context

Resource Representation

A resource is the basic building block of a distributed system (and the Web) and represents anything that a service can expose such as a document, a video or a business process. What is exposed is not the actual resource but its representation. This representation is encoded in one or more transferrable formats such as HTML, XML, JSON, plain text, JPEG and so on. A simple example is a Web page.

A service accesses a resource representation, never the underlying resource. This separation allows loose coupling between a client application and the service and also allows scalability because a representation can be cached and replicated.

Each representation is a view of the same actual resource, with transfer formats negotiated at runtime through a content negotiation mechanism.

Resource State

A service progresses by transitioning from one state to another like in a state machine. The key difference is that in the service the possible states and the transitions amongst them are not known in advance. As the service gets to a new state, the next possible transitions are discovered.

For example, in a hypermedia system the states are defined by uniquely identifiable resources. The identifiers of the states that can be transitioned to are contained in the current (state) representation as hyperlinks. Hence the name Representational State Transfer (REST).
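The idea that the next transitions are discovered from the current representation can be sketched as follows. The order resource, its states, and the /orders/order/{id}/{action} link format are hypothetical; the point is only that the client acts on the links it is given rather than on knowledge built in ahead of time.

```python
# Hypothetical order resource: each state lists the transitions it allows.
TRANSITIONS = {
    "created": {"pay": "paid", "cancel": "cancelled"},
    "paid": {"ship": "shipped"},
    "shipped": {},
    "cancelled": {},
}


def represent(order_id, state):
    """Service side: embed the possible next states as links in the representation."""
    links = {rel: "/orders/order/%s/%s" % (order_id, rel) for rel in TRANSITIONS[state]}
    return {"id": order_id, "state": state, "links": links}


def follow(representation, rel):
    """Client side: take a transition only if the current representation offers it."""
    if rel not in representation["links"]:
        raise ValueError("transition %r is not offered in state %r"
                         % (rel, representation["state"]))
    next_state = TRANSITIONS[representation["state"]][rel]
    return represent(representation["id"], next_state)
```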

Resource Address

A service can act on a resource (i.e., its representation) through a very well defined set of verbs provided by the HTTP protocol. For more information, see HTTP/1.1 (RFC 2616). This set provides a uniform interface: a small number of verbs with well defined and widely accepted semantics that meet the requirements of a distributed system.

To act upon a resource the service must be able to identify it unequivocally. This is done through the Uniform Resource Identifier (URI). A URI uniquely identifies a resource and makes it addressable, or capable of being manipulated using a protocol such as HTTP.

A one-to-many relationship exists between a resource and its URIs. A URI identifies only one resource, but a resource can have more than one URI.

A URI takes the following form:

http://scheme-specific_structure

For example

http://myserver/croissants/croissant/chocolate-truffle

establishes that the URI must be interpreted by the service according to the HTTP scheme. Notice that the previous URI does not specify the resource format.

Best Practices

It is good practice not to specify the format in the URI by adding a suffix to the resource such as .html, .xml, or .jsp. It is the responsibility of the service to provide the correct format as specified by the client in the Accept header of the HTTP request (content negotiation). This allows loose coupling between the client and the service.

Service Maturity Level

Leonard Richardson created a maturity model to classify services as highlighted next.

  • Level 0. Services use a single URI (end point) to “tunnel” remote procedure invocation. Typically they use only one HTTP verb (POST or GET) to communicate with the end point. SOAP Web Services and Plain Old XML (POX) services are examples at this level.
  • Level 1. Services use multiple URIs (end points) and a single HTTP verb (GET or POST).  A client uses multiple resources to perform different tasks.
  • Level 2. Services use multiple URIs (end points) and multiple HTTP verbs. The idea is to closely follow the semantics of the HTTP verbs according to the specification. That is, GET reads, POST creates, PUT updates, and DELETE removes a resource, similar to the CRUD operations.
  • Level 3. Services provide links to other resources to the client. RESTful services, in particular HATEOAS (Hypermedia As The Engine Of Application State) services, operate at Level 3 maturity.

 

Putting to REST the Cloud Service APIs

I had to deal with the Representational state transfer (REST) API conundrum in the past. A lot of information obfuscated what in its essence is a very straightforward concept.

The REST API is not strictly speaking an “API”, if you refer to the HTTP protocol verbs. REST, in the words of its originator Roy Fielding, is an “architectural style”. Specifically, it is a set of guidelines that hinge on the HTTP protocol specifications. See http://www.w3.org/Protocols/.

Why should you care about this mumbo jumbo? Well, believe it or not, this style is the cornerstone of all Cloud (Web) services. At its core a Cloud service is stateless. All the information about state is on the client side. For example, if you take a web page, once it has been delivered to the client, there is no memory on the server side keeping track of “what to do next”. What to do next, that is, all the state information, is in the page itself in the form of hyperlinks. In essence, besides its HTML content, the hyperlinks in the page are the representational state transfer.

REST guidelines instruct developers to use HTTP verbs explicitly and in a way that is consistent with the protocol definition. This means a one-to-one mapping between create, read, update, and delete (CRUD) operations and HTTP verbs as follows:

  • Create a resource with POST
  • Retrieve a resource with GET
  • Update a resource with PUT
  • Delete a resource with DELETE

You can find more verbs here.
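The mapping between CRUD operations and HTTP verbs can be sketched with a tiny in-memory service. The class name, the resource names, and the choice of status codes for edge cases (for example, 409 when creating a resource that already exists, or 404 when updating a missing one) are assumptions of this example.

```python
class CroissantService:
    """Hypothetical in-memory service that dispatches on the HTTP verb."""

    def __init__(self):
        self.store = {}

    def handle(self, verb, name, body=None):
        """Return (status code, representation) for a request on one resource."""
        if verb == "POST":  # create
            if name in self.store:
                return 409, None  # Conflict: resource already exists
            self.store[name] = body
            return 201, body  # Created
        if verb == "GET":  # read
            if name not in self.store:
                return 404, None
            return 200, self.store[name]
        if verb == "PUT":  # update
            if name not in self.store:
                return 404, None
            self.store[name] = body
            return 200, body
        if verb == "DELETE":  # delete
            if name not in self.store:
                return 404, None
            del self.store[name]
            return 204, None  # No Content
        return 405, None  # Method Not Allowed
```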

Why is REST Popular?

Because HTTP is ubiquitous as a high-level protocol in the application layer of the OSI model, the HTTP (REST) verbs have become the minimalist, effective, and standard way to exchange information (text, images, videos, and so on) on the Internet.

The following figure shows where the HTTP protocol is located in the OSI model:

OSI Model

Where is the API?

Well, any company that is in the web services business (shall we say Cloud) provides a set of libraries to support the most common programming languages (Java, Python, C#, etc.). A programmer can use a familiar programming language to perform requests associated with HTTP verbs. She must know the company-specific library to make API calls. The library translates the calls into the appropriate HTTP verb requests. It also fills in the HTTP protocol details such as header information.

The following is a logical diagram, albeit a simplified one, of the main components involved in a RESTful exchange:

REST API

The following code examples show how to list the objects in a bucket (Python) and the buckets in a project (Java) stored in a public cloud storage (“the service”):

    Python

    import json

    def listObjects(bucketName, service):
      # 'service' is an authorized storage client object created elsewhere.
      print('List objects contained by the bucket "%s".' % bucketName)
      # Ask the service to return only the fields of interest.
      fields_to_return = 'nextPageToken,items(bucket,name,metadata(my-key))'
      request = service.objects().list(bucket=bucketName, fields=fields_to_return)
      response = request.execute()
      print(json.dumps(response, indent=2))
    

    Java

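    // 'service' is an authorized Storage client; 'settings' holds the project ID.
    Storage.Buckets.List listBuckets =
      service.buckets().list(settings.getProject());
    // execute() sends the underlying HTTP request and parses the response.
    Buckets buckets = listBuckets.execute();
    for (Bucket bucket : buckets.getItems()) {
      displayMessageHeader("Getting bucket " + bucket.getName() + " metadata");
      displayBucketInformation(bucket);
    }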
    

Under the hood, a GET request is issued. But you do not see it in the code snippet above.
What you see is an API call as highlighted.
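To give an idea of what a client library hides, the following sketch builds (but does not send) the kind of HTTP request that a list-objects call might translate to. The function name is hypothetical, and the endpoint and header layout follow the Cloud Storage JSON API as an assumption of this example; a real library also handles authentication, retries, and paging.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumed base URL of the storage service's REST interface.
BASE_URL = "https://storage.googleapis.com/storage/v1"


def build_list_objects_request(bucket_name, access_token, fields=None):
    """Translate a 'list objects' API call into an HTTP GET request object."""
    url = "%s/b/%s/o" % (BASE_URL, quote(bucket_name, safe=""))
    if fields:
        url += "?" + urlencode({"fields": fields})
    # The library fills in protocol details such as the Authorization header.
    headers = {"Authorization": "Bearer " + access_token}
    return Request(url, headers=headers, method="GET")
```

Calling build_list_objects_request("my-bucket", token) returns a request object whose verb, URL, and headers can be inspected before anything goes over the network.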

An Introduction to Cloud Computing

Overview

The Cloud is a computing model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. For more information, see  The NIST Definition of Cloud Computing .

This is a technological breakthrough compared to the traditional approach where resources had to be allocated in advance with the danger of overestimating (or underestimating) the needs.

But, most importantly, in the cloud the allocation is done automatically and in real time. This is the elasticity attribute of the cloud. The cloud’s main architectural principle is predicated on delivering IT services on demand. The result is software architectures with qualities such as elasticity, auto-scaling, fault tolerance, and administration automation.

An extension of this is the concept of application as a service, usually a REST web service. For more information about designing for the cloud, see Cloud Ready Design Guidelines.

From a hardware point of view, three aspects are new in cloud computing:

  • The “infinite” computing resources available on demand, thereby eliminating the need for users to plan far ahead for provisioning
  • The elimination of an up-front commitment by the users, thereby allowing companies to start small and increase hardware resources only when there is an increase in their needs
  • The ability to pay for use of computing resources on a short-term basis as needed (e.g., processors by the hour and storage by the day) and release them as needed, thereby rewarding conservation by letting machines and storage go when they are no longer useful.
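The difference between provisioning for peak load in advance and paying only for what is used can be illustrated with a little arithmetic. The load figures, instance capacity, and price used with this sketch are made-up values, not real pricing.

```python
import math


def instances_needed(hourly_load, capacity_per_instance):
    """Instances required in each hour when capacity follows demand."""
    return [math.ceil(load / capacity_per_instance) for load in hourly_load]


def compare_costs(hourly_load, capacity_per_instance, price_per_instance_hour):
    """Return (cost when provisioned for peak, cost when paying per use)."""
    needed = instances_needed(hourly_load, capacity_per_instance)
    # Traditional approach: enough machines for the peak, running all the time.
    fixed_cost = max(needed) * len(needed) * price_per_instance_hour
    # Cloud approach: pay only for the instances actually running each hour.
    on_demand_cost = sum(needed) * price_per_instance_hour
    return fixed_cost, on_demand_cost
```

For a hypothetical load of 100, 100, 900, and 100 requests per hour, instances that each handle 100 requests, and a price of 0.5 per instance hour, compare_costs() returns (18.0, 6.0): the short peak dominates the cost only when capacity is provisioned in advance.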

You may want to take a look at the following video to understand the difference between cloud and traditional virtualization: Cloud and Virtualization.

Cloud Deployment and Service Models

Deployment models define different types of ownership and distribution of the resources used to deliver cloud services to different customers.

Deployment Models

Cloud environments may be deployed over a private infrastructure, public infrastructure, or a combination of both.

The most common deployment models as defined by the National Institute of Standards and Technology (NIST) include the following:

  • Private cloud. The cloud infrastructure is operated solely for a single organization (client). It may be managed by the organization itself or a third-party provider, and may be on-premise or off-premise. However, it must be solely dedicated for the use of one entity.
  • Community cloud. The cloud infrastructure is shared by several organizations and supports a specific community with shared requirements or concerns (for example, business model, security requirements, policy, or compliance considerations). It may be managed by the organizations or a third party, and may be on-premise or off-premise.
  • Public cloud. The cloud infrastructure is made available to the general public or a large industry group and is owned by a cloud provider (an organization selling cloud services). Public cloud infrastructure exists on the premises of the cloud provider.
  • Hybrid cloud. The cloud infrastructure is a composition of two or more clouds (private, community, or public) that remain unique entities but are bound together by technology to enable portability. Hybrid clouds are often used for redundancy or load-balancing purposes. For example, applications within a private cloud could be configured to utilize computing resources from a public cloud as needed during peak capacity times.

Service Models

Service models identify different control options for the cloud client and cloud provider. For example, SaaS clients simply use the applications and services provided by the provider, whereas IaaS clients maintain control of their own environment hosted on the provider’s underlying infrastructure. The following are the most commonly used service models:

  1. Software as a Service (SaaS). It enables the end user to access applications that run in the cloud. The applications are accessible from various client devices through a thin interface such as a web browser. Some examples are:
    1. Gmail
    2. Google Docs
    3. Microsoft Office 365
  2. Platform as a Service (PaaS). It enables the deployment of applications in the cloud. These applications are created using programming languages and tools supported by the cloud provider. Some examples are:
    1. Google App Engine
    2. AWS Elastic Beanstalk
  3. Infrastructure as a Service (IaaS). It enables the provisioning of compute processing, storage, networks, and other computing resources to deploy and run applications. You cannot control the underlying physical infrastructure, though. Some examples are:
    1. Amazon S3
    2. Google Compute Engine
    3. Google Cloud Storage
    4. Google BigQuery

The following picture depicts the service models and the way they stack up:

You can find the above picture and more information at NIST Cloud Computing Reference Architecture. The next picture shows the control and responsibilities for cloud clients and providers across the service models:

Cloud Logical Architecture

The cloud architecture is structured in layers. Each layer abstracts the one below it and exposes interfaces that layers above can build upon. The layers are loosely coupled and provide horizontal scalability (they can expand) if needed. As you can see in the next picture, the layers map to the service models described earlier.

As shown in the previous picture, the cloud architecture contains several layers, as described next.

  • Hosting Platform. Provides the physical, virtual, and software components. These components include servers, operating system, network, storage devices, and power control and virtualization software. All these resources are abstracted as virtual resources to the layer above.
    The virtual machine (VM) is at the core of the cloud virtualization. It represents a software implementation of a computing environment in which an operating system and other apps can run. The virtual machine typically emulates a physical computing environment, but requests for CPU, memory, hard disk, network, and other hardware resources are managed by a virtualization layer which translates these requests to the underlying physical hardware.
    VMs are created within a virtualization layer, such as a hypervisor that runs on top of a client or server operating system. This operating system is known as the host OS. The virtualization layer can be used to create many individual, isolated VM environments.
  • Infrastructure Services. The important function of this layer is to abstract the hosting platform as a set of virtual resources and to manage them based on scalability and availability. The layer provides three types of abstract resources: compute, storage and network. It also exposes a set of APIs to access and manage these resources. This enables a user to gain access to the physical resources without knowing the details of the underlying hardware and software and to control them through configuration. Services provided by this layer are known as Infrastructure as a Service (IaaS).
  • Platform Services. Provides a set of services to help integrate on-premise software with services hosted in the cloud. Services provided by this layer are known as Platform as a Service (PaaS).
  • Applications. Contains applications built for cloud computing. They expose web interfaces and services and enable multitenant hosting. Services provided by this layer are known as Software as a Service (SaaS).

The vertical bars in the picture represent components that apply to all layers with different degrees of scope and depth. Mainly they support administrative functions, handling of security, and cloud programmability (the latter supporting the most common programming languages).