Monthly Archives: December 2013

Google Cloud Storage Python Applications Prerequisites

This post describes the prerequisites for building Google Cloud Storage applications in Python. To this end we’ll use the Python programming language, the Mac OS X 10.x platform, and the Eclipse IDE. Later we’ll focus on other programming languages and/or platforms.

Install Python

Mac OS X 10.x comes with Python already installed. Do not rely on that system copy; install a separate Python 2.7.x instead, for example from here: http://www.python.org/download/releases/2.7.6/.

Apple uses its own version of Python and proprietary extensions to implement some of the software distributed as a part of Mac OS X. Unless you know what you are doing, do not mess with it.

After the installation, make sure that the environment variable PATH is properly set in the ~/.bash_profile as follows:

PATH="/Library/Frameworks/Python.framework/Versions/2.7/bin:${PATH}"
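To confirm that the PATH change took effect, you can ask the interpreter itself which binary is running and which version it reports. The following is a minimal sketch; the function name is_python_27 is our own and is not part of any library.

```python
import sys


def is_python_27(version_info=None):
    """Return True when the given (or the running) interpreter is Python 2.7.x."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only the major and minor numbers, e.g. (2, 7).
    return tuple(version_info[:2]) == (2, 7)


if __name__ == "__main__":
    # sys.executable is the full path of the interpreter that PATH resolved to.
    print("Interpreter: %s" % sys.executable)
    print("Version: %s" % sys.version.split()[0])
    print("Python 2.7.x: %s" % is_python_27())
```

If the interpreter path does not start with /Library/Frameworks/Python.framework/Versions/2.7/bin, the Terminal is probably still picking up the system Python; restart the Terminal and check ~/.bash_profile again.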

Install PyDev in Eclipse

PyDev is a Python IDE for Eclipse. Its latest version requires the Java SE Development Kit (JDK) version 7 or later. If you do not have this Java version, install it from here: Java SE Development Kit 7. Then follow these steps:

  1. Start Eclipse.
  2. In the menu bar click Help->Install New Software.
  3. In the Available Software window click the Add button.
  4. In the Add Repository window perform these operations:
    • In the Name box enter PyDev.
    • In the Location box enter http://pydev.org/updates/.
    PyDev Repository

  5. Click OK.
  6. In the list that is displayed, select the check box next to PyDev.
  7. Accept the default selection and click Next.
  8. Accept the terms and conditions, then click Finish.

This installs the latest PyDev version.

Configure PyDev

If you have more than one Python version installed, you must configure PyDev to use the Python 2.7.x interpreter as follows:

  1. In the Eclipse menu bar, click Eclipse->Preferences.
  2. In the left pane of the Preferences window, expand the PyDev node.
  3. Expand the Interpreter node.
  4. Click Python Interpreter.
  5. In the right pane, click the Quick Auto-Config button. This lets you configure the default Python interpreter (i.e., Python 2.7). A selection window is displayed.
  6. Select python. In the lower section you will see the list of Python 2.7 libraries.
  7. Click OK.

Create a Google Cloud Storage Project

Select or create a Cloud Storage Project as described here: How to activate Google Cloud Storage. For your convenience, the steps are also described next.

  1. If you do not have a Google account, create one. For more information, see Create a Google Account.
  2. Open the Google Cloud Console. Then perform the following steps:
    • If you already have a project, select it.
    • If you do not have a project, create one. Note the project ID. You will use it often when performing Google Cloud Storage operations.
  3. Enable Google Cloud Storage for the project as follows:
    • In the console left pane, expand the APIs & auth node and select APIs.
    • In the console right pane, switch the button next to Google Cloud Storage from Off to On.
  4. Enable billing.
  5. In the left pane click Settings.
  6. In the right pane click Billing Account for [your project name] and perform these steps:
    • Set your billing profile.
    • Select your payment method.
    • Submit and activate the account.
    • Make sure that your primary e-mail address is verified so you can receive billing information.

That’s it. Now you are ready to use the service.

Enabling billing does not necessarily mean that you will be charged. For more information, see Pricing and Terms.

Let’s verify that you can use the service with the gsutil tool. Make sure that you have installed the tool first. For more information, see Install gsutil Tool.

  1. Open a Terminal window.
  2. At the command prompt enter: gsutil mb gs://<unique bucket name>. This creates a bucket in your project using the default region. Note that the bucket name must be unique across the entire Google Cloud Storage namespace.
  3. To verify that the bucket has been created, at the command prompt enter: gsutil ls. The bucket name you just created will be listed.
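The same two commands can be driven from Python, which is handy once you start scripting your setup. This is only a sketch: the function name create_and_verify_bucket and the run parameter are hypothetical, and it assumes that gsutil is on your PATH and that gsutil ls prints one gs://<bucket name>/ line per bucket.

```python
import subprocess


def create_and_verify_bucket(bucket_name, run=subprocess.check_output):
    """Run 'gsutil mb' and then 'gsutil ls'; return True if the bucket is listed.

    'run' is any callable that takes a command list and returns its output.
    It defaults to subprocess.check_output, which raises CalledProcessError
    when gsutil exits with a non-zero status.
    """
    url = "gs://" + bucket_name
    run(["gsutil", "mb", url])
    listing = run(["gsutil", "ls"])
    if isinstance(listing, bytes):
        listing = listing.decode("utf-8")
    # Normalize each output line by dropping whitespace and the trailing slash.
    listed = [line.strip().rstrip("/") for line in listing.splitlines()]
    return url in listed


# Example call (requires a configured gsutil and a globally unique name):
# print(create_and_verify_bucket("my-unique-bucket-name"))
```

Passing the command runner as a parameter keeps the function easy to try without touching a real project.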

For a complete list of gsutil commands and related syntax, see gsutil Tool.

Install gsutil Tool

To install the tool follow the steps described next. For more information, see Install gsutil.

  1. Download gsutil.tar.gz.
  2. Extract the archive in the directory of your choice as follows: tar xfz gsutil.tar.gz -C ~/myDir. If you omit the -C option, tar extracts the archive into the current directory.
  3. Add gsutil to your PATH environment variable. On a Mac add the following to ~/.bash_profile: PATH=${PATH}:~/myDir/gsutil.
  4. Restart the Terminal.
  5. At the command prompt enter gsutil. The tool help should be displayed.
  6. At the command prompt enter: gsutil config. A link is displayed. You use this link to configure the tool with the security information it needs to access your project.
  7. Open a new browser session and go to the link obtained in the previous step.
  8. Click the Accept button. An access code is displayed.
  9. Copy the access code and enter it in the Terminal window.
  10. Enter your project ID. gsutil then creates the .boto configuration file, which contains information such as the security data needed when performing Google Cloud Storage operations.
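The .boto file is a plain INI-style text file, so you can inspect it from Python. The sketch below reads the project ID back out of it. It assumes that gsutil stored the ID as default_project_id in a [GSUtil] section; check your own ~/.boto file, because the layout may differ between gsutil versions. The function name is our own.

```python
import os

try:
    import configparser                      # Python 3
except ImportError:
    import ConfigParser as configparser      # Python 2.7


def read_default_project_id(path=None):
    """Return the project ID stored in a .boto file, or None if it is absent."""
    if path is None:
        path = os.path.expanduser("~/.boto")
    parser = configparser.RawConfigParser()
    # read() returns the list of files it could parse; empty means no file.
    if not parser.read(path):
        return None
    # Assumed location of the value: [GSUtil] default_project_id
    if parser.has_option("GSUtil", "default_project_id"):
        return parser.get("GSUtil", "default_project_id")
    return None


if __name__ == "__main__":
    print("Default project ID: %s" % read_default_project_id())
```

The file also holds your credentials, so avoid printing or sharing its full contents.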

Build Google Cloud Storage Client Applications

You can build Google Cloud Storage client applications by selecting one of the supported RESTful APIs. Google Cloud Storage (GCS) supports two kinds of APIs, as described next.

XML API

The XML API is the first API created by the GCS development team. It uses the HTTP protocol with the payload in XML format.

XML API Context

This API is used by current and earlier applications, mainly written in Python using the boto library or in Java using other libraries such as JetS3t.

The XML API v1.0 is interoperable with some cloud storage tools and libraries that work with services such as Amazon Simple Storage Service (Amazon S3) and Eucalyptus Systems, Inc.

The following Python code snippet shows how to list the buckets contained in a project using the boto library. In future posts, we’ll show you how to exercise other parts of the XML API using the same library.

import logging

import boto
import boto.exception


def list_buckets(project_id, debug_level):
    '''
    Perform a GET Service operation to list the buckets
    contained in the specified project.
    @param project_id: The id of the project that contains
    the buckets to list.
    @param debug_level: The level of debug messages to be printed.
    '''
    try:
        # URI scheme for Google Cloud Storage.
        GOOGLE_STORAGE = "gs"

        # An empty URI string with the "gs" scheme addresses the
        # service itself rather than a single bucket.
        uri = boto.storage_uri("", GOOGLE_STORAGE, debug_level)

        # Define the header values.
        header_values = {"x-goog-api-version": "2",
                         "x-goog-project-id": str(project_id)}

        # List the buckets in the project.
        for bucket in uri.get_all_buckets(headers=header_values):
            print(bucket.name)

    # Catch the error classes that boto raises for client-side
    # and server-side failures.
    except (boto.exception.BotoClientError,
            boto.exception.BotoServerError) as e:
        logging.error("list_buckets, error occurred: %s", e)

For testing purposes, you can use the XML API directly with the curl tool.
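When you call the XML API directly, the list of buckets comes back as an XML document that you must parse yourself. The sketch below extracts the bucket names from such a payload using only the standard library. The sample response is hand-written for illustration and modeled on the S3-compatible format; the real response may carry additional elements. The helper names are our own.

```python
import xml.etree.ElementTree as ET

# Hand-written sample payload (an assumption, not captured from the service).
SAMPLE_RESPONSE = """<ListAllMyBucketsResult xmlns="http://doc.s3.amazonaws.com/2006-03-01">
  <Owner><ID>example-owner-id</ID></Owner>
  <Buckets>
    <Bucket><Name>travel-maps</Name><CreationDate>2013-12-01T10:00:00.000Z</CreationDate></Bucket>
    <Bucket><Name>receipts</Name><CreationDate>2013-12-02T11:30:00.000Z</CreationDate></Bucket>
  </Buckets>
</ListAllMyBucketsResult>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def bucket_names_from_xml(xml_text):
    """Return the text of every Name element found inside a Bucket element."""
    root = ET.fromstring(xml_text)
    names = []
    for element in root.iter():
        if local_name(element.tag) == "Bucket":
            for child in element:
                if local_name(child.tag) == "Name":
                    names.append(child.text)
    return names


if __name__ == "__main__":
    for name in bucket_names_from_xml(SAMPLE_RESPONSE):
        print(name)
```

Matching on the local tag name keeps the parser independent of the exact XML namespace the service declares.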

JSON API

The JSON API is the second API created by the GCS development team. It uses the HTTP protocol with the payload in JSON format. At the moment, this API is still in the experimental stage.

JSON API Context

JSON format is poised to become the standard way to communicate with any Google cloud service. Even though the details may differ from one service to another, once you know how to use a certain API, you should be able to apply this knowledge anywhere else.

You can try the JSON API directly from the browser. For example, if you already have a project, you can list its buckets from this location: Bucket:List.
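To give an idea of what a bucket listing looks like at the HTTP level, the sketch below builds the request URL and extracts the bucket names from a JSON response. The endpoint shown is the v1 endpoint as we understand it today, not the experimental version discussed in this post, so treat API_ROOT as an assumption. The function names are our own, and the request itself (which needs an OAuth 2.0 access token) is left out.

```python
import json

try:
    from urllib.parse import urlencode      # Python 3
except ImportError:
    from urllib import urlencode            # Python 2.7

# Assumed endpoint; the experimental API version used a different path.
API_ROOT = "https://storage.googleapis.com/storage/v1"


def list_buckets_url(project_id):
    """Build the URL of the request that lists the buckets of a project."""
    return API_ROOT + "/b?" + urlencode({"project": project_id})


def bucket_names_from_json(json_text):
    """Return the bucket names found in a bucket-list response body."""
    payload = json.loads(json_text)
    # The 'items' key is absent when the project has no buckets.
    return [item["name"] for item in payload.get("items", [])]


if __name__ == "__main__":
    print(list_buckets_url("my-project-id"))
    sample = '{"kind": "storage#buckets", "items": [{"name": "travel-maps"}]}'
    print(bucket_names_from_json(sample))
```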

The JSON API client libraries support several programming languages, which allows for a wider range of applications than the XML API does. For information about the supported languages, see Libraries.

Both the XML and JSON APIs use the HTTP protocol as defined by the HTTP/1.1 specification and provide a RESTful interface for accessing Google Cloud Storage to perform Create, Read, Update, Delete (CRUD) operations. The first API encodes the payload in XML format, while the second encodes it in JSON format.

Conclusions

No matter which format you use, you will rarely build your HTTP method calls from scratch. In theory, you could get down to the metal and use the protocol directly. However, instead of creating HTTP requests and parsing responses manually, you may want to use the Google APIs client libraries.

You could use a general-purpose HTTP client library such as httplib2, but it is advisable to stay with the supported Google libraries. They provide better language integration, improved security, and support for making calls that require user authorization.
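As a rough illustration of what the client library buys you, the sketch below lists bucket names through a service object of the Google APIs Client Library for Python, following result pages until none are left. It assumes the current googleapiclient package name and the v1 version of the storage API, both of which postdate this post. The function name list_bucket_names is our own.

```python
def list_bucket_names(service, project_id):
    """Collect the names of all buckets in a project, page by page.

    'service' is expected to be a storage service object created with the
    Google APIs Client Library (see the commented example below).
    """
    names = []
    buckets = service.buckets()
    request = buckets.list(project=project_id)
    while request is not None:
        response = request.execute()
        names.extend(item["name"] for item in response.get("items", []))
        # list_next returns None when there are no more pages.
        request = buckets.list_next(request, response)
    return names


# Example call (assumes the library is installed and credentials are set up):
# from googleapiclient.discovery import build
# service = build("storage", "v1")
# print(list_bucket_names(service, "my-project-id"))
```

Note that the function never touches URLs, headers, or JSON parsing; the library handles those details, along with authorization.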