Rico Suter's blog.
 


  • Category: Object storage (binary, unstructured)
  • Docker image: Not available
  • Local test replacement: Local file system

Microsoft Azure Blob Storage stores chunks of binary data so that cloud services and applications can access them. It is mainly used as a replacement for a shared file system, which is usually not available in the cloud.

You can store static files that a website uses frequently, such as images, CSS files, and PDF files, or store uploaded files there instead of putting them into more expensive, structured storage.

When to use

  • When you need to store files but don’t have a shared file system (e.g. applications in the cloud)

Constraints

  • Blob Storage does not provide querying capabilities; a blob can only be accessed by its path/key/name (reading only part of a blob, for example a byte range, is possible)

Concepts

The following list shows the fundamental concepts of the Blob Storage technology:

  • Storage account:
    • A Blob Storage instance is an Azure resource called a storage account. It provides a namespace in Azure and is required to manage and access containers and blobs.
  • Container:
    • Each storage account may have multiple blob containers. Containers are like root directories in which blobs are stored.
  • Blob (Binary Large Object):
    • Binary data, e.g. file content
    • Each blob has a unique name within its container. This name may contain slashes so that the blobs can be displayed as if they were stored in a hierarchical directory structure.
    • There are three Blob types:
      • Block Blobs: Block blobs let you upload large blobs efficiently
      • Append Blobs: An append blob is composed of blocks and is optimized for append operations
      • Page Blobs: Page blobs are a collection of 512-byte pages optimized for random read and write operations
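
To illustrate how slashes in blob names can be presented as a directory structure, here is a minimal sketch that does not call Azure at all. It only takes a flat list of blob names and returns what would appear directly below a given prefix. The class name BlobNameHierarchy and the method ListChildren are hypothetical helpers made up for this example; the storage service itself offers a comparable listing by prefix and delimiter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of any Azure library.
public static class BlobNameHierarchy
{
    // Returns the entries directly below the given prefix:
    // blobs as full names, "virtual directories" with a trailing slash.
    public static IReadOnlyList<string> ListChildren(IEnumerable<string> blobNames, string prefix)
    {
        return blobNames
            .Where(name => name.StartsWith(prefix, StringComparison.Ordinal))
            .Select(name =>
            {
                // Cut the name at the first slash after the prefix (if any)
                var rest = name.Substring(prefix.Length);
                var slash = rest.IndexOf('/');
                return slash < 0 ? prefix + rest : prefix + rest.Substring(0, slash + 1);
            })
            .Distinct()
            .OrderBy(entry => entry, StringComparer.Ordinal)
            .ToList();
    }
}
```

For the flat names images/logo.png, images/icons/add.png and readme.txt, calling ListChildren with an empty prefix yields images/ and readme.txt, and calling it with the prefix images/ yields images/icons/ and images/logo.png. The container still stores a flat list; the hierarchy exists only in how the names are interpreted.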

Security

The authorization level can be defined per container:

  • Private: Only someone with the credentials for the storage account can access the container and the blobs in it.
  • Blob: The container is private, but its blobs are publicly readable: anyone who knows a blob's URL can read it, but cannot list the contents of the container.
  • Container: The container and its blobs are public: anyone who knows the container name can access all of the blobs, list them, and iterate through them.

There are three ways to access stored blobs:

  • Direct access: If the blob is public, it can be accessed via its HTTP URL, which has the following form:
      https://mystoragename.blob.core.windows.net/mycontainername/myblobname
    
  • Pass-through to user (front-end proxy service): A cloud application with full access to the blob storage loads the blob data and passes it through to an authorized client. In an ASP.NET MVC application, you create an authorized controller action which has the storage credentials needed to access the file, and have the application require authentication before letting the requester download the file.
  • Access with an SAS token
    • A service authenticates a client as needed, generates an SAS (Shared Access Signature) token in the backend, and provides it to the client.
    • This token can have an expiry date and is appended as a query string to the URL of the private blob so that the client can access the blob directly.
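
The SAS approach can be sketched with the same .NET client library that is used in the Usage section (Microsoft.Azure.Storage.Blob). This is a minimal example under assumptions, not a complete authorization flow: the class name SasUrlSample, the method name CreateReadOnlySasUrl, and the choice of a read-only permission are made up for illustration, and authenticating the client before handing out the URL is left to the calling service.

```csharp
using System;
using Microsoft.Azure.Storage;
using Microsoft.Azure.Storage.Blob;

// Hypothetical helper for illustration only.
public static class SasUrlSample
{
    // Builds a URL that grants read access to a single blob for a limited time.
    public static string CreateReadOnlySasUrl(CloudBlockBlob blob, TimeSpan lifetime)
    {
        var policy = new SharedAccessBlobPolicy
        {
            Permissions = SharedAccessBlobPermissions.Read,
            SharedAccessExpiryTime = DateTimeOffset.UtcNow.Add(lifetime)
        };

        // The token is computed locally from the account key; no request is sent
        string sasToken = blob.GetSharedAccessSignature(policy);

        // Append the token as the query string of the blob URL
        return blob.Uri + sasToken;
    }
}
```

A short lifetime (minutes rather than days) limits the damage if the URL leaks, because anyone who holds the URL can read the blob until the token expires.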

Data replication

To avoid data loss, you can choose between the following replication modes:

  • Locally redundant storage (LRS): This means three copies of your blobs are stored in a single facility in a single region
  • Zone-redundant storage (ZRS): This replicates your data across three availability zones within a single region
  • Geo-Redundant Storage (GRS): This replicates your data three times in your chosen data center, and then replicates it three times in a secondary data center that is far away.
  • Read-Access Geo-Redundant Storage (RA-GRS): This is geo-redundant storage plus the ability to read the data in the secondary data center.

Usage

The common usage scenarios are:

  • Access over an HTTP RESTful API
  • Access using a client library, for example in .NET (see sample below)
  • Access with an Azure storage management application like Cerulean

The usage of the .NET client library (Microsoft.Azure.Storage.Blob) is straightforward:

using System.IO;
using Microsoft.Azure.Storage;
using Microsoft.Azure.Storage.Blob;

var connectionString = "myconnectionstring";
var containerName = "mycontainername";
var blobName = "myblobname";

if (CloudStorageAccount.TryParse(connectionString, out var storageAccount))
{
    CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
    CloudBlobContainer blobContainer = blobClient.GetContainerReference(containerName);

    // Create the container if it does not exist yet
    await blobContainer.CreateIfNotExistsAsync();

    // Retrieve blob reference (this does not contact the service yet)
    CloudBlockBlob blob = blobContainer.GetBlockBlobReference(blobName);

    // Read blob to a string
    using (var stream = new MemoryStream())
    {
        // Here you can also directly stream into the
        // HTTP response stream to avoid memory allocations:
        // await blob.DownloadToStreamAsync(HttpContext.Response.Body);

        await blob.DownloadToStreamAsync(stream);

        // Rewind the stream so that it can be read from the beginning
        stream.Position = 0;
        using (var streamReader = new StreamReader(stream))
        {
            var text = await streamReader.ReadToEndAsync();
            // TODO: Use text
        }
    }
}
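
Since no Docker image is available and the local file system is listed above as the test replacement, one possible way to keep application code testable is to hide the storage behind a small interface and swap in a file-based implementation for local tests. The following is only a sketch under assumptions: IBlobStore and FileSystemBlobStore are hypothetical names, not part of any Azure library, and an Azure-backed implementation of the same interface would wrap calls like the ones shown above.

```csharp
using System.IO;
using System.Threading.Tasks;

// Hypothetical abstraction over a blob store.
public interface IBlobStore
{
    Task WriteAsync(string containerName, string blobName, Stream content);
    Task<Stream> OpenReadAsync(string containerName, string blobName);
}

// Local test replacement: containers become directories, blobs become files.
public sealed class FileSystemBlobStore : IBlobStore
{
    private readonly string _rootDirectory;

    public FileSystemBlobStore(string rootDirectory)
    {
        _rootDirectory = rootDirectory;
    }

    public async Task WriteAsync(string containerName, string blobName, Stream content)
    {
        var path = GetPath(containerName, blobName);

        // Create the "container" and any virtual directories on demand
        Directory.CreateDirectory(Path.GetDirectoryName(path));
        using (var file = File.Create(path))
        {
            await content.CopyToAsync(file);
        }
    }

    public Task<Stream> OpenReadAsync(string containerName, string blobName)
    {
        Stream stream = File.OpenRead(GetPath(containerName, blobName));
        return Task.FromResult(stream);
    }

    private string GetPath(string containerName, string blobName)
    {
        // Slashes in the blob name map to real subdirectories.
        // Names are not validated here, so use this with trusted test input only.
        var relativePath = blobName.Replace('/', Path.DirectorySeparatorChar);
        return Path.Combine(_rootDirectory, containerName, relativePath);
    }
}
```

The file-based store does not reproduce access levels, SAS tokens, or replication, so it is only suitable for exercising the read and write paths of an application.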

Pricing

The pricing is calculated from the following factors:

  • The amount of space the blobs take up
  • The number of operations performed
  • The amount of data transferred
  • The selected data redundancy option
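
As a rough illustration of how these factors could combine, the following sketch adds up a storage, an operations, and a transfer component. It assumes simple linear pricing and that the redundancy option is reflected in the per-gigabyte storage rate; actual Azure billing has more dimensions. The class BlobCostEstimate and all parameter names are hypothetical, and all rates must be taken from the current price list, since none are built in.

```csharp
// Hypothetical estimator for illustration; not an official pricing formula.
public static class BlobCostEstimate
{
    public static decimal EstimateMonthlyCost(
        decimal storedGigabytes,
        decimal pricePerGigabyte,            // depends on the redundancy option
        long operations,
        decimal pricePer10000Operations,
        decimal transferredGigabytes,
        decimal pricePerTransferredGigabyte)
    {
        decimal storageCost = storedGigabytes * pricePerGigabyte;
        decimal operationCost = operations / 10000m * pricePer10000Operations;
        decimal transferCost = transferredGigabytes * pricePerTransferredGigabyte;

        return storageCost + operationCost + transferCost;
    }
}
```

With made-up rates of 0.02 per stored gigabyte, 0.05 per 10,000 operations, and 0.08 per transferred gigabyte, storing 100 GB, performing 50,000 operations, and transferring 10 GB gives 2.00 + 0.25 + 0.80 = 3.05. These numbers are placeholders for the arithmetic only, not real prices.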
