Serving Static Content From Azure Storage: Content Delivery Network Setup

As noted in a previous article of this series, a Standard Azure CDN by either Akamai or Verizon will generally serve as an excellent backbone for serving static HTTP content from Azure Storage.

You really only need the Verizon Premium tier if you have to apply CORS origin restrictions, or specific headers, to your content.

I believe the ability to quickly apply changes to CDN files far outweighs the value of prefetching files to the CDN, so I generally default to the Standard Akamai CDN versus the Standard Verizon CDN, which (at this writing) cost the same.

Prepare your files for CDN availability

Because a CDN is applied to an account, all public containers and public-blob containers in that account will be available through the CDN endpoint, and all blobs in each container will be available through that CDN.

So you might want to re-engineer how and where your files are stored.

For example, if you have page blobs (e.g., virtual hard drives) and block blobs (standard files) in the same publicly accessible container, you’ll probably want to move the page blobs into a private container.

I can’t think of any reason why you would need to serve a page blob over a CDN; even if there is a good reason, to me it still reeks of “really bad idea.”

Or, if you want certain blobs to not be available via CDN, you’ll want those to be in a private container.

Finally, you’ll need to have CacheControl headers on all the blobs you intend to serve via CDN, so that your CDN knows when to check for a new version.

Setting CacheControl headers

We can use .NET to loop through all the containers in an account, and all the block blobs in each publicly accessible container, to set the CacheControl property on each blob.

In this sample code, I am going to set cache expiration to 1 hour (that’s 3,600 seconds):

using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;
using System;

namespace AzureCDN
{
    class Program
    {
        static void Main()
        {
            var accountName = "account-name";
            var accountKey = "primary-storage-key";

            //set CacheControl to one hour expiration
            var cacheControl = "public, max-age=3600";

            try
            {
                Console.WriteLine($"Connecting to Azure Storage account {accountName} and creating client.");
                var account = CloudStorageAccount.Parse($"DefaultEndpointsProtocol=https;AccountName={accountName};AccountKey={accountKey}");
                var client = account.CreateCloudBlobClient();

                Console.WriteLine($"Iterating through all containers in account {accountName}");
                foreach (var container in client.ListContainers(null, ContainerListingDetails.All))
                {
                    Console.WriteLine($"Now inspecting container {container.Name}");
                    var permissions = container.GetPermissions();
                    if (permissions.PublicAccess == BlobContainerPublicAccessType.Blob || permissions.PublicAccess == BlobContainerPublicAccessType.Container)
                    {
                        Console.WriteLine($"Container {container.Name} is a public container or has public blobs, proceeding.");
                        //we want to get a flat listing of blobs, since we're going to iterate them all
                        foreach (var blobItem in container.ListBlobs(null, true))
                        {
                            //we only want to cache-control block blobs
                            if (blobItem is CloudBlockBlob)
                            {
                                //cast IListBlobItem to CloudBlockBlob
                                var blob = (CloudBlockBlob)blobItem;
                                if (blob.Exists())
                                {
                                    Console.WriteLine($"Blob {blob.Name} is a CloudBlockBlob and it exists. Setting CacheControl to {cacheControl}");
                                    blob.Properties.CacheControl = cacheControl;
                                    blob.SetProperties();
                                }
                            }
                        }
                    }
                }
            }
            catch(Exception ex)
            {
                Console.WriteLine($"Error encountered: {ex.Message}");
            }

            Console.Read();
        }
    }
}

This code as a GitHub Gist: https://gist.github.com/dougvdotcom/6eab7b2fcb97eba14b0ffc2e71f03880
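If you would rather not hard-code the number of seconds, the header value can be built from a TimeSpan. This is only a minimal sketch; CacheControlHelper and PublicMaxAge are names I made up for illustration and are not part of the Azure Storage library.

```csharp
using System;
using System.Globalization;

static class CacheControlHelper
{
    // Hypothetical helper: builds a "public, max-age=N" value,
    // where N is the cache lifetime in whole seconds.
    public static string PublicMaxAge(TimeSpan lifetime)
    {
        if (lifetime < TimeSpan.Zero)
        {
            throw new ArgumentOutOfRangeException(nameof(lifetime), "Lifetime cannot be negative.");
        }

        // Fractions of a second are dropped; max-age takes an integer.
        var seconds = (long)lifetime.TotalSeconds;
        return "public, max-age=" + seconds.ToString(CultureInfo.InvariantCulture);
    }
}
```

Calling CacheControlHelper.PublicMaxAge(TimeSpan.FromHours(1)) gives the same "public, max-age=3600" string used in the sample above.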

Needless to say, if you’ve been stuffing lots and lots of files into your containers over a long time, it will take a while for this code to execute.

There are async methods (e.g., CloudBlobContainer.GetPermissionsAsync, CloudBlockBlob.SetPropertiesAsync) available. Because this demo code is a console app, and at this writing the Main() of a console app couldn’t be declared async (C# 7.1 and later allow an async Task Main), for simplicity’s sake I didn’t write it that way.
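For the curious, here is roughly what an async version could look like. Treat it as an untested sketch rather than a drop-in replacement: it assumes C# 7.1 or later (for the async Main), the same Microsoft.WindowsAzure.Storage package as the sample above, and the same placeholder account name and key. The AsyncProgram class and the SetCacheControlAsync helper are names of my own.

```csharp
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;
using System;
using System.Threading.Tasks;

namespace AzureCDN
{
    class AsyncProgram
    {
        // An async entry point requires C# 7.1 or later.
        static async Task Main()
        {
            var accountName = "account-name";        // placeholder
            var accountKey = "primary-storage-key";  // placeholder
            var cacheControl = "public, max-age=3600";

            var account = CloudStorageAccount.Parse($"DefaultEndpointsProtocol=https;AccountName={accountName};AccountKey={accountKey}");
            var client = account.CreateCloudBlobClient();

            // The async listing calls return results one segment at a time.
            BlobContinuationToken containerToken = null;
            do
            {
                var containers = await client.ListContainersSegmentedAsync(containerToken);
                containerToken = containers.ContinuationToken;

                foreach (var container in containers.Results)
                {
                    var permissions = await container.GetPermissionsAsync();
                    if (permissions.PublicAccess == BlobContainerPublicAccessType.Blob || permissions.PublicAccess == BlobContainerPublicAccessType.Container)
                    {
                        Console.WriteLine($"Container {container.Name} is public, proceeding.");
                        await SetCacheControlAsync(container, cacheControl);
                    }
                }
            } while (containerToken != null);
        }

        // Hypothetical helper: sets CacheControl on every block blob in a container.
        static async Task SetCacheControlAsync(CloudBlobContainer container, string cacheControl)
        {
            BlobContinuationToken blobToken = null;
            do
            {
                // Flat listing, so blobs in virtual folders are included.
                var blobs = await container.ListBlobsSegmentedAsync(null, true, BlobListingDetails.None, null, blobToken, null, null);
                blobToken = blobs.ContinuationToken;

                foreach (var item in blobs.Results)
                {
                    // Skip page blobs and append blobs.
                    var blob = item as CloudBlockBlob;
                    if (blob == null)
                    {
                        continue;
                    }

                    blob.Properties.CacheControl = cacheControl;
                    await blob.SetPropertiesAsync();
                }
            } while (blobToken != null);
        }
    }
}
```

Note that this still updates blobs one at a time; it frees up the thread while waiting on Azure, but it won’t necessarily finish sooner than the synchronous version.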

Create your CDN

You’re now ready to create the CDN and specify your storage account as its source.

A blade comes up, asking for some settings.

The new Azure CDN blades in the Azure Portal.
  • For Name, enter the name you want assigned to your CDN in the portal. This is simply a friendly name that will appear in the portal to help you identify the CDN; it is not exposed to the public.
  • Select the subscription you want to use, if you have more than one subscription. Note that whatever storage account you want to use must be in the same subscription as the CDN.
  • I recommend putting the CDN into the same resource group as the storage account you want to use as your seed. You can have multiple endpoints per CDN, each representing a different storage account. So it may be that you want to create a CDN in a different resource group, or even its own resource group. However, as a rule I create a new CDN for every Storage account I plan to serve, and put them both in the same resource group, to ensure they remain connected for management purposes.
  • If you’re creating a resource group, choose a region for it. Note that for CDNs alone, the location of the resource group doesn’t really matter; the CDN itself doesn’t really live in Azure or in a single region. However, if you intend to couple location-dependent resources, such as storage accounts, in that new resource group, where you locate it does matter.
  • For pricing tier, select S2 Standard Akamai. (Or you can choose one of the Verizon offerings; but this tutorial is using Standard Akamai because it’s a fine choice for most uses and applies changes faster than Verizon.)

With your CDN created, you can now start assigning endpoints.

  • Locate your CDN in the Portal and click on its tile or entry to open its blade.
  • At the top of that blade, click the “+ Endpoint” button.

A new blade comes up, asking for configuration variables.

The add CDN endpoint blade in the Azure Portal.
  • For name, provide the subdomain name you want assigned to your account.

For example, if you enter “myazurecdn” here, your files will be accessed through the domain myazurecdn.azureedge.net. (You can also assign a custom domain name to an Azure CDN, which is outside the scope of this tutorial.)

Important note: You can only have one storage account per endpoint / subdomain name.

That is, if you want to use two storage accounts in the same Azure CDN, each of those storage accounts must have its own endpoint / subdomain name.

So, if you have StorageAccountAlpha and StorageAccountBeta, you’ll need two endpoints, e.g., storageaccountalpha.azureedge.net and storageaccountbeta.azureedge.net.

  • Under Origin Type, select Storage.
  • Under Origin Hostname, select the Storage account you want to associate with this endpoint / subdomain.
  • You can optionally provide a custom path to these files, e.g. myazurecdn.azureedge.net/path/to/files. However, I wouldn’t do this; it’s basically an open invitation to creating 404 errors, and the SEO or other benefits of a custom path probably aren’t worth it.
  • Under Origin Host Header, leave your storage account as the value. Certain Azure services need this value to coincide with the storage account, and some external requests may not perform correctly if the value here doesn’t jibe with the actual origin of the files in the CDN.
  • You can optionally enable or disable HTTP or HTTPS connectivity and change connection ports. I usually allow both HTTP and HTTPS connections, but you might want to disable HTTP access if your files will always appear on secure webpages, to avoid broken-lock icons.

Check your endpoint

As soon as the endpoint is created, you should be able to call files through it.

Accessing files in your CDN is basically going to follow the same pattern as calling files from Azure Storage directly: {endpoint}.azureedge.net/{container}/{filename} instead of {account}.blob.core.windows.net/{container}/{filename}.

Here’s a comparison. The first URL serves a JPEG directly from Azure Storage; the second serves the same file via CDN.

https://mtmeditorial.blob.core.windows.net/frontpages/PPH300.jpg
http://mtmeditorial.azureedge.net/frontpages/PPH300.jpg
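If you have a pile of existing Storage URLs to convert, the pattern above is easy to apply in code. The following is a minimal sketch; CdnUrl and ToCdnUri are hypothetical names of my own, and it assumes the endpoint uses the default azureedge.net domain rather than a custom domain or custom origin path.

```csharp
using System;

static class CdnUrl
{
    const string BlobSuffix = ".blob.core.windows.net";

    // Hypothetical helper: rewrites
    // {scheme}://{account}.blob.core.windows.net/{container}/{filename}
    // to {scheme}://{endpointName}.azureedge.net/{container}/{filename}.
    public static Uri ToCdnUri(Uri blobUri, string endpointName)
    {
        if (!blobUri.Host.EndsWith(BlobSuffix, StringComparison.OrdinalIgnoreCase))
        {
            throw new ArgumentException("Not an Azure Blob Storage URL.", nameof(blobUri));
        }

        // Keep the scheme, path and query string; swap only the host.
        return new Uri(blobUri.Scheme + "://" + endpointName + ".azureedge.net" + blobUri.PathAndQuery);
    }
}
```

For example, passing the Storage URL above with the endpoint name "mtmeditorial" yields https://mtmeditorial.azureedge.net/frontpages/PPH300.jpg.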

Purging the cache

Sometimes, your CDN will get out of sync with your Storage account. This is especially likely if you have long cache expiration times on your blobs.

To fix this, you can purge the CDN, either its entire contents or a specific file. This makes the CDN fetch a new copy of each purged file on that endpoint the next time it is requested.

  • Log in to the Azure Portal.
  • Find your CDN and click its tile or listing to open its blade.
  • Click the Purge button at the top of the blade.
The purge cache blades in the Azure Portal.
  • In the blade that comes up, select the endpoint that you want to purge.
  • If a lot of files are out of sync, it’s probably best to check the Purge All box.
  • Alternatively, if you only have one or two files out of sync, you can enter the relative path to the file you want purged into the Content path box (repeating the process for multiple files).
  • The Purge button becomes active as soon as you provide valid values, and pressing it commits the purge.

To prevent this rigmarole, set shorter CacheControl properties on files you know you’re going to update frequently.

Also, you should set cache expiration times based on where your audience is and when they need to see new content, not just on how often the files change.

For example, my employer publishes new newspapers every day. But I can’t put a picture of every day’s front pages on a 24-hour cache, because I need users worldwide (or, more accurately, from Western Europe to the Pacific Rim) to see the new front pages beginning at 4 a.m. Eastern time.

So I set the CacheControl for front-page images to 4 hours. That way, I know any CDN node worldwide will probably hold the correct version of the front page as soon as it’s available, regardless of when it is accessed; or, in the worst case, a visitor will see the correct page by 8 a.m. Eastern time, which is acceptable.
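The worst-case arithmetic is simple enough to write down. This sketch assumes a simplified model in which a CDN node caches the old file just before the new one is published and does not check back until max-age runs out; CacheStaleness and WorstCaseFreshTime are illustrative names of my own.

```csharp
using System;

static class CacheStaleness
{
    // Hypothetical helper: the latest time of day at which a node could
    // still be serving the old file, given when the new file is published
    // and how long the old copy may be cached.
    public static TimeSpan WorstCaseFreshTime(TimeSpan publishTimeOfDay, TimeSpan maxAge)
    {
        return publishTimeOfDay + maxAge;
    }
}
```

With a 4 a.m. publish time and a 4-hour max-age, the result is 08:00, matching the 8 a.m. figure above. With a 24-hour max-age, it would be 4 a.m. the following day, which is why a 24-hour cache doesn’t work for daily front pages.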

That’s it for this series. Got feedback? Leave a comment!

Featured photo by jarmoluk via Pixabay, in the public domain.
