
Notes on Migrating an ASP.NET MVC + SQL Express Web Site to Windows Azure

I recently completed a migration of a typical MVC/SQLExpress web site to Windows Azure. It was mostly painless, but there were a few hiccups along the way:

When migrating the database from SQL Express to SQL Azure, some data types had to be changed. All VARCHAR fields must be switched to NVARCHAR, and all TEXT fields must be changed to NVARCHAR(MAX). SQL Azure is Unicode through and through, and I believe TEXT fields are deprecated, so this is understandable.

It also appears that spatial/geometry fields can’t be migrated at all through SSIS. Fortunately the open source SQL Azure Migration Wizard does support migrating these fields, so ultimately I resorted to using that.

Finally, once the web site was converted to an Azure project and published, I noticed about half my images and my fonts weren’t loading. It turns out that IIS in Azure has different static file mappings than those set on Windows Server 2008, so it was refusing to serve some file types. Fortunately this can be overridden in web.config:

 <staticContent>
   <mimeMap fileExtension=".svg" mimeType="image/svg+xml" />
   <mimeMap fileExtension=".woff" mimeType="application/font-woff" />
 </staticContent>

Once that was updated and re-published to Azure, all content was served successfully.

Optimize AWS SimpleDB Deletes with BatchDeleteAttributes

I’ve found that the pricing model for SimpleDB can be somewhat complex. EC2 is easy: the longer you leave your machine up, the more it costs. For SimpleDB, however, there is no single machine for your databases. Each request to SimpleDB takes up a certain number of CPU cycles, and at the end of the month those cycles are added up, translated into a number of machine hours used, and then translated into a bill.

Amazon SimpleDB measures the machine utilization of each request and charges based on the amount of machine capacity used to complete the particular request (SELECT, GET, PUT, etc.), normalized to the hourly capacity of a circa 2007 1.7 GHz Xeon processor.

It’s easy to check up on the machine utilization for your SimpleDB account. Log in to AWS, go to Account, AccountActivity, and you can download an XML file or CSV file of the current month’s usage. This report will list all requests, along with the usage for each one:

	<StartTime>07/14/11 18:00:00</StartTime>
	<EndTime>07/14/11 19:00:00</EndTime>
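To make the billing model concrete, here is a rough sketch in Python of how those per-request usage values turn into a bill. The report’s usage values are machine hours; the hourly rate used below is illustrative only, so check Amazon’s current pricing:

```python
# Rough sketch: the SimpleDB bill is the month's accumulated machine hours
# (the per-request BoxUsage values) multiplied by an hourly rate.
# The 0.14 $/machine-hour rate is an assumption for illustration only.
def estimate_simpledb_bill(box_usage_hours, rate_per_hour=0.14):
    total_machine_hours = sum(box_usage_hours)
    return total_machine_hours * rate_per_hour

# e.g. 100,000 requests that each consumed ~0.0000219907 machine hours
usage = [0.0000219907] * 100_000
print(round(estimate_simpledb_bill(usage), 2))  # a fraction of a dollar
```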

On a recent project we saw a spike in SimpleDB costs after about a month of usage. The app was using SimpleDB to store some logging and transaction information, and after a month it was deemed safe to delete this. However, each of these records was deleted with a single request – which adds up if you’re deleting hundreds at a time. BatchDeleteAttributes lets you delete up to 25 items per request – not perfect, but at least it’s better than one at a time. The AWS C# library supports this request:

var client = AWSClientFactory.CreateAmazonSimpleDBClient(ID, KEY);
BatchDeleteAttributesRequest deleteRequest = new BatchDeleteAttributesRequest();
deleteRequest.DomainName = "MyDomain"; // the domain holding the records
deleteRequest.Item = new List<DeleteableItem>();
foreach (var r in recordIDs)
    deleteRequest.Item.Add(new DeleteableItem() { ItemName = r });
client.BatchDeleteAttributes(deleteRequest);
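Since each request tops out at 25 items, record IDs need to be split into batches first. Here is a quick language-agnostic sketch of that batching logic in Python (`chunk` is a hypothetical helper; replace the loop body with your SDK’s batch-delete call):

```python
# Split a list of record IDs into batches of at most 25,
# SimpleDB's per-request limit for BatchDeleteAttributes.
def chunk(items, size=25):
    for i in range(0, len(items), size):
        yield items[i:i + size]

record_ids = [f"log-{n}" for n in range(60)]
batches = list(chunk(record_ids))
print([len(b) for b in batches])  # → [25, 25, 10]
```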

SimpleDB also has a BatchPutAttributes request, which similarly lets you group inserts.

Free Programs for new Businesses to get into Cloud Computing

The fierce competition in the cloud marketplace today has resulted in some great deals for small businesses. Both Amazon and Microsoft currently have programs that offer a free tier of all their major cloud offerings for new accounts or new businesses. Amazon’s is very simple – a free tier of service is offered for the first 12 months once you sign up. There is no application process and it’s open to individuals. To sign up, just go to http://aws.amazon.com/free/. The restrictions are as follows:

AWS Free Usage Tier (Per Month):

  • 750 hours of Amazon EC2 Linux Micro Instance usage (613 MB of memory and 32-bit and 64-bit platform support) – enough hours to run continuously each month
  • 750 hours of an Elastic Load Balancer plus 15 GB data processing
  • 10 GB of Amazon Elastic Block Storage, plus 1 million I/Os, 1 GB of snapshot storage, 10,000 snapshot Get Requests and 1,000 snapshot Put Requests
  • 5 GB of Amazon S3 storage, 20,000 Get Requests, and 2,000 Put Requests
  • 30 GB of internet data transfer (15 GB of data transfer “in” and 15 GB of data transfer “out” across all services except Amazon CloudFront)
  • 25 Amazon SimpleDB Machine Hours and 1 GB of Storage
  • 100,000 Requests of Amazon Simple Queue Service
  • 100,000 Requests, 100,000 HTTP notifications and 1,000 email notifications for Amazon Simple Notification Service
This should be more than enough for a basic website with typical database needs.

Google’s AppEngine continues to have a free usage tier. As of this writing it is 500 MB of storage and up to 5 million page views a month, but Google is making changes with the introduction of “Apps for Business,” so it’s best to check directly for updates.

Microsoft has several programs for trying out its service, but if you’re a small starting business you MUST try to join BizSpark. In addition to the networking and visibility benefits, you get a full MSDN subscription and the following impressive package of Windows Azure services:
  • Windows Azure Small compute instance 750 hours / month
  • Windows Azure Storage 10 GB
  • Windows Azure Transactions 1,000,000 / month
  • AppFabric Service Bus Connections 5 / month
  • AppFabric Access Control Transactions 1,000,000 / month
  • SQL Azure Web Edition databases (1GB) 3
  • SQL Azure Data Transfers 7 GB in / month, 14 GB out / month

Reducing your Amazon S3 costs… with a catch

Amazon just recently announced a “Reduced Redundancy Storage” option for S3 objects. In short, you can slash the costs of S3 storage by 33% by accepting a slightly greater chance of losing your data. So ask yourself…

Do I feel lucky? Well, do ya, punk?

In truth, the chances of any data loss in Amazon S3 are minuscule, under both the traditional model and under RRS. If you use S3, I highly recommend starting with Werner Vogels’ article on RRS and durability.

The same goes for durability; core to the design of S3 is that we go to great lengths to never, ever lose a single bit. We use several techniques to ensure the durability of the data our customers trust us with, and some of those (e.g. replication across multiple devices and facilities) overlap with those we use for providing high-availability. One of the things that S3 is really good at is deciding what action to take when failure happens, how to re-replicate and re-distribute such that we can continue to provide the availability and durability the customers of the service have come to expect. These techniques allow us to design our service for 99.999999999% durability.

Under RRS, instead of 99.999999999% durability, your object is stored in such a way that it will survive the loss of data in a single facility, for 99.99% durability:

We can now offer these customers the option to use Amazon S3 Reduced Redundancy Storage (RRS), which provides 99.99% durability at significantly lower cost. This durability is still much better than that of a typical storage system as we still use some forms of replication and other techniques to maintain a level of redundancy. Amazon S3 is designed to sustain the concurrent loss of data in two facilities, while the RRS storage option is designed to sustain the loss of data in a single facility. Because RRS is redundant across facilities, it is highly available and backed by the Amazon S3 Service Level Agreement.

Yes, it’s still covered by the SLA! Finally, to summarize the real risk in terms your manager can understand, take this from the RRS announcement on the AWS blog:

The new REDUCED_REDUNDANCY storage class activates a new feature known as Reduced Redundancy Storage, or RRS. Objects stored using RRS have a durability of 99.99%, or four 9’s. If you store 10,000 objects with us, on average we may lose one of them every year. RRS is designed to sustain the loss of data in a single facility.
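That “one in 10,000 per year” figure falls straight out of the durability number, and a quick back-of-the-envelope check confirms it:

```python
# Expected number of objects lost per year, given an annual durability figure.
def expected_annual_loss(num_objects, durability):
    return num_objects * (1 - durability)

print(expected_annual_loss(10_000, 0.9999))         # RRS: roughly 1 object/year
print(expected_annual_loss(10_000, 0.99999999999))  # standard S3: ~0.0000001
```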

I suspect that for most business applications 99.99% durability is “good enough,” and a 33% cost savings is a great trade-off.

Finally, for my fellow .NET developers… Amazon did update their .NET SDK with this announcement. Be sure to download the latest version.

Getting Started with SQL Azure

After getting an account, you can log in at http://sql.azure.com.

They provide a very basic web interface that lets you set up firewall rules or create new databases, but that’s about it. To do anything interesting you have to connect via code or Management Studio.

The most recent version of SQL Server Management Studio (2008 R2) supports connecting to Azure; it’s possible to get earlier versions to connect as well. 2008 R2 also ships with an Import/Export wizard that is supposed to support migrating data to Azure, but I have had little success with it. The open source SQL Azure Migration Wizard has been far more reliable at moving data and informing you of any issues you’ll have migrating to the cloud.

When you connect via Management Studio the standard “object browser” does not work, but you can connect via a new query window. Specify the connection parameters, and under “Options” select the database you want to connect to.

The first time you attempt to connect, chances are you’ll get an access denied error. SQL Azure’s firewall defaults to blocking all incoming traffic, so before you connect you have to open access to your current IP address, or the range of IPs for your location. This is easy enough to do from your account at http://sql.azure.com. If you still cannot connect, check your local firewall and ensure that outgoing connections on TCP port 1433 (the port used by SQL Azure) are not blocked.

Once connected you have a standard query window in Management Studio, and you can perform virtually any T-SQL function. With a few restrictions, Azure is a standard SQL server database, and very simple to work with.
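For connecting from code rather than Management Studio, a typical ADO.NET connection string looks roughly like the following. The server name, database, and credentials are placeholders; note SQL Azure’s user@server login form and that connections should be encrypted:

```
Server=tcp:yourserver.database.windows.net,1433;Database=MyDatabase;User ID=youruser@yourserver;Password=yourpassword;Encrypt=True;
```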

AWS Announces Spot Instances: Market-Priced Cloud Computing

AWS recently announced a new service: Spot Instances.

Today we launched a new option for acquiring Amazon EC2 Compute resources: Spot Instances. Using this option, customers bid any price they like on unused Amazon EC2 capacity and run those instances for as long as their bid exceeds the current “Spot Price.” Spot Instances are ideal for tasks that can be flexible as to when they start and stop. This gives our customers an exciting new approach to IT cost management.

The central concept in this new option is that of the Spot Price, which we determine based on current supply and demand and will fluctuate periodically. If the maximum price a customer has bid exceeds the current Spot Price then their instances will be run, priced at the current Spot Price. If the Spot Price rises above the customer’s bid, their instances will be terminated and restarted (if the customer wants it restarted at all) when the Spot Price falls below the customer’s bid. This gives customers exact control over the maximum cost they are incurring for their workloads, and often will provide them with substantial savings. It is important to note that customers will pay only the existing Spot Price; the maximum price just specifies how much a customer is willing to pay for capacity as the Spot Price changes.
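The billing rule described above is simple enough to sketch as a toy simulation. This is my own illustration, not Amazon’s pricing code: an instance runs in any hour where the spot price is at or below your bid, and you pay the spot price for those hours, not your bid:

```python
# Toy model of spot billing: given a bid and a series of hourly spot prices,
# return how many hours the instance ran and what those hours cost.
# You pay the going spot price, not your bid, for each hour you run.
def spot_hours_and_cost(bid, hourly_spot_prices):
    billable = [p for p in hourly_spot_prices if p <= bid]
    return len(billable), sum(billable)

hours, cost = spot_hours_and_cost(0.05, [0.03, 0.04, 0.06, 0.04])
print(hours, round(cost, 2))  # → 3 0.11  (the $0.06 hour priced us out)
```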

Interestingly, this isn’t a technological innovation so much as a major business innovation. The instances on offer are the same instances offered in the tried and true AWS EC2 system. However, now Amazon can offer these instances at a (presumably) lower price, with the caveat that you may lose your instance if the market price for that compute power goes above what you are willing to pay for it.

What strikes me about this is the amazing efficiency of the system. Amazon could (in theory) rent out 100% of their available computing power through the EC2/spot instance system. If Amazon needs the compute power back, such as during the Christmas shopping season, they can raise the spot price and reclaim many of the resources. If a third party needs more compute power than is available, they increase their bid and drive up the price.

It should be interesting to see applications built around this model. Protein folding is the obvious example, but I can also see this being very useful for graphics rendering, or even mundane tasks such as sending out newsletters.