Peak Signal
Cloud solutions are not new to the media industry, although some workflows lent themselves to early adoption while others are only now becoming commercially or technically viable. Wherever you are in your cloud adoption journey, you have undoubtedly taken part in a cloud cost modelling exercise and noticed the moment of collective (sometimes unspoken) nervousness: "I think we may have forgotten something - but I'm not sure what…" and "How on earth are we going to keep track of this?"
While I am a keen proponent of using cloud solutions to solve media workflow challenges, I am equally focused on managing the costs of cloud-based products. Whether you are a content owner, managed service provider or technology vendor, managing cloud costs is now a way of life, and removing the fear factor of scary and unexpected bills is a necessity.
The approach I take to alleviate the nervousness around cloud costs is to use my (evolving) checklist. It has been built from aide-memoires, scribbles, sent emails and recurring conversations - it is a kind of "pre-flight checklist" for cloud cost models. It aims to help you describe and prevent runaway cloud costs.
You will not discover amazing revelations in this checklist, and certainly no architectural recommendations (though I have strong opinions on these!), and it is definitely not exhaustive; however, I hope it can serve as a useful reference or a place to start. Please do have a scan and let me know if you have anything to add - I'd love to expand the checklist! Happy Halloween!
SEGMENT AND ATTRIBUTE COSTS
Ensure that the billing boundaries and cost centres that your organization requires are represented in the design process at the outset of your project.
Billing boundaries and segments; it is possible to simplify the ongoing attribution of costs to products, customers or business units with a considered structure of cloud platform organizations, organizational units and accounts.
Tagging resources; the tagging and labeling of resources enables all sorts of technical and functional segmentation, and it is also fundamental to reporting on costs. Correctly tagged resources help the product owner quickly and efficiently attribute costs to specific projects, users and customers; this is especially important in multi-tenant scenarios with variable workloads.
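As a rough illustration of why tagging matters for attribution, the sketch below sums spend per tag value and reports untagged spend separately. The line items, field names and tag keys (`customer`, `project`) are hypothetical and do not reflect any provider's billing export format.

```python
from collections import defaultdict

# Hypothetical billing line items; field names and tag keys are assumptions
# for illustration, not a real provider's export format.
LINE_ITEMS = [
    {"resource": "transcode-01", "cost": 120.0, "tags": {"customer": "acme", "project": "vod"}},
    {"resource": "storage-a", "cost": 45.5, "tags": {"customer": "acme"}},
    {"resource": "transcode-02", "cost": 80.0, "tags": {"customer": "globex", "project": "live"}},
    {"resource": "orphan-disk", "cost": 12.25, "tags": {}},
]


def attribute_costs(line_items, tag_key):
    """Sum cost per value of tag_key; spend without that tag goes to UNTAGGED."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item["tags"].get(tag_key, "UNTAGGED")
        totals[owner] += item["cost"]
    return dict(totals)


by_customer = attribute_costs(LINE_ITEMS, "customer")
by_project = attribute_costs(LINE_ITEMS, "project")
print(by_customer)
print(by_project)
```

A growing UNTAGGED total is usually the first sign that the tagging policy is not being followed.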
REGIONAL PARITY
Features and their cost; the availability of cloud products and their features (resources) differs between regions, as does their cost. Ensure that your cost model can differentiate these costs and, if it cannot, err on the side of caution.
Interfacing between regions and zones; design choices that define the resilience and security of cloud and hybrid architectures have cost implications, so ensure that your cost model covers the cost of transferring data (and the resources this transfer requires) between the regions and zones in your solution. While we are here, be mindful of the regulatory and legal implications of cross-border data transfer.
MONITOR AND ALERT
Build cost monitoring into operations; things change over time (a few more development instances here, a bit more storage utilization there) and your product can quickly find itself on an unbudgeted foundation. Monitoring cloud costs is as much an operations responsibility as it is a commercial one, so ensure that all stakeholders understand how the cost model was built; are able to access and review cloud costs; review costs together regularly, to spot trend changes and to revalidate the model; and are aligned on what is and is not an acceptable cost.
Exception alerts; all major cloud providers offer the ability to review costs over time and to set billing alerts on exceptions. Ensure that exception alerts are configured based on predicted utilization, that all stakeholders are subscribed to cost notification topics, and that there is a clear owner who will take action on any exception.
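To make "alerts based on predicted utilization" concrete, here is a minimal sketch that projects month-end spend from spend to date and compares it with the modelled figure. The function name, the 10% tolerance and the linear extrapolation are all assumptions for illustration; in practice the provider's own budget and alerting features would do the notifying.

```python
def check_spend(predicted, actual_to_date, day_of_month, days_in_month, tolerance=0.10):
    """Flag spend that is running ahead of the modelled monthly cost.

    Assumes spend accrues linearly through the month, which will not hold
    for bursty workloads.
    """
    if not 1 <= day_of_month <= days_in_month:
        raise ValueError("day_of_month must fall within the month")
    projected = actual_to_date / day_of_month * days_in_month
    threshold = predicted * (1 + tolerance)
    return {
        "projected": round(projected, 2),
        "threshold": round(threshold, 2),
        "alert": projected > threshold,
    }


# Hypothetical figures: 10,000 modelled for the month, 4,200 spent by day 10 of 30.
result = check_spend(predicted=10_000, actual_to_date=4_200, day_of_month=10, days_in_month=30)
print(result)
```

The threshold comes from the cost model rather than from last month's bill, so an alert also prompts the question of whether the model itself needs revalidating.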
THE ART OF AUTOMATION AND SCALE
The cost and benefits of automation; whilst not specifically a cloud challenge, cloud platforms mean that additional quality, velocity and cost efficiencies may be realised through the automation of resource configuration and usage. You should absolutely ask "can we automate this?", but you should also ask "should we automate this?" Has the impact of the automation been quantified, and is it represented in the model? Is automation more or less likely to observe qualitative errors and impact SLAs? What is the effect of this automation on the operating model? Something you need to do once in a blue moon is likely not worth the expense of automating. It is said that there is a relevant XKCD for everything, so I'll leave this here.
Define scaling constraints; whilst the technology team are able to tweak the characteristics of scaling behaviours, the implications must be incorporated into the model. What is the default scale? Has this been incorporated in the model? What scaling assumptions are built into the model? How are we measuring scaling (and its cost) in operation? Have we configured a maximum scale in the technology solution? Is the cost of increased scale proportional to the value it provides to the business?
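Two of the questions above reduce to simple arithmetic, sketched here under stated assumptions: how many runs it takes for an automation to pay back its build effort, and what the monthly cost looks like at default versus maximum scale. The function names, the 730-hour month and all figures are hypothetical.

```python
import math


def automation_break_even_runs(build_hours, manual_hours_per_run, automated_hours_per_run=0.0):
    """Runs needed before automating pays back the build effort.

    Returns None when automation saves no time per run.
    """
    saved_per_run = manual_hours_per_run - automated_hours_per_run
    if saved_per_run <= 0:
        return None
    return math.ceil(build_hours / saved_per_run)


def monthly_cost_envelope(unit_cost_per_hour, default_units, max_units, hours=730):
    """Monthly cost at default scale and at the configured maximum scale."""
    if max_units < default_units:
        raise ValueError("max_units must be at least default_units")
    default_cost = unit_cost_per_hour * default_units * hours
    max_cost = unit_cost_per_hour * max_units * hours
    return default_cost, max_cost


# A 40-hour build replacing a 30-minute manual task.
runs = automation_break_even_runs(build_hours=40, manual_hours_per_run=0.5)
# Instances at 0.50 per hour, scaling between 4 and 20 units.
default_cost, max_cost = monthly_cost_envelope(0.5, default_units=4, max_units=20)
print(runs, default_cost, max_cost)
```

If the task runs twice a year, 80 runs is a long payback period. Likewise, if the maximum-scale figure is not one the business could accept, the scaling limit or the model needs revisiting.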
WE ALL GO A LITTLE MAD SOMETIMES
Okay, this one isn't really a single checklist item - it is a few notes on some of the cloud cost components that I've seen missed or misrepresented in cost models and operations.
Development and test; your teams must be able to innovate and they will need resources to do that. Does your cost model include suitable dev and test resources?
Support; are you making assumptions about support costs being covered by the cloud platforms you are using? Which accounts and resources are covered?
Price changes of cloud and third-party resources; many cloud products have become more cost-effective over time, some have not, and the cost models of proprietary components (such as those in the AWS Marketplace), whilst controlled, can change over time. Is your cost model being reviewed regularly to ensure price changes are captured?
Pricing thresholds; does your cost model include assumptions about which pricing bracket your solution falls into for cloud products? Is your cost model being reviewed regularly to confirm whether the brackets (or your place within them) have changed?
The comet tail effect; beware of incremental, small-value cost items that accrue in your cloud solutions. Over time, these can become considerable. Are you reviewing them regularly to check whether they can be consolidated?
Monitoring; do you have monitoring requirements that extend beyond your cloud environment (such as to on-premises teams, or to partner environments) and, if so, has the cost of monitoring (data egress, health check monitoring points) been captured in your cost model?
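The pricing-threshold point is easy to get wrong in a spreadsheet, so here is a minimal sketch of graduated (tiered) pricing, where each slice of usage is charged at the rate of the bracket it falls into. The tier boundaries and prices are invented for illustration; some products instead apply a single bracket's rate to all usage, so check which scheme applies.

```python
# Hypothetical tiers: (upper bound of the tier in GB, price per GB).
# None marks the final, unbounded tier.
TIERS = [(10_000, 0.09), (50_000, 0.085), (None, 0.07)]


def tiered_cost(usage_gb, tiers):
    """Charge each slice of usage at the rate of the tier it falls into."""
    cost = 0.0
    lower = 0
    for upper, price in tiers:
        if upper is None or usage_gb <= upper:
            # Remaining usage fits in this tier.
            cost += (usage_gb - lower) * price
            break
        # Fill this tier completely and move on to the next.
        cost += (upper - lower) * price
        lower = upper
    return round(cost, 2)


print(tiered_cost(8_000, TIERS))
print(tiered_cost(12_000, TIERS))
```

Running the model at a few usage levels either side of each boundary shows how sensitive the total is to the bracket assumption.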
EMBRACING RISK
Recovering from a platform failure may be very, very costly. Your team will have to spend time and effort applying a remedy, you may be required to pay your customer service credits, or you may suffer costly reputational damage. That being said, the tenet that not every increase in availability is worth the investment is a sound one. The needs of the business must lead: most bells and whistles have a price tag, so take advantage of the right options at a sustainable price point for your business. Google's book on Site Reliability Engineering describes their approach to embracing risk and to calculating whether investments in decreasing risk are cost-effective - it is worth reviewing.
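A back-of-the-envelope version of that kind of calculation is sketched below. It assumes revenue scales linearly with availability, which is a simplification, and the function names and figures are hypothetical rather than taken from any particular service.

```python
def availability_gain_value(annual_revenue, current, target):
    """Revenue protected by raising availability from current to target.

    Assumes revenue is lost in direct proportion to unavailability.
    """
    return round(annual_revenue * (target - current), 2)


def worth_investing(annual_revenue, current, target, annual_cost):
    """True when the protected revenue exceeds the cost of the improvement."""
    return availability_gain_value(annual_revenue, current, target) > annual_cost


# Moving from 99.9% to 99.99% on a service earning 1,000,000 a year.
value = availability_gain_value(1_000_000, current=0.999, target=0.9999)
print(value)
print(worth_investing(1_000_000, 0.999, 0.9999, annual_cost=5_000))
```

Service credits and reputational damage are not captured here, so treat the result as a floor for the conversation rather than the answer.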