What does it take to develop 'cloud-first'?





In this post I outline some of what I learned building a managed cloud service ‘cloud-first’, and contrast that with the traditional on-premise development model. The views in this article are my own; feedback and comments are welcome.



On-Premise/Traditional Product Development

Traditional on-prem software development often revolves around the core premise of Develop-Test-Deliver. Serviceability (read as traceability and logging), although critical in any enterprise-class product, is often refined reactively, based on the nature of the issues reported from the field. Because new versions of the product can be released ‘later’, it is considered acceptable to improve serviceability in future increments. Automated validation and continuous integration, although critical, are often looked upon as ‘process optimization’: good to have for improving product quality and development efficiency, but not essential. Since on-premise release cycles typically span several weeks, it is common to see these compromised in favor of packing in more product features.




Now let's contrast this with the 'Cloud-First' approach:

One of the most prominent traits of a public managed cloud service is the need for extreme agility in updating, patching and delivering fixes to the running service. Unlike traditional on-premise enterprise products, once the first increment of the product is available on the cloud, the development and delivery processes need to be able to cater to daily fixes (if need be; at least that should be the goal).



This calls for starting with a solid plan around a ‘Continuous Integration & Deployment Pipeline’. It is critical to invest in an automated delivery/deployment pipeline that ties together the entire process: from the point a developer checks in code, through automated unit and functional test runs, then deployment and an integration test run, and finally tagging the build for ‘readiness’ to be deployed to Ops -> Staging -> Production environments. A seamless, robust automated test and CI/CD pipeline is a critical prerequisite before the first delivery/go-live of the service on the cloud.
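To make the flow above a little more concrete, here is a minimal sketch in Python of a pipeline that runs its stages in order and only tags a build as ready when every stage passes. The stage names, the `Stage` class and the `run_pipeline` helper are all my own illustrative assumptions; a real pipeline would be defined in a CI server such as Jenkins rather than hand-rolled like this.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    """One step of a hypothetical delivery pipeline."""
    name: str
    run: Callable[[], bool]  # returns True when the stage passes


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order, stopping at the first failure.

    Returns the names of the stages that passed.
    """
    passed = []
    for stage in stages:
        if not stage.run():
            break  # later stages never run on a failed build
        passed.append(stage.name)
    return passed


def is_release_ready(stages: List[Stage], passed: List[str]) -> bool:
    """A build is tagged 'ready' only if every stage passed."""
    return len(passed) == len(stages)


# Illustrative stages; each lambda stands in for a real test or deploy job.
stages = [
    Stage("unit-tests", lambda: True),
    Stage("functional-tests", lambda: True),
    Stage("deploy-to-test", lambda: True),
    Stage("integration-tests", lambda: True),
]

passed = run_pipeline(stages)
ready = is_release_ready(stages, passed)
```

The point of the sketch is the gating: a failure in any earlier stage means the build never reaches deployment, let alone the ‘readiness’ tag.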

Some tools for automated validation and the CI/CD pipeline include: Behave (for BDD), Jenkins, Docker containers, UrbanCode Deploy, Ansible and Chef.


Unlike on-premise products, a cloud service is expected to run with near-zero downtime. At the very least, the team must be able to monitor the service and be alerted to any downtime or failure. There is also a need for comprehensive logging coverage (for audit, security and troubleshooting purposes). Serviceability, meaning alerting, logging and monitoring, is therefore another must-have foundational component that needs to be established very early in the cloud service development process.
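As a rough illustration of the alerting side, the sketch below decides when an alert should fire from a series of health-probe results. The rule used here (alert once when a configurable number of consecutive probes fail) and the `alert_points` function are assumptions of mine for the example, not a description of how any particular monitoring tool works.

```python
from typing import Iterable, List


def alert_points(probe_results: Iterable[bool], threshold: int = 3) -> List[int]:
    """Return the probe indexes at which an alert would fire.

    An alert fires once per outage, at the moment the number of
    consecutive failed probes reaches `threshold`.
    """
    alerts = []
    consecutive_failures = 0
    for index, healthy in enumerate(probe_results):
        if healthy:
            consecutive_failures = 0  # service recovered, reset the count
        else:
            consecutive_failures += 1
            if consecutive_failures == threshold:
                alerts.append(index)
    return alerts


# True = probe succeeded, False = probe failed (made-up sample data).
sample = [True, False, False, False, False, True, False]
fired_at = alert_points(sample)
```

Requiring several consecutive failures is one simple way to avoid paging someone for a single dropped probe, at the cost of a slightly slower alert.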

Some tools and technologies worth exploring for this include: the ELK stack (Elasticsearch, Logstash, Kibana), QRadar, collectd and Graphite.



Finally, while Develop-Test-Deliver is underway, it is important to clearly define the metering model: how service usage will be measured and billed. For a cluster, for example, the unit of measurement might be ‘node-hours’.
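For the node-hours example, the arithmetic is simply the number of nodes multiplied by the hours the cluster was running. Here is a minimal sketch; the `node_hours` function and the sample timestamps are hypothetical, and a real metering model would also have to decide on details such as rounding and how to treat a cluster that is resized mid-period.

```python
from datetime import datetime


def node_hours(node_count: int, start: datetime, end: datetime) -> float:
    """Usage in node-hours for a fixed-size cluster running from start to end."""
    if node_count < 0:
        raise ValueError("node_count must not be negative")
    if end < start:
        raise ValueError("end must not be earlier than start")
    hours = (end - start).total_seconds() / 3600
    return node_count * hours


# A 4-node cluster that ran for six and a half hours (made-up sample data).
usage = node_hours(4, datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 1, 6, 30))
```

Here `usage` comes to 26.0 node-hours (4 nodes x 6.5 hours).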
