The Top Causes of Downtime
By Zachary Flower

According to a roundup by Gartner, the average cost of downtime for an enterprise is $5,600 per minute. While the data collected was from incredibly large companies, the cost of downtime for even small startups is no laughing matter.

Let's assume, for the sake of simplicity, that your core product is a web app that relies solely on organic sales, totaling $1 million in revenue a year. This amounts to about $2 in lost revenue per minute. This doesn't sound like too much in the grand scheme of things, but revenue is only a small part of your downtime costs. We also must consider wasted operating costs.

Employees' time and productivity are also wasted during downtime. If, for example, you pay $500,000 a year in employee costs, that's an additional $1 in wasted cost per minute. If you're keeping track, we're now at $3 in cost per minute.

That's $180 an hour. $4,320 a day.
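The arithmetic above can be reproduced with a short sketch. It assumes, as the example does, that revenue and employee costs are spread evenly over every minute of the year; the function and variable names are illustrative only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def cost_per_minute(annual_amount: float) -> float:
    """Spread an annual dollar amount evenly over every minute of the year."""
    return annual_amount / MINUTES_PER_YEAR


revenue_per_minute = cost_per_minute(1_000_000)  # roughly $1.90
staff_per_minute = cost_per_minute(500_000)      # roughly $0.95

# The article rounds each figure to the nearest dollar before adding them.
rounded_total = round(revenue_per_minute) + round(staff_per_minute)

print(f"per minute: ${rounded_total}")
print(f"per hour:   ${rounded_total * 60}")
print(f"per day:    ${rounded_total * 60 * 24:,}")
```

Swapping in your own annual revenue and employee costs gives a rough floor for what an outage costs, before idle infrastructure and lost customer trust are counted.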

[Figure: downtime calculator example. Source: downtimecost.com]

Adds up quickly, doesn't it? We've now accounted for employee costs and lost revenue, but what about other wasted expenses? Every unused piece of your architecture results in additional losses during downtime. Servers and third-party services that you are still paying for sit idle while your team works on a fix, and the fix itself could require additional (and costly) resources.

Depending on how critical your product is to your customers' businesses, downtime could not only cost you money, but also your customers' trust. It's difficult to justify the cost of paying an unreliable vendor, so while one outage is easily survivable, the loss of faith in your product is compounded with every subsequent occurrence.

Causes + Solutions
Ultimately, by understanding the causes of outages, you can maximize your chances of preventing them. The causes can be boiled down to a few categories: human error, third-party service outages, and highly unpredictable "black swan" occurrences.

Human Error

One of the most common causes of downtime that I've personally seen is human error. Whether a developer commits broken code or an administrator updates an untested package, product uptime suffers when procedure isn't followed or an obscure system bug isn't accounted for. Establishing a system of checks and balances within an organization is the best solution to this problem. Code reviews, unit tests, quality assurance, proper planning, and clear communication all go a long way toward preventing avoidable downtime.

Service Outages

Sometimes, however, downtime isn't caused internally. From time to time, even cloud providers like Amazon Web Services (AWS) go down. There is very little an organization can do when this happens, at least not without a proper plan in place. To combat this, I'm a fan of Netflix's Chaos Monkey system. For the uninitiated, Chaos Monkey is a system whose sole job is to kill off random services within a product's architecture. This forces the system to be self-repairing, and trains the team to handle outages effectively when they really matter. PagerDuty conducts its own Failure Fridays as well!
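To make the idea concrete, here is a toy simulation of the kill-then-repair loop. It is not Netflix's implementation and touches no real infrastructure; the Service class, the service names, and the replica counts are all hypothetical.

```python
import random


class Service:
    """A hypothetical service that should always run `desired` replicas."""

    def __init__(self, name: str, desired: int) -> None:
        self.name = name
        self.desired = desired
        self.running = desired


def kill_random_replica(services, rng):
    """Chaos step: pick one running service at random and stop one replica."""
    victim = rng.choice([s for s in services if s.running > 0])
    victim.running -= 1
    return victim


def repair(services):
    """Self-repair step: bring every service back to its desired replica count."""
    restarted = 0
    for service in services:
        restarted += service.desired - service.running
        service.running = service.desired
    return restarted


rng = random.Random(42)  # seeded so a run can be repeated
fleet = [Service("web", 3), Service("api", 2), Service("worker", 2)]

for round_number in range(1, 4):
    victim = kill_random_replica(fleet, rng)
    print(f"round {round_number}: killed one replica of {victim.name}")
    print(f"round {round_number}: restarted {repair(fleet)} replica(s)")

healthy = all(s.running == s.desired for s in fleet)
print("fleet healthy:", healthy)
```

In a real system the repair step would be done by whatever scheduler or auto-scaling mechanism you already run. The exercise is valuable because it shows whether that mechanism, and the team behind it, actually copes.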

Alerting

While occasional downtime is unavoidable (even Facebook goes down from time to time), how you prepare for and handle it will determine how much of an impact it has on your organization. Because every minute of downtime means additional costs, establishing workflows to prevent outages or reduce their length is crucial. Solutions like PagerDuty accelerate real-time incident resolution by notifying the right people, getting everyone on the same page as soon as possible, and providing a platform for surfacing the context needed to fix the issue. Aggregating all your event data and optimizing communication makes it far easier to identify the root cause of an outage and to resolve issues efficiently and accurately.
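As a rough illustration of what aggregating event data can mean, the sketch below groups raw monitoring events into incidents so that responders see one item per problem instead of a stream of duplicates. It is not PagerDuty's algorithm or API; the Event fields, the five-minute window, and the sample data are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    service: str
    check: str
    timestamp: int  # seconds, relative to an arbitrary start
    message: str


def group_into_incidents(events, window_seconds=300):
    """Group events for the same service and check into one incident when
    each arrives within window_seconds of the previous event in that incident."""
    incidents = []
    latest = {}  # (service, check) -> index of the most recent incident
    for event in sorted(events, key=lambda e: e.timestamp):
        key = (event.service, event.check)
        index = latest.get(key)
        if index is not None and event.timestamp - incidents[index][-1].timestamp <= window_seconds:
            incidents[index].append(event)
        else:
            incidents.append([event])
            latest[key] = len(incidents) - 1
    return incidents


events = [
    Event("checkout", "http_5xx", 0, "error rate above threshold"),
    Event("checkout", "http_5xx", 60, "error rate above threshold"),
    Event("db", "replication_lag", 90, "replica is behind"),
    Event("checkout", "http_5xx", 1000, "error rate above threshold"),
]

incidents = group_into_incidents(events)
for incident in incidents:
    first = incident[0]
    print(f"{first.service}/{first.check}: {len(incident)} event(s), first at t={first.timestamp}s")
```

Grouping by service and check is only one possible key. The point is that responders get a single incident with its history attached, which is the context they need to find the root cause.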

Communication

Remember that improving external communication is just as important as improving internal communication. Communicating information about an outage to your customers early and clearly goes a long way toward maintaining trust and credibility with them. Through the use of tools like StatusPage and StatusCast, as well as PagerDuty's Stakeholder Engagement, organizations can better orchestrate the real-time business and external-facing response, and use status pages to provide valuable transparency into the health of a product. Personally, I find nothing less trustworthy than an organization that remains quiet through a crisis. Its silence feels like an attempt to hide something.

On-Call Rotations

All of these solutions are great, but an indispensable part of managing unexpected downtime is making sure there are always people on hand to fix the issue. This can be easily accomplished by establishing an on-call rotation among your engineers. An effective on-call rotation is a minimal investment that can increase product reliability, maintain accountability, improve service delivery, and give your team a better work-life balance. Without an on-call rotation, every outage turns into an "all hands" event, which is disruptive to the personal lives of every employee. On the flip side, a clearly defined on-call schedule and escalation policy mean that workloads are balanced, and that there is always a dedicated subject matter expert who is ready to fix an issue or drive collaboration toward a resolution as needed.
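A rotation and escalation policy can be as simple as the sketch below, which assumes fixed-length shifts handed off round-robin and uses the next person in line as the backup. The team names, start date, and one-week shift length are hypothetical, and real schedules also have to handle swaps, holidays, and time zones.

```python
from datetime import date


def shift_index(day: date, start: date, shift_days: int = 7) -> int:
    """Number of whole shifts that have elapsed between start and day."""
    if day < start:
        raise ValueError("day is before the rotation starts")
    return (day - start).days // shift_days


def on_call_for(day, engineers, start, shift_days=7):
    """Primary on-call engineer for the given day in a round-robin rotation."""
    return engineers[shift_index(day, start, shift_days) % len(engineers)]


def escalation_chain(day, engineers, start, levels=2, shift_days=7):
    """Primary first, then the next engineers in rotation order as backups."""
    first = shift_index(day, start, shift_days)
    return [engineers[(first + i) % len(engineers)] for i in range(levels)]


team = ["Ana", "Bo", "Chen"]       # hypothetical engineers
rotation_start = date(2017, 1, 2)  # first day of the first shift

day = date(2017, 1, 10)            # falls in the second one-week shift
primary = on_call_for(day, team, rotation_start)
chain = escalation_chain(day, team, rotation_start)
print(f"primary on {day}: {primary}; escalation order: {chain}")
```

Because the schedule is a pure function of the date, everyone can see in advance when they are primary and when they are backup, which is what keeps an outage from becoming an "all hands" event.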

In the end, the best way to plan for (and mitigate) downtime is to invest in your resources and your team. Not every solution mentioned here is right for every organization, but the cost of doing nothing is significantly higher than the cost of doing something. When you have an established process for handling outages, it won't matter whether one was caused by a hacker or a power outage. You and your team will be prepared to handle it.

The post The Top Causes of Downtime appeared first on PagerDuty.
