DevOps.com
August 30, 2022
TidalScale has updated its namesake software for automating server management so that it can detect and prevent impending failures and let IT teams hot-swap in another server.
Version 4.1 of TidalScale also adds server wear leveling and preventative health checks that identify servers that should be replaced based on their estimated remaining life span.
TidalScale CEO Gary Smerdon said that capability is especially critical given how often DRAM fails and how much server downtime those failures cause. The latest update to the company’s software-defined server software reduces the failure rate in on-premises and cloud computing environments by a factor of 100 to ensure 99.999% uptime, he said.
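To put the 99.999% figure in perspective, the arithmetic below converts an availability target into a yearly downtime budget. It is a minimal sketch for illustration only: the function name and the choice of a 365.25-day year are assumptions made here, not anything published by TidalScale.

```python
# Downtime budget implied by an availability target (illustrative arithmetic only).
MINUTES_PER_YEAR = 365.25 * 24 * 60  # average year length, an assumption of this sketch


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the maximum minutes of downtime per year at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    budget = downtime_minutes_per_year(target)
    print(f"{target}% uptime allows about {budget:.1f} minutes of downtime per year")
```

At 99.999%, the budget works out to a little over five minutes per year. Under the simplifying assumption that downtime scales linearly with the number of failures, a 100-fold reduction in failures corresponds to two additional nines, for example moving from 99.9% to 99.999%.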
The company also announced that its software is now available on the Amazon Web Services (AWS) Marketplace.
TidalScale employs machine learning algorithms to aggregate and virtualize CPU cores, memory and I/O across servers to present operating systems with a single software-defined server. That makes it possible to hot-swap individual servers to ensure availability.
The single biggest cause of downtime in IT environments is memory failures, which are now 100 times more likely than they were a decade ago, noted Smerdon. In most cases, the root cause is modern applications running on legacy infrastructure that was never designed to meet their latency requirements, he added.
That issue can be especially acute in cloud computing environments where infrastructure is often shared by multiple applications, said Smerdon.
It’s not clear at what rate cloud service providers are replacing legacy server infrastructure, but the ability to determine when to replace infrastructure gives IT teams an added measure of control over servers regardless of where they are deployed. The biggest issue, of course, is not so much replacing a server as it is determining which one actually failed. It often takes longer to identify a failed server than it does to bring a new one online.
TidalScale is made available by both cloud service providers and manufacturers of servers used in on-premises IT environments, but it needs to be configured and deployed by an IT team. The overall goal is to not only increase system availability but also reduce the total cost of IT by managing infrastructure more efficiently across the life span of a server, said Smerdon. Some IT teams routinely replace servers at specified intervals, regardless of actual usage, to avoid downtime. TidalScale enables IT teams to determine which servers to replace based on the actual wear within the context of a self-healing server farm, he noted.
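The difference between replacing servers on a fixed schedule and replacing them based on estimated wear can be sketched in a few lines. This is a hypothetical illustration, not TidalScale's algorithm: the `Server` fields, the thresholds, and the sample fleet are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    age_months: int              # time in service
    est_remaining_months: float  # hypothetical life-span estimate from health checks


def replace_by_interval(servers, max_age_months=36):
    """Fixed-interval policy: replace anything at or past a set age."""
    return [s.name for s in servers if s.age_months >= max_age_months]


def replace_by_wear(servers, min_remaining_months=6):
    """Wear-based policy: replace only servers estimated to be near end of life."""
    return [s.name for s in servers if s.est_remaining_months < min_remaining_months]


# Made-up fleet: node-a is old but healthy, node-b is young but worn.
fleet = [
    Server("node-a", age_months=40, est_remaining_months=30.0),
    Server("node-b", age_months=12, est_remaining_months=3.0),
    Server("node-c", age_months=38, est_remaining_months=4.5),
]

print(replace_by_interval(fleet))  # ['node-a', 'node-c']
print(replace_by_wear(fleet))      # ['node-b', 'node-c']
```

In this invented fleet, the fixed-interval policy retires a healthy server (node-a) and misses a worn one (node-b), which is the kind of mismatch a wear-based approach is meant to avoid.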
Given all the things that might potentially go wrong in an application environment, not having to worry about server failures should enable DevOps teams to spend more time optimizing application performance.
In theory, of course, modern applications based on microservices should be able to reroute calls to application programming interfaces (APIs) anytime a server becomes unavailable. However, over time it’s not uncommon for single points of failure to emerge as applications are updated. In the meantime, the majority of applications running today are based on legacy monolithic architectures that are much more susceptible to server failures that, unfortunately, are still far too common.
© 2022 · Techstrong Group, Inc. All rights reserved.