Last week, Andy Singleton, founder and president of Assembla, wrote an article titled “Dark Side of the Cloud: Problems with Storage” that ran on Venturefizz.com, which I read with great interest.
Singleton made some valid points regarding public cloud services, speed to market, and savings on capital expenditures; however, I believe he missed the mark when it comes to storage and networking.
First, a well-designed cloud environment should not suffer networking or storage issues, since in most true cloud computing platforms performance and capacity are provisioned on the same pay-as-you-go, pay-as-you-need basis: the more you need, the more you get.
In the article, Singleton said “Storage capacity almost doubles every year. Networking speed grows by a factor of ten about every 10 years – 100 times lower.” I believe he is being a bit dramatic, but he has a point. Across Data Storage Corporation’s client base, we see an average growth rate of about 25% year over year. The reason for the lower growth rate is that our clients use our compression and de-duplication technology; without proper de-duping, average growth rates could run 50-80%.
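To illustrate why de-duplication holds effective growth down, here is a minimal sketch of content-based block de-duplication: identical blocks are stored once, keyed by their hash. This is a toy illustration of the general technique, not Data Storage Corporation’s actual implementation; the `dedupe` function and the sample blocks are hypothetical.

```python
import hashlib

def dedupe(blocks):
    """Store each unique block once, keyed by its SHA-256 digest.

    Returns (store, index): `store` maps digest -> block data;
    `index` is the sequence of digests needed to rebuild the input.
    """
    store = {}
    index = []
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # keep only the first copy
        index.append(digest)
    return store, index

# Six logical blocks of data, but only three are unique.
blocks = [b"alpha", b"beta", b"alpha", b"gamma", b"beta", b"alpha"]
store, index = dedupe(blocks)
logical = sum(len(b) for b in blocks)
physical = sum(len(b) for b in store.values())
print(f"logical bytes: {logical}, physical bytes: {physical}")
```

In this toy run, 28 logical bytes shrink to 14 physical bytes; real-world ratios depend on how repetitive the data is, which is why growth rates vary so widely between environments.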
Where Singleton commented on mesh networks and referred to the Internet as a “hub and spoke network,” I believe he misspoke. The Internet is the ultimate fully meshed (full mesh) network, which is exactly what its DARPA and MIT designers intended back when it was still called ARPANET. It was created to let computers share information through packet-switching technology over a fully meshed topology, so that data packets could reach an endpoint by traveling over any of many paths, even if some paths were destroyed, by a nuclear attack, for example.
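The difference between the two topologies is easy to demonstrate. The sketch below (the `find_path` helper and the node names are hypothetical) routes a packet with a breadth-first search: in a full mesh, a path survives the loss of intermediate nodes, while in a hub-and-spoke layout, losing the hub strands every spoke.

```python
from collections import deque

def find_path(adj, src, dst, down=frozenset()):
    """Breadth-first search for any route from src to dst,
    skipping nodes in `down` (failed routers)."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nxt in adj.get(node, []):
            if nxt not in seen and nxt not in down:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # destination unreachable

# A small full mesh: every node links to every other node.
nodes = ["A", "B", "C", "D"]
mesh = {n: [m for m in nodes if m != n] for n in nodes}

print(find_path(mesh, "A", "D"))                   # direct route exists
print(find_path(mesh, "A", "D", down={"B", "C"}))  # still reachable

# Hub-and-spoke: every spoke depends on the single hub.
spoke = {"hub": ["A", "B", "C", "D"],
         "A": ["hub"], "B": ["hub"], "C": ["hub"], "D": ["hub"]}
print(find_path(spoke, "A", "D", down={"hub"}))    # no route at all
```

The mesh keeps delivering packets with two of four nodes down; the hub-and-spoke network fails completely when its one central node does.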
Singleton does make a good point that improperly designed storage networks in big public clouds can deliver variable performance during high-traffic periods. What he doesn’t mention is that the same issues can be found in private corporate data networks as well; they are simply the result of poorly engineered, non-scalable storage solutions.
Where Singleton discusses replication, I believe that, again, everything boils down to design. Organizations need to make sure they have active-active storage clusters. They should also ensure that their high-capacity applications run on Fibre Channel SANs, while their distributed applications use intelligent NAS-based network designs and de-duplication platforms, like those from NetApp, BlueArc, Data Domain, etc.
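The essence of an active-active design can be sketched in a few lines: every write is applied synchronously to each live node, so when one site fails, the other keeps serving reads with no data loss. This is a hypothetical toy model (the `Node` and `ActiveActiveCluster` classes are invented for illustration), not how any particular vendor’s clustering works.

```python
class Node:
    """A toy storage node: an in-memory key/value store."""
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.up = True

    def write(self, key, value):
        if not self.up:
            raise IOError(f"{self.name} is down")
        self.data[key] = value

class ActiveActiveCluster:
    """Synchronous replication across live nodes: every healthy node
    applies each write, and a read is served by whichever node is up."""
    def __init__(self, *nodes):
        self.nodes = list(nodes)

    def write(self, key, value):
        for node in self.nodes:
            if node.up:
                node.write(key, value)

    def read(self, key):
        for node in self.nodes:
            if node.up:
                return node.data.get(key)
        raise IOError("no replica available")

a, b = Node("site-a"), Node("site-b")
cluster = ActiveActiveCluster(a, b)
cluster.write("invoice-42", "paid")
b.up = False                       # simulate losing one site
print(cluster.read("invoice-42"))  # still served by site-a
```

Because both nodes held the data before the failure, the surviving node answers immediately; there is no failover window to ride out, which is precisely the variable-performance problem a well-designed cluster avoids.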
As stated, I think Singleton makes some good points, but the challenges he mentions for hosting high-throughput data in the cloud are easily addressed by properly designing a storage network that encompasses a distributed, dedicated, and local storage architecture.