    Deploying the Best of Both Worlds: Data Orchestration for Hybrid Cloud

    By
    Eric Kavanagh
    -
    June 26, 2020

      eWEEK content and product recommendations are editorially independent. We may make money when you click on links to our partners.

      Did someone tell your data to shelter in place? That wouldn’t make any sense, would it? Ironically, for vast troves of valuable enterprise data, that might as well be the case, because massive, compute-bound data silos are practically everywhere in the corporate world.

      Hadoop played a role in creating this scenario, because many large organizations sought to leverage the twin promises of low-cost storage and massively parallel computing for analytics. But a funny thing happened to the yellow elephant: It was largely obviated by cheap cloud storage.

      Seemingly overnight, the price of cloud storage dropped so precipitously that the cost-benefit analysis of using the Hadoop Distributed File System (HDFS) on-premises for new projects turned upside down. Even the term “Hadoop” disappeared from the names of major conferences.

      That’s not to say there isn’t valuable data in all those HDFS repositories, however. Many important initiatives used this technology in hopes of generating useful insights. But with budgets moving away from Hadoop, another strategy is required to be successful. 

      What about computing? Suffice to say, cloud providers now offer a robust service in this territory. And relatively recent innovations such as separating computing from storage have also played a part in paving the way for cloud-based computing to take on all manner of workloads.

      So the cloud now easily eclipses most on-premises environments in all the major categories: speed, cost, ease of use, maintenance, scalability. But there are barriers to entry, or at least pathways that should be navigated carefully while making the move. Mistakes can be very costly!

      But how do you get the data there? Amazon actually offers a “snow truck” (AWS Snowmobile, a shipping container hauled by a semi) that will come to your data center, load up your data, and haul it, old-school, to an Amazon facility. That approach can certainly work for a quick-and-relatively-dirty solution, but it ignores the magic of cloud.

      Seeing Triple

      As the concept of “cloud-native” gets hashed out on whiteboards in boardrooms around the business world, the reality taking shape is that a whole new generation of solutions is being born. These systems are augmented with high-powered analytics and artificial intelligence. 

      This new class of application is almost exclusively built on a microservices architecture with Kubernetes as the foundation. There is tremendous value to this approach, because scalability is built into its DNA. Taking advantage of this new approach requires a commitment to change.

      Simply shipping your data and applications in toto to a cloud provider absolutely does not solve this challenge. In fact, it will likely result in a significant rise in total cost of ownership (TCO), thus undermining a major driver for moving to the cloud.

      Another strategy involves porting specific data sets into the cloud to deploy the power of all that computation. This often involves making copies of the data. While change data capture (CDC) can be used to keep these disparate environments in sync, there are downsides to this approach.

      In the first place, CDC solutions always need to be meticulously managed. Small amounts of data drift can quickly become larger problems. This is especially problematic when the derived analytics are used for mission-critical business decisions, or customer-experience initiatives.
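
One common way to catch drift before it grows is a periodic reconciliation pass that compares the source against its copy. The sketch below is a minimal illustration, not part of any CDC product: the function names (`row_digest`, `find_drift`) and the sample tables are hypothetical, and real tables would be compared in batches rather than as in-memory dictionaries.

```python
import hashlib


def row_digest(row: dict) -> str:
    """Stable digest of a record: hash its key/value pairs in sorted key order."""
    canonical = "|".join(f"{k}={row[k]}" for k in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drift(source: dict, replica: dict) -> dict:
    """Compare two tables keyed by primary key and report how they differ."""
    missing = sorted(k for k in source if k not in replica)
    extra = sorted(k for k in replica if k not in source)
    changed = sorted(
        k for k in source
        if k in replica and row_digest(source[k]) != row_digest(replica[k])
    )
    return {"missing": missing, "extra": extra, "changed": changed}


# Hypothetical sample data: an on-prem table and its cloud copy.
on_prem = {
    1: {"customer": "acme", "balance": 100},
    2: {"customer": "globex", "balance": 250},
    3: {"customer": "initech", "balance": 75},
}
cloud_copy = {
    1: {"customer": "acme", "balance": 100},
    2: {"customer": "globex", "balance": 200},   # update never arrived
    4: {"customer": "umbrella", "balance": 10},  # delete never arrived
}

report = find_drift(on_prem, cloud_copy)
print(report)  # {'missing': [3], 'extra': [4], 'changed': [2]}
```

Even in this toy case, three of four keys disagree after only a handful of missed events, which is why the reconciliation itself becomes something to manage.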

      Secondly, by going down this road, organizations risk the proliferation of even more data silos, this time in the cloud. And while cloud storage is getting cheaper, the cost of egress can creep up and throw budgets sideways, which is not good in a post-COVID world.

      Remember, the standard redundancy of Hadoop was to keep three copies of every block of data (the HDFS default replication factor), which is good for fault tolerance but rather taxing overall, both in terms of throughput and complexity. While moving into the new world of cloud computing, we should avoid old errors.
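
To put the cost of three-way redundancy in numbers, here is a back-of-the-envelope sketch. The 500 TB figure is a hypothetical example and the helper names are invented for illustration; the only fixed input is the replication factor of three mentioned above.

```python
def raw_capacity_tb(logical_tb: float, replication_factor: int = 3) -> float:
    """Raw disk needed to hold `logical_tb` of data at a given replication factor."""
    return logical_tb * replication_factor


def overhead_fraction(replication_factor: int) -> float:
    """Share of raw capacity that is spent on redundant copies."""
    return (replication_factor - 1) / replication_factor


logical = 500.0  # hypothetical data lake size, in TB
print(raw_capacity_tb(logical))        # 1500.0
print(round(overhead_fraction(3), 3))  # 0.667
```

Two thirds of the raw capacity holds duplicates, and every additional copy made for a cloud project adds to that total.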

      Agile Defined

      A different approach to bridging the worlds of on-prem data centers and the growing variety of cloud computing services is offered by a company called Alluxio. From its roots at UC Berkeley’s AMPLab, the company has been focused on solving this problem.

      Alluxio decided to bring the data to computing in a different way. Essentially, the technology provides an in-memory cache that nestles between cloud and on-prem environments. Think of it like a new spin on data virtualization, one that leverages an array of cloud-era advances.

      According to Alex Ma, director of solutions engineering at Alluxio: “We provide three key innovations around data: locality, accessibility and elasticity. This combination allows you to run hybrid cloud solutions where your data still lives in your data lake.”

      The key, he said, is that “you can burst to the cloud for scalable analytics and machine-learning workloads where the applications have seamless access to the data and can use it as if it were local–all without having to manually orchestrate the movement or copying of that data.”
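
The idea of using remote data "as if it were local" can be pictured as a read-through cache: the first read goes to the remote store, and later reads are served from the copy kept next to the compute. The sketch below is a conceptual illustration only, not Alluxio's API; the class name, the dictionary standing in for an on-prem data lake, and the file path are all assumptions.

```python
class ReadThroughCache:
    """Serve reads locally when possible; otherwise fetch from the remote store once."""

    def __init__(self, fetch_remote):
        self._fetch_remote = fetch_remote  # callable: path -> bytes
        self._local = {}                   # path -> cached bytes
        self.remote_reads = 0              # how often we crossed the network

    def read(self, path: str) -> bytes:
        if path not in self._local:
            self._local[path] = self._fetch_remote(path)
            self.remote_reads += 1
        return self._local[path]


# Stand-in for an on-prem data lake (hypothetical path and contents).
on_prem_lake = {"/sales/2020/q1.csv": b"region,revenue\nwest,100\n"}

cache = ReadThroughCache(on_prem_lake.__getitem__)
first = cache.read("/sales/2020/q1.csv")   # fetched from the remote store
second = cache.read("/sales/2020/q1.csv")  # served from the local copy
print(cache.remote_reads)  # 1
```

The application calls `read` the same way both times; nothing in its code orchestrates the copy.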

      In this sense, Alluxio’s approach bridges the best of both worlds: You can preserve your investments in on-prem data lakes while opening a channel to high-powered analytics in the cloud, all without the encumbrance of moving massive amounts of data here or there. 

      “Data locality means bringing the data to compute, whether via Spark, Presto, or TensorFlow,” Ma said. In this scenario, Alluxio is installed alongside the compute framework and uses spare resources on those servers to provide caching tiers for the data. 

      Options, Options

      There are various ways to get it done, depending upon the topology of the extant information architecture. In some environments, if Presto is using a lot of memory, Alluxio can allocate SSDs on the appropriate machines for optimized caching. 

      If you’re tying into HDFS, Presto can make the request, and Alluxio’s intelligent multi-tiering then picks the most efficient option across memory, SSD or spinning disk. It all can be optimized as Alluxio monitors data access patterns over time. 
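
A toy model helps show what tiering across memory, SSD and spinning disk means in practice. The sketch below assumes a simple least-recently-used policy with promotion on access; that policy, the class name and the tier sizes are illustrative assumptions, not a description of how Alluxio actually decides placement.

```python
from collections import OrderedDict


class TieredCache:
    """Toy tiered store: recently used blocks sit in the fastest tier, cold ones sink."""

    def __init__(self, capacities):
        # capacities is ordered fastest to slowest, e.g. [("MEM", 2), ("SSD", 2)].
        self.tiers = [(name, cap, OrderedDict()) for name, cap in capacities]

    def _insert(self, level, key, value):
        if level >= len(self.tiers):
            return  # dropped from the slowest tier; the backing store still has it
        _, cap, blocks = self.tiers[level]
        blocks[key] = value
        if len(blocks) > cap:
            # Demote the least recently used block to the next tier down.
            old_key, old_value = blocks.popitem(last=False)
            self._insert(level + 1, old_key, old_value)

    def put(self, key, value):
        for _, _, blocks in self.tiers:
            blocks.pop(key, None)  # drop any stale copy first
        self._insert(0, key, value)

    def get(self, key):
        for _, _, blocks in self.tiers:
            if key in blocks:
                value = blocks.pop(key)
                self._insert(0, key, value)  # promote on access
                return value
        return None

    def tier_of(self, key):
        for name, _, blocks in self.tiers:
            if key in blocks:
                return name
        return None


cache = TieredCache([("MEM", 2), ("SSD", 2), ("HDD", 4)])
for block in ["a", "b", "c", "d"]:
    cache.put(block, block.upper())

print(cache.tier_of("a"))  # SSD (pushed out of memory by newer blocks)
cache.get("a")             # reading it promotes it back to memory
print(cache.tier_of("a"))  # MEM
print(cache.tier_of("c"))  # SSD (demoted to make room)
```

Access patterns alone decide where each block lives here; a production system would weigh more signals, but the movement between tiers follows the same shape.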

      Regardless of which tools an organization uses (TensorFlow, Presto, Spark, Hive), there will be different usage patterns across CPU, GPU, TPU and RAM. In the case of RAM and available disk types, Alluxio can work with whatever resources are available. 

      “Spark is less memory-intensive,” Ma said, “so we can allocate some memory. So you have choice to figure out what you want to allocate and where. Alluxio allows you to seamlessly access the data in the storage area, wherever it may be.” 

      There’s also the concept of a unified namespace. “What it allows you to do is have a storage configuration that’s centrally managed,” Ma said. “You’re not going into Spark and Presto to set it all up; you’re able to configure Alluxio once, and then Spark or Presto communicate to Alluxio.”
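
Conceptually, a centrally managed namespace behaves like a mount table: one logical path tree, with each subtree mapped to a backing store. The sketch below illustrates that idea with longest-prefix matching. It is not Alluxio's implementation, and the `MountTable` class, host name and bucket name are hypothetical.

```python
class MountTable:
    """Map logical paths to backing-store URIs by the longest matching mount point."""

    def __init__(self):
        self._mounts = {}  # logical prefix -> storage URI

    def mount(self, logical_prefix: str, storage_uri: str) -> None:
        self._mounts[logical_prefix.rstrip("/")] = storage_uri.rstrip("/")

    def resolve(self, logical_path: str) -> str:
        candidates = [
            prefix for prefix in self._mounts
            if logical_path == prefix or logical_path.startswith(prefix + "/")
        ]
        if not candidates:
            raise KeyError(f"no mount covers {logical_path}")
        best = max(candidates, key=len)  # the most specific mount wins
        return self._mounts[best] + logical_path[len(best):]


# Configured once, centrally (hypothetical host and bucket names).
namespace = MountTable()
namespace.mount("/lake", "hdfs://onprem-namenode:8020/warehouse")
namespace.mount("/lake/archive", "s3://example-bucket/archive")

print(namespace.resolve("/lake/sales/q1.parquet"))
# hdfs://onprem-namenode:8020/warehouse/sales/q1.parquet
print(namespace.resolve("/lake/archive/2019/q4.parquet"))
# s3://example-bucket/archive/2019/q4.parquet
```

Each compute engine only needs to know the logical path; where the bytes actually live is decided in one place.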

      The general idea is to create a high-speed repository of data that allows analysts to get the speed and accuracy they demand without giving in to the temptation of data silos. Think of it as a very large stepping stone to the new normal of multi-cloud enterprise computing.

      “With Alluxio, we sit in the middle and offer interfaces on both sides; so we can talk to a variety of different storage layers,” Ma said. “We act as a bridging layer, so you can access any of these technologies.” In short, you can have your data cake and eat it too. 

      As with any quality abstraction layer, solving the data challenge in this manner enables companies to leverage their existing investments. Data centers will have a very long tail, and cloud services will continue to evolve and improve over time. Why not get the best of both worlds?

      Eric Kavanagh is CEO of The Bloor Group, a new-media analyst firm focused on enterprise technology. A career journalist with more than two decades of experience in print, broadcast and Internet media, he also hosts DM Radio, and the ongoing Webcast series for the Global Association of Risk Professionals (GARP).

