Eric Kavanagh, Author at eWEEK

Navigating the Perfect Storm with Applied Intelligence (June 21, 2023)

With budgets now tightening across corporate America, and the era of easy money a fast-fading memory, the time is nigh for achieving a long-sought goal in the world of business intelligence and analytics: closing the loop.

As far back as 2001, at data warehousing firms like my old haunt of Daman Consulting, we touted the value of “operationalizing” business intelligence. The idea was to leverage BI-derived insights within operational systems dynamically, and thus directly improve performance.

Though embedded analytics have been around for decades, it’s fair to say that most BI solutions in this millennium have focused on the dashboard paradigm: delivering high-level visual insights to executives via data warehousing, to facilitate informed decision-making.

But humans are slow, much slower than an AI algorithm in the cloud. In the time it takes for a seasoned professional to make one decision, AI can ask thousands of questions, get just as many answers, and then winnow them down to an array of targeted, executed optimizations.

That’s the domain of applied intelligence, a closed-loop approach that extends traditional data analytics. The goal is to fuse several key capabilities – data ingest, management, enrichment, analysis and decisioning – into one marshaling area for designing and deploying algorithms.
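
To make that loop concrete, here is a minimal sketch in Python of the pattern described above: records are ingested, enriched, scored and acted upon, and the outcome of each executed decision feeds back into the model. Every name in it (the features, the scoring rule, the feedback adjustment) is illustrative, not any particular vendor's platform.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass
class Decision:
    record_id: str
    action: str       # e.g. "approve" or "refer"
    score: float

def enrich(record: dict) -> dict:
    """Enrichment step: derive the features the model needs (illustrative)."""
    record["debt_to_income"] = record["debt"] / max(record["income"], 1)
    return record

def score_risk(record: dict, weight: float) -> float:
    """Stand-in for a trained model: higher debt-to-income means higher risk."""
    return weight * record["debt_to_income"]

def act(decision: Decision) -> bool:
    """Placeholder for the operational system; returns whether the outcome went bad."""
    return False

def closed_loop(records: Iterable[dict], weight: float = 1.0):
    """Ingest -> enrich -> score -> decide -> act -> learn from the outcome."""
    for rec in records:
        rec = enrich(rec)
        risk = score_risk(rec, weight)
        decision = Decision(rec["id"], "refer" if risk > 0.4 else "approve", risk)
        outcome_was_bad = act(decision)          # execute in the operational system
        if decision.action == "approve" and outcome_was_bad:
            weight *= 1.05                       # feedback: nudge the model
        yield decision, weight

if __name__ == "__main__":
    demo = [{"id": "a1", "debt": 20_000, "income": 80_000},
            {"id": "a2", "debt": 45_000, "income": 60_000}]
    for d, w in closed_loop(demo):
        print(d, f"model weight now {w:.2f}")
```

In a real deployment the scoring function would be a trained model and the feedback step a retraining or recalibration job, but the shape of the loop is the same.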

There are many benefits to this approach: transparency, efficiency, accountability; and most importantly in today’s market? Agility. During times of great disruption, organizations must have the ability to pivot quickly. And when those decisions are baked in via automation? All the better.

It also helps in the crucial domain of explainability, the capacity to articulate how an artificial intelligence model came to its conclusion. How explainable is a particular decision to grant a mortgage loan? How repeatable? What are the biases inherent in the models, in the data? Is the decision defensible?
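
To see what explainability can look like in practice, consider a hedged sketch using scikit-learn: a simple logistic regression yields not just a probability but per-feature "reason codes" for a hypothetical loan decision. The features and training data are invented for illustration; production-grade explainability leans on richer techniques such as SHAP values or surrogate models.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training data: [debt_to_income, years_employed, prior_defaults] (hypothetical features)
X = np.array([[0.2, 10, 0], [0.8, 1, 2], [0.3, 6, 0],
              [0.9, 2, 1], [0.4, 8, 0], [0.7, 3, 1]])
y = np.array([1, 0, 1, 0, 1, 0])   # 1 = loan repaid, 0 = default

model = LogisticRegression().fit(X, y)

applicant = np.array([0.6, 4, 1])
prob = model.predict_proba(applicant.reshape(1, -1))[0, 1]

# Per-feature contribution to the log-odds: coefficient * feature value.
# Sorting these gives simple "reason codes" for why the model leaned the way it did.
names = ["debt_to_income", "years_employed", "prior_defaults"]
contributions = sorted(zip(names, model.coef_[0] * applicant),
                       key=lambda kv: abs(kv[1]), reverse=True)

print(f"P(repay) = {prob:.2f}")
for name, c in contributions:
    print(f"{name:>15}: {c:+.2f}")
```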

On a related topic: The AI Market: An Overview

Take It To the Bank

The rise of fintech startups and neobanks, coupled with rapidly changing interest rates, has put tremendous pressure on traditional financial market leaders to innovate rapidly but safely. Rather than embrace a rear-guard strategy, many firms are looking to AI to regain momentum.

As chief product and technology officer (CPTO) for FICO, Bill Waid has overseen a wide range of banking innovations: UBS reduced card fraud by 74%, while Mastercard optimized fraud detection in several key ways, including automated messaging to solve the omni-channel conundrum of communications.

The Mastercard story demonstrates how a large financial institution is now able to dynamically identify, monitor, and manage client interactions across a whole host of channels – and fast enough to prevent money loss. A nice side benefit? Less-annoyed customers.

In a recent radio interview, Waid explained another situation where collaboration improves marketing. “In banking, from a risk perspective, one of the most profitable products is credit card. So if you were to ask somebody from risk: which would you push, it would be the credit card.”

But other departments may disagree. “If you ask the marketing person, they have all the stats and the numbers about the uptake, and they might tell you no, it’s not the credit card, at least not for this group (of customers), because they’re actually looking for a HELOC or an auto loan.”

The point is that you can drive away business by making the wrong suggestion. Without collaborating around common capabilities from a centralized platform, Waid says, that mistake would have likely gone into production, hurting customer loyalty and revenue.

With an applied intelligence platform, he says, key stakeholders from across the business all have their fingers in the pie. This helps ensure continuity and engagement, while also providing a shared baseline for efficacy and accountability.

Think of it as a human operating system for enterprise intelligence, one that’s connected to corporate data, predictive models, and decision workflows, thus achieving cohesion for key operational systems. In the ideal scenario, it’s like a fully functioning cockpit for the enterprise.

This transparency leads to confidence, a cornerstone of quality decision outcomes: “That confidence comes in two dimensions,” he says. “The first is: can you understand what the machine is doing? Do you have confidence that you know why it came to that prediction?

“The second element is that in order for the analytic to be useful, it’s gotta get out of the lab. And many times, I see that the analytic comes after the operationalization of a process, where there is more data, or a flow of data that’s well warranted to an analytic.”

For more information, also see: Best Data Analytics Tools

Bottom Line: The Analytic Becomes an Augmentation

This is where rubber meets road for applied intelligence: the analytic becomes an augmentation. And when the business has that transparency, they get comfortable, and they adopt the insight into their own operational workflow. That’s when the intended value of the machine is felt.

“Platforms provide unification: bringing process, people, and tech together,” Waid says. And as AI evolves, with Large Language Models and quantum computing closing in, it’s fair to say that the practices of applied intelligence will provide critical stability, along with meaningful insights.

Also see: 100+ Top AI Companies 2023

Deploying the Best of Both Worlds: Data Orchestration for Hybrid Cloud (June 26, 2020)

Did someone tell your data to shelter in place? That wouldn’t make any sense, would it? Ironically, for vast troves of valuable enterprise data, that might as well be the case, because massive, compute-bound data silos are practically everywhere in the corporate world.

Hadoop played a role in creating this scenario, because many large organizations sought to leverage the twin promises of low-cost storage and massively parallel computing for analytics. But a funny thing happened to the yellow elephant: It was largely obviated by cheap cloud storage.

Seemingly overnight, the price of cloud storage dropped so precipitously that the cost-benefit analysis of using the Hadoop Distributed File System (HDFS) on-premises for new projects turned upside down. Even the term “Hadoop” disappeared from the names of major conferences.

That’s not to say there isn’t valuable data in all those HDFS repositories, however. Many important initiatives used this technology in hopes of generating useful insights. But with budgets moving away from Hadoop, another strategy is required to be successful. 

What about computing? Suffice to say, cloud providers now offer a robust service in this territory. And relatively recent innovations such as separating computing from storage have also played a part in paving the way for cloud-based computing to take on all manner of workloads.

So the cloud now easily eclipses most on-premises environments in all the major categories: speed, cost, ease of use, maintenance, scalability. But there are barriers to entry; or at least pathways that should be navigated carefully while making the move. Mistakes can be very costly!

But how do you get the data there? Amazon actually offers a “snow truck” (the AWS Snowmobile) that will come to your data center, load it up one forklift at a time, and haul it, old-school, to its facility. That approach can certainly work for a quick-and-relatively-dirty migration, but it ignores the magic of the cloud.

Seeing Triple

As the concept of “cloud-native” gets hashed out on whiteboards in boardrooms around the business world, the reality taking shape is that a whole new generation of solutions is being born. These systems are augmented with high-powered analytics and artificial intelligence. 

This new class of application is almost exclusively built on a microservices architecture with Kubernetes as the foundation. There is tremendous value to this approach, because scalability is built into its DNA. Taking advantage of this new approach requires a commitment to change.

Simply shipping your data and applications in toto to a cloud provider absolutely does not solve this challenge. In fact, it will likely result in a significant rise in total cost of ownership (TCO), thus undermining a major driver for moving to the cloud.

Another strategy involves porting specific data sets into the cloud to deploy the power of all that computation. This often involves making copies of the data. While change-data-capture can be used to keep these disparate environments in sync, there are downsides to this approach.

In the first place, CDC solutions always need to be meticulously managed. Small amounts of data drift can quickly become larger problems. This is especially problematic when the derived analytics are used for mission-critical business decisions, or customer-experience initiatives.

Secondly, by going down this road, organizations risk the proliferation of even more data silos–this time in the cloud. And while cloud storage is getting cheaper, the cost of egress can creep up and throw budgets sideways; this is not good in a post-COVID world.
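
As for the first of those drawbacks, what does keeping replicated copies in sync actually involve? Below is a minimal, illustrative drift check in Python: it compares an on-premises table with its cloud copy by primary key and a version marker (an updated_at timestamp or CDC log sequence number, say) and reports missing, stale and orphaned rows. The column choices and sample data are hypothetical; a production CDC pipeline would perform this continuously and at far greater scale.

```python
def find_drift(source_rows, replica_rows):
    """
    Compare two copies of a table by primary key and version.
    Returns keys the replica is missing, keys where it lags the source,
    and keys that were deleted at the source but linger in the replica.
    """
    source = {key: version for key, version in source_rows}
    replica = {key: version for key, version in replica_rows}

    missing = sorted(k for k in source if k not in replica)
    stale = sorted(k for k in source if k in replica and replica[k] < source[k])
    orphaned = sorted(k for k in replica if k not in source)
    return missing, stale, orphaned

if __name__ == "__main__":
    on_prem = [(1, 5), (2, 7), (3, 2), (4, 9)]   # (primary_key, version)
    cloud   = [(1, 5), (2, 6), (5, 1)]
    missing, stale, orphaned = find_drift(on_prem, cloud)
    print("missing in cloud copy:", missing)     # [3, 4]
    print("stale in cloud copy:  ", stale)       # [2]
    print("deleted at source:    ", orphaned)    # [5]
```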

Remember, the standard redundancy of Hadoop was to have three copies of every datum, which is good for disaster recovery but rather taxing overall, both in terms of throughput and complexity. While moving into the new world of cloud computing, we should avoid old errors.

Agile Defined

A different approach to bridging the worlds of on-prem data centers and the growing variety of cloud computing services is offered by a company called Alluxio. From its roots in UC Berkeley’s AMPLab, the company has been focused on solving this problem.

Alluxio decided to bring the data to computing in a different way. Essentially, the technology provides an in-memory cache that nestles between cloud and on-prem environments. Think of it like a new spin on data virtualization, one that leverages an array of cloud-era advances.

According to Alex Ma, director of solutions engineering at Alluxio: “We provide three key innovations around data: locality, accessibility and elasticity. This combination allows you to run hybrid cloud solutions where your data still lives in your data lake.”

The key, he said, is that “you can burst to the cloud for scalable analytics and machine-learning workloads where the applications have seamless access to the data and can use it as if it were local–all without having to manually orchestrate the movement or copying of that data.”

In this sense, Alluxio’s approach bridges the best of both worlds: You can preserve your investments in on-prem data lakes while opening a channel to high-powered analytics in the cloud, all without the encumbrance of moving massive amounts of data here or there. 

“Data locality means bringing the data to compute, whether via Spark, Presto, or Tensorflow,” Ma said. In this scenario, Alluxio is installed alongside the compute framework and deploys unused resources on those servers to provide caching tiers for the data. 
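
In practice, that locality is exposed through familiar file-system semantics. As a hedged illustration, assuming the Alluxio client libraries are on the Spark classpath and an Alluxio master is reachable at a host called alluxio-master on port 19998 (the host name and dataset path here are assumptions, not a prescribed setup), a PySpark job simply reads an alluxio:// URI the way it would an HDFS path:

```python
from pyspark.sql import SparkSession

# Assumes the Alluxio client jar is available to the Spark driver and executors,
# and that an Alluxio master is reachable at alluxio-master:19998 (hypothetical host).
spark = (SparkSession.builder
         .appName("alluxio-locality-sketch")
         .getOrCreate())

# Read through Alluxio's HDFS-compatible interface; hot data is served from the
# cache tiers, cold data is fetched from the underlying store on demand.
events = spark.read.parquet("alluxio://alluxio-master:19998/datasets/events")

daily = (events.groupBy("event_date")
         .count()
         .orderBy("event_date"))

daily.show(20, truncate=False)
```

Whether the underlying bytes live in an on-prem HDFS cluster or a cloud object store, the job's code does not change; the caching layer decides what gets served from memory, SSD or the backing store.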

Options, Options

There are various ways to get it done, depending upon the topology of the extant information architecture. In some environments, if Presto is using a lot of memory, Alluxio can allocate SSDs on the appropriate machines for optimized caching. 

If you’re tying into HDFS, Presto can make the request, and Alluxio’s intelligent multi-tiering then uses whatever the most efficient approach might be–spanning memory, SSD or spinning disc. It all can be optimized as Alluxio monitors data access patterns over time. 

Regardless of which tools an organization uses–Tensorflow, Presto, Spark, Hive–there will be different usage patterns across CPU, GPU, TPU and RAM. In the case of RAM and available disk types, Alluxio can work with whatever resources are available. 

“Spark is less memory-intensive,” Ma said, “so we can allocate some memory. So you have choice to figure out what you want to allocate and where. Alluxio allows you to seamlessly access the data in the storage area, wherever it may be.” 

There’s also the concept of a Unified Name Space. “What it allows you to do is have a storage configuration that’s centrally managed,” Ma said. “You’re not going into Spark and Presto to set it all up; you’re able to configure Alluxio once, and then Spark or Presto communicate to Alluxio.”

The general idea is to create a high-speed repository of data that allows analysts to get the speed and accuracy they demand without giving in to the temptation of data silos. Think of it as a very large stepping stone to the new normal of multi-cloud enterprise computing.

“With Alluxio, we sit in the middle and offer interfaces on both sides; so we can talk to a variety of different storage layers,” Ma said. “We act as a bridging layer, so you can access any of these technologies.” In short, you can have your data cake and eat it too. 

Like any quality abstraction layer, solving the data challenge in this manner enables companies to leverage their existing investments. Data centers will have a very long tail, and cloud services will continue to evolve and improve over time. Why not get the best of both worlds?

Eric Kavanagh is CEO of The Bloor Group, a new-media analyst firm focused on enterprise technology. A career journalist with more than two decades of experience in print, broadcast and Internet media, he also hosts DM Radio, and the ongoing Webcast series for the Global Association of Risk Professionals (GARP).

Measure Twice, Cut Once: Getting a Modern ERP Deployment Right (February 12, 2020)

The best carpenters make the fewest chips.  ~ English proverb, c.1500s

Waste not, want not. This classic cliché keeps us all on notice that frugality will pay dividends in more ways than one, and that frivolous behavior will not foster good fortune.

Carpenters understand this truism intrinsically, because they simply must. In the business of woodworking, you need to be extra careful not to push the envelope. Careful is as careful does.

This is especially true for the age-old practice of making furniture, a discipline that often demonstrates the finest skills a carpenter can achieve: design, artistry, attention to detail.

That’s a mantle the creator of Sauder Woodworking chose to carry. Launched by Erie Sauder from a barn behind his Archbold, Ohio, home, way back in 1934, the original focus was custom cabinetry and church pews. But before long, Erie saw growing opportunities.

Shortly after the saws first sang, he envisioned creating occasional tables from the leftover wood. This quickly developed into a line of business all to itself, so much so that in 1940, a traveling salesman who experienced this craftsmanship placed an unbelievable order.

Much to Sauder’s surprise, this enterprising agent asked for not 10, or 50, or even 500 tables. He placed an order for 25,000! The request seemed impossible at the time, but the burgeoning company rolled up its collective sleeves, broke out the tools and got to work.

This was an early example of a practice now referred to as “scaling out” the workload. Even back during the height of World War II, this particular company demonstrated a remarkable ability to meet customer demands, in an industry that requires a rather sophisticated supply chain.

But the spirit of innovation didn’t stop there. By 1953, Sauder took its game to the next level by rethinking the nature of furniture assembly. They created the first line of tables that would be shipped in parts, then assembled by the customer. So began the “some assembly required” era.

Modern Wonders

Today, Sauder excels as the leading producer of ready-to-assemble furniture in America, and is one of the top five residential furniture manufacturers in the country. They boast 4 million square feet of floor space, filled with high-tech furniture-making equipment from around the world.

And the commitment to the finest technology extends beyond the shop floor, all the way to the back office. In fact, Sauder is one of the first companies in the world to embrace the most advanced system for business management available today, a completely in-memory solution.

Yes, the venerable furniture manufacturer from small-town Ohio took the bull by its horns, and is now effectively running SAP S/4HANA, the next-generation Enterprise Resource Planning (ERP) solution by the German software giant. And they got it up and running in months.

“I’m excited to be running S/4HANA, and we’re glad that we can take advantage of newer technologies that are available today,” said Jan Arvay, Vice President of Information Technology for Sauder, who’s been with the company for more than 20 years.

Her colleague, Pam Vocke, echoed that sentiment: “I too have been with Sauder over 25 years, had the opportunity to be involved in many of our SAP implementations throughout the years, and we are very excited about being on S/4HANA and to share our story.”

Started Using SAP in 2004

Vocke explained that Sauder began implementing SAP back in 2004: “We started with order to cash, and we took a phased implementation approach, actually finishing up implementing all of the main modules by 2015; and that’s the time that S/4HANA was announced.”

This presented a significant conundrum. On the one hand, the decision could have been taken to optimize all the modules now in place on-premises. This would certainly require less work than rolling out the new in-memory solution. But S/4HANA would provide significant benefits.

Here’s where the DNA of this enterprise came shining through. Suffice to say, carpenters have a particular understanding of sunk costs. One false move and an expensive resource can become significantly less valuable. Hence the care and concern with which decisions are made.

They spent 2016 really evaluating their options, living up to the carpenter’s mantra: measure twice, cut once. And ultimately? The Sauder team doubled down; they decided to get way out ahead of SAP’s 2025 maintenance deadline.

“We’re in business to make furniture,” said Vocke. “We really wanted to be on our own timeline. We took the necessary steps, converted, and now we feel that we’re in a great position to take advantage and optimize the use of S/4HANA.”

Greenfield or Brownfield?

The next big decision? Whether to go greenfield (new implementation) or brownfield (system conversion). The former involves setting up a clean installation of the ERP, then importing all the relevant data. The latter focuses on converting the existing system, leaving all the data in place.

“We’re in a great position because we took a brownfield approach, and we’re actually kicking off a finance optimization project,” Arvay said. “We’re going to hit different areas of the business based upon strategic initiatives, to turn on functionality and take advantage of what’s in it for us.”

Arvay explained that a key consideration for their decision to go brownfield was the number of employees who would be impacted by the conversion: all 2,000 of them! “We didn’t want to deal with a lot of change management, and actually, it was a smooth cut-over.”

How smooth was the transition? Let’s just say they hit the magic number: “The second night we had zero help desk calls,” Arvay said. Wow. Sometimes, zero is the ideal number.

“I think we were pleasantly surprised,” said Vocke. “We had a great group of consultants that were assisting us. We had SAP Value Assurance, and the Enterprise Support team at SAP.”

Now Sauder is prepared to scale out any which way, as they continue their nearly century-long run, carving out a reputation of quality, innovation and creativity.


AI & Machine Learning Drive Data Economy (December 10, 2019)

“Make it so!” Such is the ideal scenario for every manager everywhere, right? You have an idea? You figured it out? Just assign that task to a trusted colleague and watch as the magic happens. Well, maybe that’s a reality best expressed in a famous sci-fi TV series, but …

… the best news these days? Science fiction often predicts the future quite accurately; and the 1990s future is, well, today! In other words, the vision espoused by Star Trek: The Next Generation can now, in many ways, be called realistic, at least with respect to the dynamic use of data.

Let’s face it: Data drives the Information Economy, especially when modern technologies such as artificial intelligence and machine learning are trained on it. That’s the key to the next generation of innovation for data-driven organizations, and it comes just in time.

Regulatory Drivers

Why just in time? For starters, compliance initiatives from around the globe are now closing in on old ways of doing business. The General Data Protection Regulation, adopted by the European Union and put into action on May 25, 2018, places serious mandates on any organization that caters to citizens of the EU.

The spectrum of impact that GDPR creates can barely be measured. Though not prescriptive, the regulation covers a vast swath of territory: lawfulness, transparency, purpose, minimization of data used, processing, accuracy, storage, security and accountability. Basically everything.

One of the most significant challenges associated with GDPR is the so-called right to be forgotten. In essence, this policy suggests that companies must eradicate all personal data of any consumer who demands as much. Those in the data industry realize how hard this truly is.

In the first place, data about consumers finds its way into numerous enterprise systems: enterprise resource planning, customer relationship management, business intelligence, sales and marketing automation–the list goes on and on. What’s more, redundancy reigns.

Then there’s the issue of identity resolution: who is this consumer, specifically? Rest assured, there are cases of mistaken identity. And to the point of redundant data, there are many cases of one consumer having multiple records at any given organization. This is very common.
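
To see why this is harder than it sounds, consider a deliberately simplified sketch in Python: several systems hold records for what may be the same person, matched here by nothing more sophisticated than a normalized email address. The systems, records and matching rule are all hypothetical; real erasure also has to reach backups, logs, warehouses and downstream copies.

```python
import re

def normalize_email(email: str) -> str:
    """Crude identity key: lower-case and strip '+tag' aliases (illustrative only)."""
    local, _, domain = email.strip().lower().partition("@")
    local = re.sub(r"\+.*$", "", local)
    return f"{local}@{domain}"

def erase_consumer(systems: dict, email: str) -> dict:
    """Remove matching records across all systems; report counts for the audit trail."""
    target = normalize_email(email)
    purged = {}
    for name, records in systems.items():
        keep = [r for r in records if normalize_email(r["email"]) != target]
        purged[name] = len(records) - len(keep)
        systems[name] = keep
    return purged

if __name__ == "__main__":
    systems = {
        "crm":       [{"email": "Jane.Doe+promo@example.com", "name": "Jane Doe"}],
        "marketing": [{"email": "jane.doe@example.com", "opt_in": True}],
        "billing":   [{"email": "j.doe@example.com", "plan": "basic"}],
    }
    print(erase_consumer(systems, "jane.doe@example.com"))
    # {'crm': 1, 'marketing': 1, 'billing': 0}
```

The unmatched billing record in that example is exactly the kind of near-miss that makes identity resolution, and therefore defensible erasure, genuinely difficult.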

While GDPR does not technically apply to non-EU citizens, it’s widely recognized as a straw in the wind, as evidenced by the arguably more stringent California Consumer Privacy Act, signed into law by Gov. Jerry Brown in 2018, which goes into effect Jan. 1, 2020. The stars have aligned; marketers must innovate.

Then there’s the New York Privacy Act, which would go further than either of its predecessors. Though the NYPA was not enacted, it represents a serious shot across the bow, because it would allow citizens to sue individual companies for privacy violations, likely triggering an onslaught of lawsuits.

The risk of being sued en masse for privacy violations will send chills down the spines of executives and board members everywhere. A genuine strategy is necessary to effectively prepare for this contingency, one that is agile, embraces governance and respects privacy.

Business Drivers

Of course, regulations don’t appear out of nowhere. Social, cultural and political dynamics coalesce over time to create a groundswell of interest in a particular topic. Sometimes, major events spur such developments, such as with the Enron scandal, which led to Sarbanes-Oxley.

In this instance, long-simmering concerns around privacy and practice finally boiled over in the EU with GDPR. But stateside, we addressed similar concerns with the CAN-SPAM Act back in 2003, which regulates, but certainly does not prohibit, the practice of sending bulk emails.

Fast forward to the brink of 2020, and we find a vastly different world. The meteoric rise of Facebook, in and of itself, acted like a bull in the china shop of personal privacy. To wit: the social media giant got tagged by the Federal Trade Commission with a $5 billion fine!

This got just about everyone in the business world thinking much more seriously about the importance of privacy. No doubt, directors and rank-and-file members of data governance committees everywhere took note, redoubling their efforts to get senior management engaged.

Add to this the rapid maturation of mobile advertising combined with location intelligence, and there’s now a whole new world of opportunity–and responsibility–for organizations to embrace and understand. Simply put, the reality of omni-channel marketing is overwhelming.

The Right to be Respected

Getting back to our Star Trek metaphor, let’s think rationally, like Commander Data from The Next Generation, or Spock from the original series: We can distill all these privacy concerns and related information practices down to a basic principle: the right to be respected.

In all iterations of the Star Trek genre, a code of conduct pervades, one that is collegial, diligent and purposeful. Every member of the crew is expected to treat each other with respect, regardless of rank or role. This standard of respect forms a foundation of trust and transparency.

As regards privacy regulations in general, the key for success in today’s world requires the ability to define clear policies, then map them to all manner of operational systems. Considering the scale of this challenge, traditional IT solutions will not suffice.

Most legacy systems lack the scalability, computational power and basic functionality to get the job done. However, with a cloud-centric, microservices architecture, old-world IT can be woven into a hybrid-cloud fabric, allowing much data and even some systems to remain intact.

This hybrid approach is the hallmark for the next generation of technologies which benefit from an array of advances: in-memory computing to deliver necessary speed; artificial intelligence and machine learning to tackle the tedious; and a cloud-based approach to centralize control.

The AI & Operations Cockpit

Which brings us back to the bridge, that place where Captains James T. Kirk, Jean-Luc Picard and Kathryn Janeway commanded their starships. Today’s leading-edge organizations can now assimilate the kind of power and control enjoyed by those fictional characters.

In the context of privacy, and the right to be respected, here are just a few examples of how the new era of data-driven organizations can tackle the challenges of today’s world:

  • machine learning algorithms can crawl across massive troves of data to dynamically determine where Personally Identifiable Information (PII) is located; this paves the way to addressing core concerns with GDPR, CCPA and related regulations (a simplified sketch of such a scan follows this list);
  • artificial intelligence (in the form of neural networks, aka deep learning) can pave the way not only to self-driving cars, but also autonomous information systems, ranging from procurement to pricing, classification to customer service;
  • real-time processing can vastly improve customer experience, by dynamically provisioning information from a wide range of omni-channel systems, thus informing representatives of the very latest developments, across practically any touchpoint with customers or clients;
  • machine-learning algorithms can successfully segment any collection of data points — whether for customers, products, incidents or other use cases — into discrete groups, thus enabling both frontline workers and decision-makers by clarifying focus areas and expediting results; and
  • advanced analytics can reveal deep insights about customer sentiment, drawing from a range of sources including social media, text analytics, even voice and facial recognition software, which have reached the point of providing highly accurate assessments of temperament.
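
On the first of those points, a production system would rely on trained classifiers and context, but the core idea (scan samples of data and flag columns whose values look like PII) can be sketched in a few lines of Python. The patterns, threshold and column names below are illustrative only, not a compliance tool:

```python
import re

# Crude pattern heuristics for a few common PII types (illustrative only;
# a real scanner would combine patterns with trained models and context).
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\b(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}\b"),
    "ssn":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scan_table(rows, sample_size=100, threshold=0.5):
    """Flag columns where a majority of sampled values match a PII pattern."""
    findings = {}
    sample = rows[:sample_size]
    if not sample:
        return findings
    for column in sample[0]:
        values = [str(r.get(column, "")) for r in sample]
        for label, pattern in PII_PATTERNS.items():
            hits = sum(bool(pattern.search(v)) for v in values)
            if hits / len(values) >= threshold:
                findings.setdefault(column, []).append(label)
    return findings

if __name__ == "__main__":
    rows = [
        {"id": 1, "contact": "ada@example.com", "notes": "called 415-555-0199"},
        {"id": 2, "contact": "bob@example.org", "notes": "prefers email"},
    ]
    print(scan_table(rows))   # {'contact': ['email'], 'notes': ['phone']}
```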

We may not yet have the tricorder, that nifty device Starfleet crews use to scan just about anything. And we might not yet have mastered the art of teleportation. But the nexus of AI, in-memory computing, inexpensive storage and human creativity? Sure feels a lot like Star Trek!


Batch Goes Out the Window: The Dawn of Data Orchestration (December 4, 2019)

Around and around and around it goes … where it stops, nobody knows! At least not until now.

“It,” of course, is data, and its mechanism of transport has long been the tried-and-relatively-true practice of extract, transform, load, aka ETL. That’s now finally changing.

Granted, there have been other ways of moving data: Change data capture (CDC), one of the leanest methods, has been around for decades and remains a very viable option; the old File Transfer Protocol (FTP) can’t be overlooked; nor can the seriously old-fashioned forklifting of DVDs.

Go here to see eWEEK’s listing of Top ETL Tool Vendors.

Data virtualization 1.0 brought a novel approach as well. This approach leveraged a fairly sophisticated system of strategic caching. High-value queries would be preprocessed, and certain VIP users would benefit from a combination of pre-aggregation and stored result sets.
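
The mechanics of that first-generation approach are easy to picture. Here is a toy sketch in Python of a stored-result-set cache that gets pre-warmed for a couple of known high-value queries; the query text and the stand-in federated query function are invented for illustration:

```python
import time
from functools import lru_cache

def run_federated_query(sql: str):
    """Stand-in for a query pushed down to remote sources (slow on purpose)."""
    time.sleep(0.5)
    return [("total_sales", 42_000_000)]

@lru_cache(maxsize=256)
def cached_query(sql: str):
    """Stored result sets: identical queries are answered from memory."""
    return tuple(run_federated_query(sql))

# Pre-aggregation for the high-value queries VIP dashboards are known to issue.
VIP_QUERIES = [
    "SELECT region, SUM(sales) FROM orders GROUP BY region",
    "SELECT SUM(sales) FROM orders WHERE quarter = 'Q4'",
]
for q in VIP_QUERIES:
    cached_query(q)            # warm the cache before users arrive

start = time.perf_counter()
cached_query(VIP_QUERIES[0])   # served from cache, no federated round trip
print(f"warm-cache response in {time.perf_counter() - start:.4f}s")
```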

During the rise of the open-source Hadoop movement about a decade ago, some other curious innovations took place, notably the Apache Sqoop project. Sqoop is a command-line interface application for transferring data between relational databases and Hadoop. Sqoop proved very effective at pulling data from relational sources and dropping it into HDFS. That paradigm has somewhat faded, however.

But a whole new class of technologies–scalable, dynamic, increasingly driven by artificial intelligence–now threatens the status quo. So significant is this change that we can reasonably anoint a new term in the lexicon of information management: data orchestration.

There are several reasons why this term makes sense. First and foremost, an orchestra comprises many different instruments, all woven together harmoniously; today’s data world similarly boasts many new sources, each with its own frequency, rhythm and nature.

Go here to see eWEEK’s Resource Page on Intent-Based Networking.

Secondly, the concept of orchestration implies much more than integration, because the former connotes significantly more complexity and richness. That maps nicely to the data industry these days: The shape, size, speed and use of data all vary tremendously.

Thirdly, the category of data orchestration speaks volumes about the growing importance of information strategy, arguably among the most critical success factors for business today. It’s no longer enough to merely integrate it, transport it or change it; data must be leveraged strategically.

Down the Batch!

As the mainstay of data movement over the past 30 years, ETL took the lead. Initially, custom code was the way to go, but as Rick Sherman of Athena IT Solutions once noted: “Hand coding works well at first, but once the workloads grow in size, that’s when the problems begin.”

As the information age matured, a handful of vendors addressed this market in a meaningful way, including Informatica in 1993, Ab Initio (a company that openly eschews industry analysts) in 1995, then Informix spin-off Ascential (later bought by IBM) in 2000. That was the heyday of data warehousing, the primary driver for ETL.

Companies realized they could not effectively query their enterprise resource planning (ERP) systems to gauge business trajectory, so the data warehouse was created to enable enterprise-wide analysis.

The more people got access to the warehouse, the more they wanted. This resulted in batch windows stacking up to the ceiling. Batch windows are the time slots within which data engineers (formerly called ETL developers) had to squeeze in specific data ingestions.

Within a short span of years, data warehousing became so popular that a host of boutique ETL vendors cropped up. Then, around the early- to mid-2000s, the data warehouse appliance wave hit the market, with Teradata, Netezza, DATAllegro, Dataupia and others climbing on board.

This was a boon to the ETL business but also to the Data Virtualization 1.0 movement, primarily occupied by Composite Software (bought by Cisco, then spun out, then picked up by TIBCO) and Denodo Technologies. Both remain going concerns in the data world.

Big Data Boom

Then came big data. Vastly larger, much more unwieldy and in many cases faster than traditional data, this new resource upset the apple cart in disruptive ways. As mega-vendors such as Facebook, LinkedIn and others rolled their own software, the tech world changed dramatically. The proliferation of database technologies, fueled by open-source initiatives, widened the landscape and diversified the topography of data. These included Facebook’s Cassandra, 10gen’s MongoDB and MariaDB (spun out by MySQL founder Monty Widenius the day Oracle bought Sun Microsystems)–all of which are now pervasive solutions.

Let’s not forget about the MarTech 7,000. In 2011, it was the MarTech 150. By 2015, it was the MarTech 2,000. It’s now 7,000 companies offering some sort of sales or marketing automation software. All those tools have their own data models and their own APIs. Egad!

Add to the mix the whole world of streaming data. By open-sourcing Kafka to the Apache Foundation, LinkedIn let loose the gushing waters of data streams. These high-speed freeways of data largely circumvent traditional data management tooling, which can’t stand the pressure.
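
The programming model explains the appeal. As a hedged sketch using the open-source kafka-python client, and assuming a broker at localhost:9092, a topic named clickstream and a consumer group of our own choosing (all assumptions, not a reference configuration), a stream consumer looks like this:

```python
import json
from kafka import KafkaConsumer   # pip install kafka-python

# Assumes a broker at localhost:9092 and a "clickstream" topic (both hypothetical).
consumer = KafkaConsumer(
    "clickstream",
    bootstrap_servers="localhost:9092",
    group_id="orchestration-demo",
    auto_offset_reset="earliest",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # Each event is handled as it arrives -- routed, enriched or written onward --
    # rather than waiting for a nightly batch window to open.
    print(message.topic, message.partition, message.offset, event.get("user_id"))
```

Events are handled the moment they arrive; there is no batch window to wait for, which is precisely why traditional tooling struggles to keep pace.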

Doing the math, we see a vastly different scenario for today’s data, as compared to only a few years ago. Companies have gone from relying on five to 10 source systems for an enterprise data warehouse to now embracing dozens or more systems across various analytical platforms.

Meanwhile, the appetite for insights is greater than ever, as is the desire to dynamically link analytical systems with operational ones. The end result is a tremendous amount of energy focused on the need for … (wait for it!) … meaningful data orchestration.

For performance, governance, quality and a vast array of business needs, data orchestration is taking shape right now out of sheer necessity. The old highways for data have become too clogged and cannot support the necessary traffic. A whole new system is required.

To wit, there are several software companies focused intently on solving this big problem. Here are just a few of the innovative firms that are shaping the data orchestration space. (If you’d like to be included in this list, send an email to info@insideanalysis.com.)

Ascend

Going all the way upstream to solve the data provisioning challenge, Ascend uses AI to dynamically generate data pipelines based upon user behavior. In doing so, the company cut code by an estimated 80%, and is able to very quickly provision not only the pipelines themselves, but also the infrastructure necessary to run them. Ascend also enables the querying of data pipelines themselves, thus providing valuable visibility into the movement and transformation of information assets. This is a very clever approach for tackling the preponderance of data provisioning for an organization.

Dell Boomi

An early player in the cloud integration business, Boomi was kept as a strategic component of Dell’s software portfolio while other assets were sold off, arguably to help finance the 2016 acquisition of EMC. Boomi takes a comprehensive approach to hybrid cloud integration that includes a Master Data Hub, API management, AI-augmented workflow management, app development and B2B/EDI management. As such, it has become a premier vendor in the large-scale cloud integration space, especially for business-to-business connections.

GeminiData

Call it Data Virtualization 2.0. GeminiData touts a “Zero Copy” approach that grabs data as needed, as opposed to the ETL-oriented data warehouse model. The company’s founders have a history with Splunk and leveraged their experience with the systems monitoring giant by enabling the rapid integration of log data. They also have ties to Zoomdata, which created the “micro-query” concept. By allowing users to mix and match both structured and unstructured data, in a real-time sandbox that can connect to a whole range of different sources and targets, GeminiData has opened the door to a whole new architectural approach for data discovery.

HVR

Hailing from the original real-time database replication offering, GoldenGate (later acquired by Oracle), the founders of HVR focused on the multisource, multitarget reality of today’s data world. Using log-based change data capture for data replication, the company can solve all manner of data-provisioning challenges. Whether for data lakes or legacy operational systems, HVR gives its users the ability to replicate large data sets very quickly. The company prides itself on the durability and security of its platform, which is designed to serve as a marshaling area for a wide range of enterprise data.

StreamSets

Another innovator in this fast-developing space, StreamSets touts what it calls the industry’s first data operations platform. Recognizing the multifaceted, multivariate nature of modern data, the company built a control panel for data movement. Akin to a sophisticated water management system for a large, high-end spa, this platform builds upon several key design principles: low-code development; minimalist schema design; both streaming and batch as first-class citizens; DevOps capabilities for data; awareness of data drift; decoupled data movement and infrastructure; inspection of data flows; and data “sanitization” upon ingest.

