What is Predictive Analytics?

Predictive analytics is the use of data, statistical algorithms, and artificial intelligence (AI) and machine learning (ML) techniques to identify the likelihood of future outcomes based on historical data. The goal is to go beyond knowing what has happened and assess what will happen.

Enterprise adoption of predictive analytics is growing rapidly, yet there is still some confusion around the term. Let’s examine what it is, how it differs from other areas of data analytics, and how it is used in the real world.

Ajay Khanna, CEO and Founder of Tellius, gives an example of inventory management during the peak holiday season. By applying predictive analytics models to in-house data over a certain time period, retailers can better understand consumer behavior, such as buying patterns, likelihood to return, and in-store foot traffic. This level of insight can help them forecast product demand, improve customer experience, and reduce operational costs through better staff and resource management.

“Retailers can reach guests with personalized offers based on past data and reliably predict and anticipate future purchases,” said Khanna.

Predictive Analytics History and Growth 

Predictive analytics arguably began in the 1940s with early electromechanical computing machines. Notable innovations came out of government projects, like Alan Turing’s Bombe machine and the Manhattan Project’s Monte Carlo simulation to predict the behavior of atoms during a chain reaction. When electronic computers came to the fore in the 1950s, research organizations were able to make predictions about weather patterns and product lifetimes.

Predictive analytics, then, has been around for decades. But more organizations are now turning to it to improve their bottom line and competitive advantage. Why now? Computing power has increased dramatically, analytics software is more interactive and easier to use, and the embrace of the cloud has put analytics in the hands of more people at all skill levels. As a result, predictive analytics is no longer the exclusive domain of quantitative experts, statisticians, and data scientists.

“Now analysts, line-of-business experts, and front-line workers are applying predictive analytics to improve efficiency and effectiveness,” said Jared Peterson, Senior Vice President of Engineering at SAS. “With increased competition and challenging economic conditions, organizations across industries are looking to transform data into better, faster business decisions.”

Predictive analytics has emerged as a powerful tool for organizations large and small. The ability to apply machine learning to large volumes of data and uncover hidden patterns is increasingly valuable in fields as diverse as agriculture, manufacturing, transportation, financial services, healthcare, retail, and cybersecurity.

Of course, businesses have always used data to forecast events and make business decisions. However, the volume and complexity of today’s data has changed the equation. Machine learning and artificial intelligence can spot patterns that fly below human perception and processing. As a result, predictive analytics is increasingly viewed as a competitive differentiator.

According to a report from online research service Statista, the global predictive analytics market is projected to grow from $5.29 billion in 2020 to nearly $42 billion in 2028. Organizations use predictive analytics for a wide range of purposes, but some of the leading use cases include analyzing consumer behavior, managing supply chains, cutting costs, and making strategic decisions about business operations, including financial forecasting.

A variety of vendors offer predictive analytics solutions, either as stand-alone software or built into enterprise applications, including enterprise resource planning (ERP) and customer relationship management (CRM) platforms. Some are available on the desktop and others in the cloud as software as a service (SaaS). This includes the likes of AWS, Google, IBM, Microsoft, Oracle, Salesforce, SAP, SAS, Tableau, Teradata, TIBCO, and ThoughtSpot. While these solutions vary greatly, the common denominator is to extract actionable results from data.

Also see: Top Business Intelligence Software 

How Does Predictive Analytics Work?

Predictive analytics represents a distinctly different category of analytics than data mining, business intelligence, and more conventional analytics methods. It ventures beyond basic data sorting and reporting and enters the realm of analysis through statistical methods, machine learning, and deep learning. In its most advanced form, it moves into the category of prescriptive analytics, which offers highly specific outcomes and recommendations based on different decisions or scenarios.

Essentially, algorithms tap statistical methods to parse through different types of structured and unstructured data. This may consist of historical records, such as point of sale (POS) transactions and purchase histories, as well as human or network behavior. It can also include social media, online browsing patterns, and other data.

Gartner notes that there are five primary components to predictive analytics:

  • An emphasis on prediction rather than description, classification, or clustering.
  • Rapid analysis measured in hours or days rather than the usual months of traditional data mining and BI.
  • An emphasis on the business relevance of insights.
  • A focus on ease of use, thus making tools more accessible to line-of-business users.
  • The ability to pull data from numerous sources.

Also see: Best Data Analytics Tools 

Predictive Analytics Models

A predictive analytics solution generates predictions using models and methods that often revolve around four core techniques.

Regression Models

The regression model approach is frequently referred to as “what if” analysis. It estimates the relationship between independent variables and a dependent outcome, then builds a model that can make predictions about future scenarios and impacts. Regression models can incorporate correlations (relationships) and causality (reasons). Manufacturers and retailers often use this method to predict things like demand and fashion trends.
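
As a rough illustration, a minimal “what if” regression sketch in Python using scikit-learn might look like the following. The price, ad spend, and demand figures are invented for the example; a real model would train on far more data.

```python
# A minimal "what if" regression sketch with scikit-learn.
# The training data (price, ad spend -> units sold) is invented.
import numpy as np
from sklearn.linear_model import LinearRegression

# Independent variables: [price, weekly ad spend]
X = np.array([[9.99, 500], [9.99, 1000], [12.99, 500], [12.99, 1500]])
y = np.array([1200, 1500, 900, 1300])  # dependent outcome: units sold

model = LinearRegression().fit(X, y)

# "What if" we set the price to $10.49 and spend $1,200 on ads?
print(model.predict(np.array([[10.49, 1200]])))
```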

Classification Models

With classification models, data scientists feed in labeled historical data. An algorithm then learns the patterns, including correlations, that distinguish the categories, and as new data arrives, the model assigns it a label. Fraud detection and cybersecurity typically use classification models.
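
A minimal classification sketch along these lines, again with invented data, might look like this: a random forest learns a fraud/legitimate label from historical transactions and then labels new ones.

```python
# A minimal fraud-style classification sketch with scikit-learn.
# Features and labels are invented for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Features: [transaction amount, hour of day, distance from home (km)]
X = np.array([[25, 14, 2], [3100, 3, 850], [60, 19, 5], [2400, 2, 1200]])
y = np.array([0, 1, 0, 1])  # labels from past data: 0 = legitimate, 1 = fraud

clf = RandomForestClassifier(random_state=0).fit(X, y)

# As new transactions arrive, the model assigns a label and a probability.
new_tx = np.array([[2800, 4, 900]])
print(clf.predict(new_tx), clf.predict_proba(new_tx))
```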

Clustering Models

This technique searches for common attributes and characteristics and places similar records into groups. Clustering models are ideal for finding hidden patterns in systems. The technique is frequently used to identify patterns of fraud and theft.
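
Unlike classification, clustering needs no labels. A minimal k-means sketch, using invented two-feature records, shows the idea:

```python
# A minimal clustering sketch: k-means groups similar rows with no labels.
# The two-feature records below are invented for illustration.
import numpy as np
from sklearn.cluster import KMeans

# Features: [transactions per day, average transaction size]
X = np.array([[2, 40], [3, 35], [2, 50], [40, 5], [45, 4], [38, 6]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # records with common attributes land in the same group
```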

Time-Series Models

The ability to view data over days, months, or years delivers additional perspective, which can be plugged into a predictive model. The time-series model is frequently used in healthcare and marketing for tasks ranging from optimizing staffing to predicting human behavior based on a complex set of factors.
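
One common time-series approach is an ARIMA model. The sketch below assumes the statsmodels library, and the monthly demand series is invented for illustration.

```python
# A minimal time-series forecasting sketch with statsmodels ARIMA.
# The monthly demand series is invented for illustration.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

demand = np.array([112, 118, 132, 129, 121, 135, 148, 148,
                   136, 119, 104, 118, 115, 126, 141, 135], dtype=float)

model = ARIMA(demand, order=(1, 1, 1)).fit()
print(model.forecast(steps=3))  # predicted demand for the next three periods
```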

Also see: Data Mining Techniques 

Predictive Analytics: Forward Looking

Without analytics, data is just a series of zeros and ones. With analytics come insights, better decisions, and improved outcomes. Analytics turns data into value.

“In general, if you aren’t talking about predictive analytics, you’re talking about descriptive or prescriptive analytics,” said Jerod Johnson, Senior Technology Evangelist, CData. “Descriptive analytics shows what has already happened through data mining, helping you to identify trends and patterns. Predictive analytics adds modeling and machine learning to predict possible future outcomes and probabilities.”

Mathias Golombek, CTO for Exasol, explains prescriptive analytics as a category that takes data and turns it into actionable insights and decisions. You could call it operational BI or analytics, and it can be implemented using traditional SQL or scripts in data science languages. The key is to operate as close to real time as possible and derive decisions directly from the data.

“That’s why most of those applications are written in software code and trigger actions across your business chain,” said Golombek. “One example would be to automatically optimize the prices for your e-commerce shop by crunching all kinds of relevant data about your customers, products and logistic chains.”

Predictive analytics, as its name suggests, is forward-looking. “Predictive analytics uses historical data and sophisticated models to predict what will happen next, what the optimal outcomes may be, and where to focus effort and resources,” said SAS’s Peterson.

Small, incremental improvements in a marketing campaign, for example, or in a bank’s fraud detection or a manufacturer’s predictive maintenance can lead to big savings and enhanced operations.

Golombek added that predictive analytics brings AI and ML algorithms to the data, enabling businesses to perform analytical decision-making and generate predictions. It mostly uses scripting languages such as Python or R and applies statistical models trained on existing data.

Benefits of Predictive Analytics

The benefits of predictive analytics fall into several categories:

Improved Decision-Making

As an organization accumulates data and uses it to spot patterns and trends, it can better understand which factors correlate with, and cause, certain conditions. This insight can not only be used by humans to build more effective strategies but also be embedded in automated systems. For the latter, AI and machine learning can act automatically and autonomously when a certain set of conditions occurs.

Increased Efficiency

By understanding how certain conditions lead to certain outcomes, it’s possible to eliminate intermediate steps and manual processes that consume time, money, and other resources. Predictive maintenance, for example, reduces and sometimes even eliminates the need for humans to manually test and inspect equipment. The organization knows the optimal time to service a machine or device.

Risk Reduction and Management

Predictive analytics tools can spot operational, regulatory, and cybersecurity risks. They can find gaps, vulnerabilities, and weaknesses in business plans, financial models, and IT frameworks. This aids in reducing direct costs as well as the penalties and fines that can result from a failure to abide by regulations and other controls.

Better Competitive Intelligence

Organizations that use predictive analytics well gain deeper insights into business events, trends, and likely outcomes. This information can guide investments, sourcing, research and development (R&D), sustainability initiatives, supply chain decisions, and much more.

Higher Revenues and Increased Profits

When predictive analytics is used successfully in marketing and sales, for example, it results in higher customer engagement and additional purchases. In a best-case scenario, the technology can dramatically boost brand affinity by making communications and interactions highly relevant for customers. They receive messaging at the right time and in the right place.

Also see: What is Data Visualization

Business Use Cases for Predictive Analytics

The financial services industry, with huge amounts of data and money at stake, has long embraced predictive analytics to detect and reduce fraud, measure and manage risk, maximize marketing opportunities, and retain customers. Banks of all sizes rely on predictive analytics.

Even traditionally sluggish adopters of new technology like manufacturing and government are becoming proponents of predictive analytics. It helps them to improve operations and boost resiliency in the face of economic disruption.

For example:

  • Mack Trucks and Volvo Trucks use AI and IoT analytics to predict maintenance issues in their connected vehicles. This prevents costly breakdowns.
  • Georgia-Pacific relies on AI and IoT analytics to optimize its supply chain and shipping logistics, improve manufacturing equipment efficiency, and reduce downtime.
  • The Town of Cary, NC uses predictive and IoT analytics and data from sensors in streams to predict and mitigate the effects of inland flooding. This is a problem many municipalities are experiencing with greater frequency.

Search-powered data intelligence platforms can help businesses simplify the process of mining for key metrics. By combining disparate datasets and delivering information in an easy-to-consume format through powerful visualizations and predictive analytics, businesses get unprecedented access to key insights – without requiring advanced data science skills.

In the realm of subscription services and customer support, too, organizations want to understand which users and customers are likely to upgrade or likely to churn. Customers are scored against many attributes and criteria to assess their overall account health. Any organization concerned with maintaining high-value items can build predictive models to understand which hardware and software products will fail or fall out of compliance, and when.
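
A hedged sketch of this kind of churn scoring, with invented customer attributes, might look like the following; a production system would use many more features and records.

```python
# A minimal churn-propensity sketch: score customers against several
# attributes to estimate account health. All data here is invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Attributes: [months subscribed, support tickets filed, logins last 30 days]
X = np.array([[24, 0, 20], [2, 5, 1], [18, 1, 12], [3, 4, 2], [36, 0, 25]])
y = np.array([0, 1, 0, 1, 0])  # 1 = churned in the historical data

model = LogisticRegression().fit(X, y)

# Churn score for a current customer: the predicted probability of churning.
print(model.predict_proba(np.array([[4, 3, 3]]))[:, 1])
```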

Here are additional business use cases for predictive analytics:

Resource Planning and Purchasing

Predictive analytics can provide insights into projected raw materials availability and pricing, including when to purchase raw materials and commodities. These systems operate similarly to AI systems that predict airline prices at travel websites. This type of modeling helps lower costs and optimize inventory.

Quality Control and Predictive Maintenance

Another use for the technology is in quality control and predictive maintenance. Predictive analytics can detect when products have likely become spoiled or damaged during shipment, and it can optimize maintenance and repairs for equipment ranging from medical devices to jet engines.

Marketing

Retailers, financial services companies, healthcare providers, and others are using predictive analytics to improve marketing, tweak products and services, and forecast outcomes, including sales and broader market trends.

A retailer might pick up signals that a customer is inclined to purchase a product or upgrade a service, or a healthcare company might use predictive analytics to better understand how various actions and behaviors reduce the risk of a negative outcome, including on an individual basis.

Security and Risk Management

As attacks have become more sophisticated, it’s increasingly difficult to simply blacklist and whitelist malware or attempt to block packets at the edge of the network. Behavioral-based security is an important component in developing a zero-trust security framework and locking down assets and data in a more comprehensive way. Predictive analytics tools—which harness AI and machine learning—can spot issues before they emerge as full-fledged problems.

Credit scores, which assess a buyer’s likelihood of defaulting on purchases, are a well-known example of predictive analytics. A credit score is a number generated by a predictive model that incorporates all relevant data. Other risk-related uses include insurance claims and collections.
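
To make the credit score example concrete, here is a hedged sketch of one common way a model’s predicted default probability is converted into a score, using “points to double the odds” (PDO) scaling. The base score, base odds, and PDO values below are illustrative, not any bureau’s actual parameters.

```python
# Illustrative conversion of a model's default probability into a
# credit-score-style number via points-to-double-the-odds (PDO) scaling.
# The base score, base odds, and PDO values are invented for this example.
import math

def probability_to_score(p_default: float,
                         base_score: float = 600.0,  # score assigned at base odds
                         base_odds: float = 50.0,    # 50:1 good-to-bad odds
                         pdo: float = 20.0) -> float:
    odds = (1.0 - p_default) / p_default          # good-to-bad odds
    factor = pdo / math.log(2)                    # points per doubling of odds
    offset = base_score - factor * math.log(base_odds)
    return offset + factor * math.log(odds)

print(round(probability_to_score(0.02)))  # low default risk -> higher score
print(round(probability_to_score(0.25)))  # high default risk -> lower score
```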

Fraud Detection

Predictive analytics can flag questionable transactions and spot potential fraud. Banks and credit card companies use predictive technology—increasingly systems linked to geolocation data provided by an individual’s smartphone—to determine whether a purchase is valid or questionable.

This approach offers benefits for customers, who are no longer subjected to frequent emails and text messages that ask them to call a financial services firm to validate transactions. Suspended accounts and other issues are particularly troublesome when a person is traveling overseas and has to make a long-distance call to the bank to verify transactions.

Combining multiple analytics methods can improve pattern detection and prevent criminal behavior. High-performance behavioral analytics examines all actions on a network in real time to spot abnormalities that may indicate fraud, zero-day vulnerabilities and advanced persistent threats.

Also see: Real Time Data Management Trends

Predictive Analytics Use Cases by Sector 

Healthcare

Healthcare organizations leverage predictive analytics to manage the care of patients by predicting their diagnoses and properly staffing hospitals and clinics for future infections.

Supply Chain

Supply chains use predictive analytics to better manage inventory and avoid overstocking, and adjust prices based on predicted demand and availability of component parts.

Helpdesk Centers

Helpdesk centers apply predictive models to audio recordings of calls between support staff and customers to improve agent performance, reduce call durations, gather additional customer information, and elevate the overall customer experience.

Hospitality

To make capacity management more seamless, hotels are applying predictive models to data over a certain period so that they can better forecast, plan for, and improve on guest services while simultaneously reducing operational costs through better staff, inventory, and other resource management.

Educational Institutions

Applying predictive algorithms to historical student data can identify early indicators of declining student performance, as well as the surrounding factors that may contribute to it. Additionally, predictive models applied to teacher, department, or regional metrics expand the possibilities of what data-driven insights can do to improve the performance of education systems.

HR and Recruitment

Organizations tend to hire based on an analysis of the job candidate’s interview performance, job references, network, and formal credentials, which are all historical data points. The process is outdated and subjective. “The expense of a bad hire is at least 30% of their salary, but hiring a person who isn’t the best person for the job also presents significant opportunity costs,” said Satish Kumar, CEO of Glider AI. “A predictive analysis of talent quality is the future; it eliminates hiring based on formal credentials, with a focus on skill and cultural fit, while removing natural hiring biases.”

Also see: Best Machine Learning Platforms 

Developing Predictive Analytics Capabilities

More advanced predictive analytics capabilities are also taking shape. For example, organizations are turning to digital twins to simulate complex models and understand how different factors impact real world results. Moreover, wineries are using AI with data to understand how climate change impacts their grape crops, while some vintners are beginning to use these methods to identify land that will be ideally suited for viticulture as climate change unfolds.

Meanwhile, gaming companies use predictive analytics algorithms to render 3D graphics faster by eliminating the need to generate certain pixels on GPUs. The system performs extrapolations, and the technique saves computing cycles and cuts energy consumption.

In fact, the latter example demonstrates how data scientists can combine predictive analytics with deep learning techniques. Neural networks can digest huge volumes of data and spot obscure patterns and trends in video, audio, text, and other forms of unstructured data.

For instance, voice recognition or facial recognition might analyze the tone or expression a person displays, and a system then responds accordingly. An application like Google Mail can predict the next word or phrase a person is likely to use and present it as a choice, and OpenAI’s ChatGPT constructs entire paragraphs on almost any topic based on text input.

Challenges and Limitations of Predictive Analytics

Although predictive analytics offers many benefits, it isn’t without caveats and potential pitfalls. There are several factors that organizations must attend to in order to use the technology successfully.

The Role of Predictive Analytics and How It Generates Value

Although predictive analytics delivers visibility into the future, it isn’t a crystal ball. Some factors, such as stock market performance, are far too complex to predict. In other cases, numerous other factors that intersect with predictive analytics impact the results.

For example, a marketing group may possess excellent data about customer behavior but fail in a campaign because it has developed subpar content, taken a haphazard approach, or used the predictive data poorly.

The Need for Accurate and Up-to-Date Data

When organizations use old or irrelevant data, they wind up with wildly inaccurate results. To extract value from data, it must be current (in many cases real-time), accurate, and assembled in the right way. This usually requires data scientists along with top-notch predictive analytics and machine learning tools.

The Need for Clear Goals and Objectives

Predictive analytics in the absence of a clear strategy and goals will inevitably result in failure. Building a framework for the use of predictive analytics requires input from business leaders and, in many cases, various departments and groups. The most successful implementations span people, processes, and technologies.

This framework makes it possible to remap workflows and drive strategic, financial, and other gains through an enterprise and beyond.

The Need for Data Science Expertise

Predictive analytics tools are often designed primarily for data scientists. Even those intended for business analysts and others can require some level of technical knowledge. This may include programming skills in languages such as Python or R, or expertise in statistical modeling methods. There are also a variety of technical issues related to data preparation and cleansing, training algorithms, dealing with data inconsistencies, and deploying models in the real world.

Also see: Top Digital Transformation Companies

Future of Predictive Analytics

In our increasingly digitalized world, data volumes are expected to almost double in size from 2022 to 2026, according to IDC. Therefore, the above use cases will probably lose their dominance as predictive analytics spreads to other fields.

“Companies across every industry stand to benefit from predictive analytics capabilities and advanced data management tools,” said Golombek. “As we move into the new year, we expect an uptick in the use of predictive and prescriptive analytics to drive continuous process improvements and data-driven decision-making — as well as help companies sell the right products to the right clients and facilitate better matching of resources and smarter recognition of trends.”

Johnson believes the future is data-driven, and access to data is the key to success for predictive analytics. The increase in accessible computational power and advancements in AI and machine learning technologies allow any business to utilize predictive analytics – not just organizations and industries with historically deep pockets.

“Utilizing real-time, no-code data connectivity solutions can further democratize analytics by allowing business users to build holistic analytics processes across multiple applications and systems,” said Johnson.

Predictive analytics will continue to evolve. As more and more sensors and IoT elements are plugged into IT frameworks, larger volumes of data—along with more granular data—will become more prevalent. It’s likely that future systems will deliver far more detailed insights into consumer behavior, health factors, spending patterns, and even sustainability data used for environmental, social, and governance (ESG) reporting. This includes far more detailed carbon accounting methods.

In addition, data visualization models are likely to become more elaborate and intuitive, including the use of more advanced 3D animations and visual simulations. And with no-code and low-code frameworks, predictive analytics solutions are likely to become easier to use. As various machine learning, deep learning, and AI frameworks improve, predictive analytics will almost certainly become more accurate and dependable for longer-range predictions and projections.

In the end, one thing is entirely clear: Predictive analytics is an important part of today’s business world, and the use of the technology will only increase. The ability to spot patterns, trends, and opportunities is a powerful tool for organizations of all shapes and sizes. It’s a key to unlocking value and future gains.

Also see: Digital Transformation Guide: Definition, Types & Strategy

Drew Robb contributed reporting for this article. 

ChatGPT: Understanding the ChatGPT AI Chatbot

ChatGPT (the name derives from Generative Pre-trained Transformer) is an AI chatbot that uses advanced natural language processing (NLP) to engage in realistic conversations with humans.

ChatGPT can generate articles, fictional stories, poems and even computer code. ChatGPT also can answer questions, engage in conversations and, in some cases, deliver detailed responses to highly specific questions and queries.

Harvard Business Review has described ChatGPT as a “tipping point for AI.” When a user types a question, command, or comment into a dialog box in the ChatGPT engine, it delivers a near-immediate text-based response in the same language.

One thing that sets ChatGPT apart from other chatbots and NLP systems is its ultrarealistic conversational skill, including an ability to ask follow-up questions, admit mistakes, and point out nuances about a topic. In many cases, it’s nearly impossible to tell that the responses come from a computer-generated bot rather than a person. Grammatical and syntax errors are rare, and written constructions are logical and articulate.

ChatGPT was introduced in November 2022 and gained over one million users within a week. It is currently in a research preview phase that allows individuals and businesses to use it at no charge.

This conversational AI tool is part of a growing wave of chatbots and personal assistants that harness natural language processing so that humans can interact with computers in a more natural and intuitive way. However, the platform isn’t without concerns. Some observers worry about students and others using ChatGPT to generate essays and reports, while many fear its potential impact on fields such as journalism and technical writing.

Also see: What is Artificial Intelligence 

ChatGPT and OpenAI

ChatGPT was developed by OpenAI, a company that develops artificial intelligence (AI) and natural language tools.

OpenAI’s stated aim is to develop AI tools that “benefit all of humanity.” The firm was founded in 2015 as a non-profit entity by leading experts in the field, including entrepreneur Sam Altman (CEO) and technologist Greg Brockman (CTO).

OpenAI started with US $1 billion in venture capital funding. Then, in 2019, Microsoft invested US $1 billion, and the firm became a “capped-profit” company that limits investor returns to 100x any investment. If returns reach that cap, any additional profits are returned to the original non-profit entity.

OpenAI introduced the third generation of its NLP language model, Generative Pre-Trained Transformer 3 (GPT-3), in June 2020. The platform includes an API that is available for commercial purchase. GPT-3 made it possible to answer questions, generate computer code in languages such as Python, and produce text in different spoken languages.

OpenAI’s ChatGPT is a more advanced publicly available tool based on GPT-3.5. In addition, OpenAI offers a text-to-image platform called DALL-E, which generates realistic images from natural language input.

Former Google, Tesla and Leap Motion executives who are leading experts on artificial intelligence and machine learning are part of OpenAI’s leadership team and technical workforce.

Also see: Top AI Software 

How Does ChatGPT Work?

The goal of developing natural language systems that operate in a highly convincing way has been taking shape for the better part of a century. Films such as 2001: A Space Odyssey and Her have explored the idea of machines that can communicate in convincing—what some describe as meaningful and even sentient—ways.

Over the last decade, more powerful computing frameworks, including graphical processing units (GPUs), along with markedly improved algorithms, have fueled enormous advances in deep learning and NLP.

OpenAI originally built the GPT-3.5 language model from web content and other publicly available sources. It then used supervised machine learning techniques to build ChatGPT. Human trainers played the role of both the user and the AI agent—generating a variety of responses to any given input and then evaluating and ranking them from best to worst. This data was used to train a reward model.

An OpenAI reinforcement learning algorithm called Proximal Policy Optimization (PPO), which relies on a technique similar to stochastic gradient descent, then fine-tuned the model. The result was fast performance with reduced computational requirements for operating the NLP framework.
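
To make the reward-model step concrete, here is a minimal, hypothetical PyTorch sketch of the pairwise ranking loss commonly used to train reward models from human rankings. The tiny linear model and random embeddings are stand-ins for a large transformer and real response encodings; this is not OpenAI’s actual code.

```python
# A toy reward model trained on human preference pairs: the loss pushes
# the reward of the preferred (chosen) response above the rejected one.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyRewardModel(nn.Module):
    def __init__(self, dim: int = 16):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # maps a response embedding to a scalar reward

    def forward(self, embedding: torch.Tensor) -> torch.Tensor:
        return self.score(embedding).squeeze(-1)

model = TinyRewardModel()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Hypothetical embeddings of two responses to the same prompt: one the
# human trainers ranked higher (chosen), one ranked lower (rejected).
chosen = torch.randn(8, 16)
rejected = torch.randn(8, 16)

# Pairwise ranking loss: -log sigmoid(r_chosen - r_rejected).
loss = -F.logsigmoid(model(chosen) - model(rejected)).mean()
loss.backward()
optimizer.step()
```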

OpenAI used the Azure AI supercomputer infrastructure to tackle the training process. It completed the task in late 2021. ChatGPT incorporates a stateful approach, meaning that it can use previous inputs from the same session to generate far more accurate and contextually relevant results. It incorporates a moderation filter that screens racist, sexist, biased, illegal and offensive input.

However, the system has a limited ability to generate results for events that occurred after its primary training phase. As a result, information gaps are sometimes visible, and many recent events aren’t reflected in ChatGPT. Some information is also outdated. The system also lacks information about certain people, including celebrities.

The ChatGPT platform currently has some limitations, according to OpenAI. These include sometimes nonsensical answers, a tendency to be verbose, and an inability to ask appropriate clarifying questions when a user enters an ambiguous query or statement. In some cases, changing a word or two can dramatically alter the outcome within ChatGPT.

However, OpenAI monitors responses and feedback using an external content filter. This helps the company flag false positives and false negatives (and other issues) along with potentially harmful output. The information is used to update and improve the AI model.

ChatGPT, NLP and AI Chatbots

Amazon, Google, Microsoft, Baidu, GitHub, Meta, Apple, IBM, Nvidia, and others have also constructed natural language processing frameworks that deliver different features and capabilities. These include digital assistants, predictive coding tools, and chatbots.

However, OpenAI’s ChatGPT is currently considered by many to be the most advanced NLP chatbot engine. It typically delivers remarkably accurate and engaging responses to wide-ranging questions and queries about technology, science, business, history, sports, literature, culture, art and much more.

Also see: What is Data Visualization

ChatGPT and Business

The ability to generate realistic and easy-to-understand text could fundamentally change business. Among other things, it could help companies develop websites, reports, marketing materials, human resources handbooks and many other text-based assets. It could aid in tasks related to complex data analysis.

In addition, customer support and self-help could change drastically with systems that deliver accurate insights and fixes for problems—including support across multiple languages. AI chatbots could also aid law firms, medical professionals and many others.

As labor shortages accelerate and the need for automation grows, tools such as ChatGPT could help fill the gap—particularly for more mundane tasks and content that doesn’t require a high level of creativity or insight. In many cases, AI chatbots with NLP capabilities could not only speed content creation but also help organizations achieve greater flexibility, including one-to-one content personalization.

Likewise, ChatGPT could help schools, non-profit organizations and government agencies generate written materials and deliver technical support with limited budgets and staffing.

Here’s an example of how ChatGPT responded to the question: What are the business benefits of chatbots?

Chatbots are computer programs that use natural language processing and other artificial intelligence technologies to communicate with humans in a conversational manner. Chatbots are often used in customer service and other business contexts to help companies interact with their customers more efficiently and effectively. Some of the potential benefits of chatbots for businesses include:

Improved customer service: Chatbots can handle a large volume of customer inquiries and provide timely and accurate responses, which can help improve customer satisfaction and reduce the workload for human customer service representatives.

Reduced costs: Chatbots can automate many routine tasks, such as answering frequently asked questions, which can save businesses time and money.

Increased sales: Chatbots can provide personalized product recommendations and other information to customers, which can help businesses increase sales and revenue.

Improved customer engagement: Chatbots can help businesses maintain regular communication with their customers and provide personalized, relevant information to keep them engaged with the brand. 

These are just some of the potential benefits of chatbots for businesses. The exact benefits will depend on the specific chatbot and how it is used by the business. If you would like to learn more, I suggest looking up additional information about chatbots and their potential benefits for businesses.

Also see: AI vs. ML: Artificial Intelligence and Machine Learning

Concerns About ChatGPT

The ChatGPT platform is currently in a beta test phase. Although it has received mostly favorable reactions, the tool isn’t without issues and critics. In some cases—as a result of using statistical methods rather than creating a way to understand the meaning of actual language—it generates simplistic, incorrect, disturbing and even shocking responses. It also sometimes flunks basic math problems. Worse, the system can be used to generate phishing emails free of errors. And it has produced content that is racist or sexist when users applied tricks to bypass the system’s filters.

For now, OpenAI describes the ChatGPT platform as a tool designed to complement humans rather than replace them. For example, it cannot yet generate footnotes and, while its answers are often accurate and engaging, they sometimes don’t represent the complete picture and they aren’t always synced with the specific messaging that a marketing team or other business function might require.

In a worst-case scenario, the AI engine produces text that’s well-written but completely off target or wrong. Thus, humans might plug deceptive or incorrect ChatGPT text into a document or use it to intentionally deceive and manipulate readers.

Other concerns exist. One revolves around the possibility that students will be able to generate high quality essays and reports without actually researching or writing them. Another is that the technology could lead to the end of many jobs, particularly in fields such as journalism, scriptwriting, software development, technical support and customer service. The AI platform could also deliver a more sophisticated framework for web searches, potentially displacing search engines like Google and Bing.

Finally, some critics have complained about the platform’s moderation of speech and content, arguing that it should not be regulated at all.

Also see: Best Data Analytics Tools 

Future of ChatGPT

It’s highly likely that within a few years the ChatGPT platform and other AI-based NLP tools will play a major role in the business world—and in everyday life. They could enhance and perhaps supplant today’s search engines, redefine customer service and technical support functions, and introduce more advanced ways to generate written content. They will also lead to advances in digital assistants such as Siri and Alexa.

Although some observers have predicted that natural language processing could eliminate many jobs, the technology is more likely to play a niche but expanding role in eliminating routine tasks and non-creative functions. For example, ChatGPT or a similar bot might generate text or computer code, but a human would then review it and possibly enhance it. In many cases, businesses would benefit by automating tasks and redeploying humans for more strategic functions.

While it’s tempting to consider chatbots and other NLP frameworks sentient—that is, able to display human-like feelings and sensations—linguistics experts and computer scientists caution that these systems simply mirror language and deliver convincing responses to various forms of input.

Nevertheless, AI chatbots and other NLP systems are rapidly redefining and rewiring the way humans and machines interact. In the coming years, ChatGPT and others will enable new products, services and features. Businesses leaders should monitor the technology, experiment with it and be ready to move forward when the right opportunity appears.

Also see: Best Machine Learning Platforms 

What Is Natural Language Processing?

Natural language processing (NLP) is a branch of artificial intelligence (AI) that focuses on enabling computers to work with speech and text in a manner similar to human understanding. This area of computer science relies on computational linguistics—typically based on statistical and mathematical methods—to model human language use.

NLP plays an increasingly prominent role in computing—and in the everyday lives of humans. Smart assistants such as Apple’s Siri, Amazon’s Alexa and Microsoft’s Cortana are examples of systems that use NLP.

In addition, various other tools rely on natural language processing. Among them: navigation systems in automobiles; speech-to-text transcription systems such as Otter and Rev; chatbots; and voice recognition systems used for customer support. In fact, NLP appears in a rapidly expanding universe of applications, tools, systems and technologies.

In every instance, the goal is to simplify the interface between humans and machines. In many cases, the ability to speak to a system or have it recognize written input is the simplest and most straightforward way to accomplish a task.

While computers cannot “understand” language the same way humans do, natural language technologies are increasingly adept at recognizing the context and meaning of phrases and words and transforming them into appropriate responses—and actions.

Also see: Top Natural Language Processing Companies

Natural Language Processing: A Brief History

The idea of machines understanding human speech extends back to early science fiction novels. However, the field of natural language processing began to take shape in the 1950s, after computing pioneer Alan Turing published an article titled “Computing Machinery and Intelligence.” It introduced the Turing Test, which provided a basic way to gauge a computer’s natural language abilities.

During the ensuing decade, researchers experimented with computers translating novels and other documents across spoken languages, though the process was extremely slow and prone to errors. In the 1960s, MIT professor Joseph Weizenbaum developed ELIZA, which mimicked human speech patterns remarkably well. Over the next quarter century, the field continued to evolve. As computing systems became more powerful in the 1990s, researchers began to achieve notable advances using statistical modeling methods.

Dictation and language translation software began to mature in the 1990s. However, early systems required training and were slow, cumbersome to use, and prone to errors. It wasn’t until the introduction of supervised and unsupervised machine learning in the early 2000s, and then the introduction of neural nets around 2010, that the field began to advance in a significant way.

With these developments, deep learning systems were able to digest massive volumes of text and other data and process them using far more advanced language modeling methods. The resulting algorithms became far more accurate and useful.

Also see: Top AI Software 

How Does Natural Language Processing Work?

Early NLP systems relied on hard-coded rules, dictionary lookups, and statistical methods to do their work. They frequently supported basic decision-tree models. Eventually, machine learning automated many of these tasks while improving results.

Today’s natural language processing frameworks use far more advanced—and precise—language modeling techniques. Most of these methods rely on deep neural networks, most notably transformer architectures, to study language patterns and develop probability-based outcomes.

For example, a method called word vectors applies complex mathematical models to weight and relate words, phrases, and constructs. Another method, Recognizing Textual Entailment (RTE), classifies relationships between words and sentences through the lens of entailment, contradiction, or neutrality. For instance, the premise “a dog has paws” entails that “dogs have legs” but contradicts “dogs have wings” while remaining neutral to “all dogs are happy.”

A key part of NLP is word embedding. It refers to establishing numerical weightings for words in a specific context. The process is necessary because many words and phrases can mean different things in different contexts (go to a club, belong to a club or swing a club). Words can also be pronounced the same way but mean different things (through, threw or witch, which). There’s also a need to understand idiomatic phrases that do not make literal sense, such as “You are the apple of my eye” or “it doesn’t cut the mustard.”
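
As a rough illustration, here is a minimal NumPy sketch of word embeddings and cosine similarity. The 4-dimensional vectors are invented for the example; real models learn embeddings with hundreds of dimensions from large corpora, and contextual models assign the same surface word different vectors in different contexts.

```python
# Toy word embeddings: the three senses of "club" get different vectors,
# and cosine similarity measures how related two vectors are.
import numpy as np

embeddings = {
    "club_venue": np.array([0.9, 0.1, 0.0, 0.2]),  # "go to a club"
    "club_group": np.array([0.2, 0.8, 0.1, 0.1]),  # "belong to a club"
    "club_stick": np.array([0.0, 0.1, 0.9, 0.3]),  # "swing a club"
}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity of two word vectors: near 1.0 = related, near 0.0 = unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(embeddings["club_venue"], embeddings["club_group"]))
print(cosine(embeddings["club_venue"], embeddings["club_stick"]))
```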

Today’s models are trained on enormous volumes of language data—in some cases several hundred gigabytes of books, magazine articles, websites, technical manuals, emails, song lyrics, stage plays, scripts, and publicly available sources such as Wikipedia. As deep learning systems parse through millions or even billions of combinations—relying on hundreds of thousands of CPU or GPU cores—they analyze patterns, connect the dots, and learn the semantic properties of words and phrases.

It’s also often necessary to refine natural language processing systems for specific tasks, such as a chatbot or a smart speaker. But even after this takes place, a natural language processing system may not always work as billed. Even the best NLPs make errors. They can encounter problems when people misspell or mispronounce words and they sometimes misunderstand intent and translate phrases incorrectly. In some cases, these errors can be glaring—or even catastrophic.

Today, prominent natural language models are available under commercial licenses. These include OpenAI Codex, Google’s LaMDA, IBM Watson, and software development tools such as CodeWhisperer and Copilot. In addition, some organizations build their own proprietary models.

How is Natural Language Processing Used?

There are a growing array of uses for natural language processing. These include:

Conversational AI. The ability of computers to recognize words introduces a variety of applications and tools. Personal assistants like Siri, Alexa and Microsoft Cortana are prominent examples of conversational AI. They allow humans to make a call from a mobile phone while driving or switch lights on or off in a smart home. Increasingly, these systems understand intent and act accordingly. For example, chatbots can respond to human voice or text input with responses that seem as if they came from another person. What’s more, these systems use machine learning to constantly improve.

Machine translation. There’s a growing use of NLP for machine translation tasks. These include language translations that replace words in one language for another (English to Spanish or French to Japanese, for example). Google Translate and DeepL are examples of this technology. But machine translation can also take other forms. For example, NLP can convert spoken words—either in the form of a recording or live dictation—into subtitles on a TV show or a transcript from a Zoom or Microsoft Teams meeting. Yet while these systems are increasingly accurate and valuable, they continue to generate some errors.

Sentiment analysis. NLP has the ability to parse through unstructured data—social media analysis is a prime example—extract common word and phrasing patterns and transform this data into a guidepost for how social media and online conversations are trending. This capability is also valuable for understanding product reviews, the effectiveness of advertising campaigns, how people are reacting to news and other events, and various other purposes. Sentiment analysis finds things that might otherwise evade human detection.

Content analysis. Another use case for NLP is making sense of complex systems. For example, the technology can digest huge volumes of text data and research databases and create summaries or abstracts that relate to the most pertinent and salient content. Similarly, content analysis can be used for cybersecurity, including spam detection. These systems can reduce or eliminate the need for manual human involvement.

Text and image generation. A rapidly emerging part of natural language processing focuses on text, image, and even music generation. Already, some news organizations produce short articles using natural language processing. Meanwhile, OpenAI has developed a tool that generates text and computer code through a natural language interface. Another OpenAI tool, DALL-E 2, creates high-quality images through an NLP interface. Type the words “black cat under a stairway” and an image appears. GitHub Copilot and Amazon CodeWhisperer can auto-complete and auto-generate computer code through natural language.

Also see: Top Data Visualization Tools 

NLP Business Use Cases

The use of NLP is increasingly common in the business world. Among the top use cases:

Chatbots and voice interaction systems. Retailers, health care providers and others increasingly rely on chatbots to interact with customers, answer basic questions and route customers to other online resources. These systems can also connect a customer to a live agent, when necessary. Voice systems allow customers to verbally say what they need rather than push buttons on the phone.

Transcription. As organizations shift to virtual meetings on Zoom and Microsoft Teams, there’s often a need for a transcript of the conversation. Services such as Otter and Rev deliver highly accurate transcripts—and they’re often able to understand foreign accents better than humans. In addition, journalists, attorneys, medical professionals and others require transcripts of audio recordings. NLP can deliver results from dictation and recordings within seconds or minutes.

International translation. NLP has revolutionized interactions between businesses in different countries. While the need for translators hasn’t disappeared, it’s now easy to convert documents from one language to another. This has simplified interactions and business processes for global companies while streamlining global trade.

Scoring systems. Natural language processing is used by financial institutions, insurance companies, and others to extract elements from and analyze documents, data, claims, and other text-based resources. The same technology can also aid in fraud detection, financial auditing, resume evaluation, and spam detection. In fact, the latter represents a type of supervised machine learning that connects to NLP.

Market intelligence and sentiment analysis. Marketers and others increasingly rely on NLP to deliver market intelligence and sentiment trends. Semantic engines scrape content from blogs, news sites, social media sources and other sites in order to detect trends, attitudes and actual behaviors. Similarly, NLP can help organizations understand website behavior, such as search terms that identify common problems and how people use an e-commerce site. This data can lead to design and usability changes.

Software development. A growing trend is the use of natural language for software coding. Low-code and no-code environments can transform spoken and written requests into actual lines of software code. Systems such as Amazon’s CodeWhisperer and GitHub’s Copilot include predictive capabilities that autofill code in much the same way that Google Mail predicts what a person will type next. They can also pull information from an integrated development environment (IDE) and produce several lines of code at a time.

Text and image generation. OpenAI’s language models can generate entire documents based on a basic request. This makes it possible to generate poems, articles, and other text. OpenAI’s DALL-E 2 generates photorealistic images and art through natural language input. This can aid designers, artists, and others.

Also see: Best Data Analytics Tools 

What Ethical Concerns Exist for NLP?

Concerns about natural language processing are heavily centered on the accuracy of models and ensuring that bias doesn’t occur. Many of these deep learning algorithms are so-called “black boxes,” meaning that there’s no way to understand how the underlying model works and whether it is free of biases that could affect critical decisions about lending, healthcare and more.

There is also debate about whether these systems are “sentient.” The question of whether AI can actually think and feel like a human has been explored in films such as 2001: A Space Odyssey and Star Wars. It reappeared in 2022, when former Google engineer Blake Lemoine published human-to-machine discussions with LaMDA. Lemoine claimed that the system had gained sentience. However, numerous linguistics experts and computer scientists countered that a silicon-based system cannot think and feel the way humans do. It merely parrots language in a highly convincing way.

In fact, researchers who have experimented with NLP systems have been able to generate egregious and obvious errors by inputting certain words and phrases. Getting to 100% accuracy in NLP is nearly impossible because of the nearly infinite number of word and conceptual combinations in any given language.

Another issue is ownership of content—especially when copyrighted material is fed into the deep learning model. Because many of these systems are built from publicly available sources scraped from the Internet, questions can arise about who actually owns the model or material, or whether contributors should be compensated. This has so far resulted in a handful of lawsuits along with broader ethical questions about how models should be developed and trained.

Also see: AI vs. ML: Artificial Intelligence and Machine Learning

What Role Will NLP Play in the Future?

There’s no question that natural language processing will play a prominent role in future business and personal interactions. Personal assistants, chatbots and other tools will continue to advance. This will likely translate into systems that understand more complex language patterns and deliver automated but accurate technical support or instructions for assembling or repairing a product.

NLP will also lead to more advanced analysis of medical data. For example, a doctor might input patient symptoms and a database using NLP would cross-check them with the latest medical literature. Or a consumer might visit a travel site and say where she wants to go on vacation and what she wants to do. The site would then deliver highly customized suggestions and recommendations, based on data from past trips and saved preferences.

For now, business leaders should follow the natural language processing space—and continue to explore how the technology can improve products, tools, systems and services. The ability for humans to interact with machines on their own terms simplifies many tasks. It also adds value to business relationships.

Also see: The Future of Artificial Intelligence

What Are Neural Networks?

A key element in artificial intelligence, artificial neural networks (ANNs) operate in a manner similar to the human brain. They mimic the way actual biological neurons function in order to find answers for complex computing questions and challenges. The method, which can include millions of artificial neurons, falls under the umbrella of machine learning. It produces mathematical algorithms that are widely used to recognize patterns and solve complex problems in science and business.

ANNs, which are also referred to as simulated neural networks (SNNs), consist of interconnected nodes that signal one another. Typically, a neural network comprises an input layer, one or more hidden layers, and an output layer. Individual nodes are assigned a weight and a threshold. When a node’s threshold is exceeded, it activates and passes data along to the next layer. If the data doesn’t trigger a response, the system typically ignores it.

A neural network uses training data to recognize complex and often hidden patterns and develop algorithms. Over time and with more data, its accuracy improves. As a result, this machine learning technique produces computer algorithms that are valuable for an array of tasks, such as speech recognition, language translation, image recognition, robotics behavior and many other areas of artificial intelligence (AI).
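
The structure just described can be sketched in a few lines of Python. The following is a minimal, illustrative feedforward network with random weights; a real network would learn its weights from training data rather than use random stand-ins.

```python
# A tiny feedforward network in NumPy: 3 inputs -> 4 hidden nodes -> 1 output.
# Weights are random placeholders; training would adjust them from data.
import numpy as np

rng = np.random.default_rng(0)

w_hidden = rng.normal(size=(3, 4))  # weights into the hidden layer
b_hidden = np.zeros(4)              # per-node thresholds (biases)
w_out = rng.normal(size=(4, 1))     # weights into the output layer
b_out = np.zeros(1)

def relu(x):
    # A node passes data onward only when its weighted input clears the threshold.
    return np.maximum(0.0, x)

def forward(x: np.ndarray) -> np.ndarray:
    hidden = relu(x @ w_hidden + b_hidden)  # hidden-layer activations
    return hidden @ w_out + b_out           # output-layer prediction

print(forward(np.array([0.5, -1.0, 2.0])))
```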

Deep learning systems—an advanced form of machine learning—are increasingly built on neural networks. They’re called “deep learning” because they contain large numbers of neural layers. Using different approaches, data scientists can perform complex tasks that lie outside the ability or scope of humans. Moreover, these systems can continually advance and evolve as new data appears.

Also see: What is Artificial Intelligence 

How and Why are Neural Networks Used?

The growing volume of data generated by computers contains answers for numerous questions and problems. Some industry sources report that upwards of 2.5 quintillion bytes of data are generated daily, and more than 100 zettabytes of data exist globally. This includes structured and unstructured data from databases, sales management systems, sensors, financial systems, blogs, social media, audio and video, text and logs, and spreadsheet files, among others.

As a result, deep learning systems based on neural nets are widely used by governments, businesses, researchers and others to mine this data. Consulting firm Gartner reports that more than 80% of data scientists now use ANNs, and natural language generation fueled by deep learning is now part of 90% of modern BI and analytics platforms. Common areas of use include life sciences, manufacturing, banking, retail and the public sector.

For example, healthcare companies use neural nets to handle tasks such as predictive diagnostics, biomedical imaging and health monitoring. Financial services firms rely on them to detect fraud, oversee credit analysis and automate advisory services. Retailers tap deep learning for marketing, chatbots and the augmented reality that's increasingly used in smartphone apps. Manufacturers rely on the technique for machine vision systems that spot defects and safety violations, and also to automate supply chains and forecast demand.

Another common use case is smart city initiatives. For instance, neural nets might ingest image data from wireless cameras, and the machine learning system subsequently learns how to adapt traffic signals and other systems to optimize traffic flow in real time. This approach is far less expensive than installing sensors in pavement. These systems—often incorporating connected Internet of Things (IoT) sensors and devices—can also improve the performance of energy systems, provide other advanced automation and enhance security features.

Also see: AI vs. ML: Artificial Intelligence and Machine Learning

A Brief History of Artificial Neural Nets

The origins of artificial neural networks date back to 1943. At that time, Warren McCulloch and Walter Pitts, who both worked in the fields of neuroscience and computing, introduced a computational model that used algorithms called threshold logic. The model relies on a logical gate, or basic building block—in this case an artificial neuron—to build a larger computational framework.

Over the 1940s and 1950s, researchers continued to explore artificial neural network models. In 1958, Frank Rosenblatt implemented the perceptron, an algorithm for supervised learning, in an actual computing device. By the 1960s, the first functional models with multiple layers began to emerge, and in 1975, researcher Kunihiko Fukushima developed the first multilayered neural network. Modern machine learning capabilities began to emerge in the 1980s, and dramatic gains in computing power, along with advances in the field over the following three decades, have produced far more capable systems.

One of the key developments in the field was the adoption of graphics processing units (GPUs) for deep learning around 2010. These processors deliver significant speed and performance advantages, including the ability to reduce errors through greater fine-tuning across the layers of a model. In practice, GPUs enable training at a scale and speed that would otherwise be impractical for deep learning. Today's neural nets use several techniques and models to tackle increasingly complex tasks that in some cases exceed human capabilities.

Also see: Top AI Software 

How Do Neural Nets Work?

The basis of a neural net is an artificial neuron. Neurons are placed into three different types of layers:

  • Input layer: The input layer ingests raw data and converts it into a numerical form the network can process.
  • Hidden layer: One or more hidden layers perform mathematical computations on the data using non-linear processing techniques. This “weighting” process develops a hierarchical mathematical framework.
  • Output layer: The output layer produces the final result, such as a classification or prediction.

One way to think about neural networks is that each individual node operates its own linear regression model, which includes input data, weights, a bias (or threshold), and an output, according to IBM. As data enters the input layer, the system assigns weights that determine the importance of each variable. These weighted inputs feed a mathematical model. When a node's output meets the desired critical threshold, the node activates, relaying the data to the next node in the network. This process continues until the data reaches the output layer.
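
To make this concrete, here is a minimal Python sketch of the single-node computation described above: weighted inputs plus a bias, passed through a threshold (step) activation. The numbers are illustrative assumptions, not drawn from any real model.

import numpy as np

def neuron(inputs, weights, bias, threshold=0.0):
    # Weighted sum of inputs plus bias, then a step activation:
    # the node "fires" only when the sum crosses the threshold.
    weighted_sum = np.dot(inputs, weights) + bias
    return 1 if weighted_sum > threshold else 0

x = np.array([0.5, 0.3, 0.9])   # input values
w = np.array([0.4, -0.2, 0.7])  # learned weights (illustrative)
print(neuron(x, w, bias=-0.5))  # 1: this node activates and relays data onward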

In some cases, data flows through an artificial neural network in one direction, from the input phase to the output phase. However, it’s possible to use other techniques, such as backpropagation, to study data from output back to input.

This approach makes it possible to improve error detection and reduce biased or inaccurate results. Using this technique, it's possible to change weightings and adjust a deep learning model as needed.

What Types of Artificial Neural Nets Exist?

While neural nets loosely reflect the way the human brain works, they have become more specialized over time. Today, four primary types of artificial neural nets exist. Each has advantages and disadvantages based on the intended purpose and real-world use case. These include:

  • Convolutional neural networks (CNNs): These machine learning systems are commonly used for machine vision, object detection, image classification and certain types of forecasting. A CNN incorporates five distinct layer types: input, convolution, pooling, fully connected and output (a brief code sketch follows this list). These systems require enormous processing power—typically supplied by GPUs.
  • Recurrent neural networks (RNNs): This type of ANN framework typically uses time-series data and other sequential data to produce probabilistic models. In other words, the inputs aren't independent of one another. This makes it ideal for tasks such as natural language processing, speech recognition, sentiment analysis and text-related applications.
  • Feedforward neural networks (FNNs): Unlike recurrent neural networks, FNNs do not use any type of cycle or loop to process data and develop a model. Instead, data flows in one direction only: forward from the input nodes, through any hidden nodes, to the output nodes. FNNs are often used for supervised learning tasks such as digital marketing and sales.
  • Autoencoder neural networks: These unsupervised machine learning systems, sometimes referred to as autoassociators, ingest unlabeled inputs, encode the data, and then decode it while attempting to pinpoint and extract the most valuable information. The method is designed to reduce data noise. A popular use for this methodology is detecting fraud.
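
As a concrete illustration of the five CNN layer types named above, here is a minimal sketch using the Keras API from TensorFlow. The layer sizes and input shape are illustrative assumptions, not a production architecture.

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),          # input layer: 28x28 grayscale images
    layers.Conv2D(16, 3, activation="relu"),  # convolution layer: learns local filters
    layers.MaxPooling2D(),                    # pooling layer: downsamples feature maps
    layers.Flatten(),
    layers.Dense(32, activation="relu"),      # fully connected layer
    layers.Dense(10, activation="softmax"),   # output layer: class probabilities
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()  # prints the five-layer stack described above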

How Do Data Scientists Use Neural Nets for Training?

After a data scientist has identified a use case for an ANN and selected a specific approach, the next step is to put the system into motion. There are two basic approaches that data scientists use: supervised learning and unsupervised learning.

Supervised learning

As the name implies, a human oversees this type of machine learning system. The operator labels datasets to help train the algorithm so that it can classify data and predict outcomes accurately.

For example, a human might label photos of different types of cats—lions, tigers, jaguars, leopards, mountain lions, bobcats, ocelots and housecats—so a system can learn to differentiate them. Casual users may handle this task unknowingly when they tag email as spam, for instance. Supervised learning often plays a role in object recognition, predictive analytics and sentiment analysis.
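
A hedged sketch of that workflow, using scikit-learn with toy measurements standing in for labeled cat photos: a human supplies the labels, and the classifier learns to assign them to new examples. The feature values are illustrative assumptions.

from sklearn.neighbors import KNeighborsClassifier

X_train = [[9.0, 250], [8.5, 220], [0.3, 4], [0.25, 3.5]]  # e.g., length (ft), weight (lb)
y_train = ["tiger", "tiger", "housecat", "housecat"]        # human-supplied labels

clf = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
print(clf.predict([[8.0, 200]]))  # ['tiger'] — classified from the labeled examples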

Unsupervised learning

These systems learn from data that hasn’t been classified or tagged by a human. Simply put, the system finds the patterns and builds an algorithmic model on its own—hence the name “unsupervised learning.”

In some cases, data scientists also use semi-supervised learning, which combines the two approaches, and reinforcement learning, which builds results using a computer program that receives positive and negative “rewards” as it pursues a model.

Also see: The History of Artificial Intelligence 

How Are Artificial Neural Nets Impacting the Enterprise?

Among the more common uses for ANNs is predictive analytics. A growing number of enterprise software platforms—including cloud frameworks—include machine learning, deep learning and other tools that help build advanced AI models.

This type of predictive analytics is often used for tasks such as delivering more targeted content to customers, understanding credit limits and approvals, building chatbots and other natural language tools, and delivering recommendations for eCommerce, social media and streaming media.

But predictive analytics is also making its presence felt in many other sectors. In healthcare, for instance, AI-enhanced software guides doctors and other practitioners to relevant outcomes by suggesting drugs and treatments. In manufacturing, machine vision detects errors and imperfections that would escape the human eye. In fleet management and logistics, software determines routing and optimizes equipment and fuel use, adapting in real time to weather or traffic. In cybersecurity, neural nets are increasingly used to detect malware and other suspicious behavior on a network.

These systems are also filtering into everyday use in software development and business. In many cases, enterprise applications include no-code or low-code drag-and-drop interfaces that let users assemble AI and ML tasks. AI code-generation systems such as OpenAI Codex, IBM's Project Wisdom, Amazon's CodeWhisperer and GitHub's Copilot are also moving into the mainstream. Trained on huge datasets, they can generate code from natural language input.

Tapping cloud computing resources, these systems handle a growing array of tasks—from chatbots and digital marketing systems to automating routine work. However, Forrester warns that gains don't happen without the right technology platform. It's critical to invest in systems that support advanced machine learning and deep learning. This often involves clouds that supply powerful GPUs.

Also see: Best Machine Learning Platforms 

What Ethical and Legal Concerns Exist?

One problem with neural networks is that the information they provide is only as good as what’s fed into the system. In addition to the possibility of winding up with a poorly performing system, researchers have found numerous cases of implicit bias, which can result in gender or racial discrimination.

This can cause problems—including legal repercussions—for insurance companies, healthcare providers, financial services firms and government agencies. As a result, businesses should carefully weigh the ethical and legal concerns before using neural networks and deep learning to automate decision-making.

What is the Future of Artificial Neural Networks?

Increasingly powerful computers and faster GPUs promise to push ANNs and deep learning forward. In the coming years, these systems will drive advances in a diverse array of areas, including predictive analytics; autonomous vehicles; swarm robotics; pharmaceutical research; predictive medicine; personal assistants and chatbots; cybersecurity; software development; and manufacturing and supply chain automation.

As more data accumulates—including IoT sensor data and edge computing advances—new use cases will appear as well.

The post What Are Neural Networks? appeared first on eWEEK.

]]>
What Is Machine Learning? https://www.eweek.com/artificial-intelligence/machine-learning/ Tue, 18 Oct 2022 20:15:47 +0000 https://www.eweek.com/?p=221497 The term machine learning (ML) refers to the use of advanced mathematical models—typically referred to as algorithms—to process large volumes of data and gain insight without direct human instruction or involvement. ML is a subset of artificial intelligence (AI). It is built on artificial neural networks (ANNs) or simulated neural networks (SNNs)—essentially node layers that […]

The post What Is Machine Learning? appeared first on eWEEK.

]]>
The term machine learning (ML) refers to the use of advanced mathematical models—typically referred to as algorithms—to process large volumes of data and gain insight without direct human instruction or involvement.

ML is a subset of artificial intelligence (AI). Many ML systems are built on artificial neural networks (ANNs) or simulated neural networks (SNNs)—essentially node layers that interact and interconnect. The field includes a specialized branch of machine learning called deep learning (DL).

Machine learning mimics the way humans learn. It spots patterns and then uses the data to make predictions about future behavior, actions and events. In addition, ML constantly uses new data to adapt and change its actions. This ability to learn from experience separates it from more static tools such as business intelligence (BI) and conventional data analytics.

Organizations across numerous fields are turning to ML to address complex business challenges. The technology is particularly valuable in areas such as marketing and sales, financial services, healthcare, retail, energy, transportation and government planning. High profile examples of organizations using machine learning include Netflix, Uber, Google, Facebook and Amazon. The technology handles tasks as diverse as pricing, delivery times, search results and product recommendations.

Depending on the use case, ML requires specific training methods in order to function effectively—and deliver value. These approaches include supervised and unsupervised learning, which means the system learns with humans overseeing it or on its own.

Today, machine learning is used for tasks as varied as speech recognition, image detection and machine vision, predicting customer behavior, spotting fraud and cybersecurity threats and overseeing machine maintenance.

Also see: Best Machine Learning Platforms 

How are Machine Learning Methods Used?

Businesses, governments, educational institutions and many other entities rely on ML to deliver guidance and make key decisions. In many cases, ML systems are incorporated into broader automation and AI frameworks. One example is a smart transportation system that automatically adapts to conditions such as weather, traffic and other events.

Another example is demand forecasting enriched with sentiment analysis, which plugs in different data—historical buying patterns, current data about raw materials and pricing, weather conditions, social media trends and more—to generate a model that predicts future pricing and buying behavior, even under specific conditions.

In addition, ML is now used to develop and improve performance in many ways. ML can:

  • Enhance smart speakers and personal assistants on smartphones.
  • Detect unsafe behavior in factories.
  • Allow airline passengers to board planes and go through passport control using biometrics.
  • Develop robots, digital twins and other business tools that continually learn and improve as data is added.

Consulting firm Gartner reports that top use cases revolve around five core areas: knowledge management, virtual assistants, autonomous vehicles, the digital workplace, and crowdsourced data. Adoption is accelerating rapidly as digital transformation becomes a growing focus.

Gartner also forecast that worldwide artificial intelligence (AI) software revenue, including machine learning, would total $62.5 billion by the end of 2022, an increase of 21.3% from 2021.

Also see: Top AI Software 

A Brief History of Machine Learning

The idea that machines could learn and adapt their algorithms was introduced by logician Walter Pitts and neuroscientist Warren McCulloch, who published a research paper outlining the concept in 1943.

In 1950, computer scientist Alan Turing introduced the Turing Test, also referred to as the “imitation game,” a framework that gauges a machine’s ability to display intelligent behavior indistinguishable from humans.

The term machine learning was coined by IBM data scientist Arthur L. Samuel in 1959. In an academic paper, he promoted the idea that a computer could learn to play checkers and compete with humans. Samuel developed an algorithm that learned to play the game without explicit programming. In 1962, checkers master Robert Nealey played against an IBM 7094 computer and lost.

Over the last 60 years, ML frameworks have grown and expanded. Far greater computational power, along with new and different types of statistical methods, or algorithms, has led to radical advances in the field.

As ML has evolved, explanation-based learning has largely given way to neural nets and deep learning methods that are less explainable. Around 2009, the resurgence of convolutional neural networks (CNNs), accelerated by GPU-based training, transformed the field. In 2011, IBM's Watson question-answering system beat human champions on the television quiz show Jeopardy.

CNNs process data through multiple layers—much like the human brain. They handle mathematical learning and computational processes behind the scenes on their own and allow filtering and tuning in real time. Today, CNNs are used for advanced tasks such as facial recognition and live language translation. Companies such as Netflix, Google, Apple and many others use CNNs and their cousins, generative adversarial networks (GANs), to handle increasingly complex ML and AI tasks.

How Do Machine Learning Systems Work?

Four primary types of ML methods exist:

  • Supervised learning, which requires a person to identify the desirable signals and outputs through labeling or classification.
  • Unsupervised learning, which allows the system to operate independently of humans and find valuable output using unlabeled data.
  • Semi-supervised learning, which combines the two methods above.
  • Reinforcement learning, which incorporates a computer program that interacts with a dynamic environment to achieve specific goals and outcomes.

Machine learning practitioners typically build models with frameworks such as TensorFlow and PyTorch. According to UC Berkeley researchers, an ML model involves three distinct components:

  1. A Decision Process. The system ingests data and uses a machine learning algorithm to classify it or predict outcomes.
  2. An Error Function. This built-in capability lets the model evaluate the accuracy and quality of its predictions.
  3. A Model Optimization Process. This step adjusts the model's weights to shrink the gap between its predictions and the known examples in the training set. The system repeats this evaluate-and-refine cycle until it meets its accuracy goal.
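
The short Python sketch below—a toy linear model trained by gradient descent—shows how the three components fit together. The data values and learning rate are illustrative assumptions, not any vendor's API.

import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 4.2, 5.9, 8.1])
w = 0.0  # the model's single parameter

for _ in range(100):
    pred = w * X                        # 1. decision process: make predictions
    error = np.mean((pred - y) ** 2)    # 2. error function: measure accuracy
    grad = np.mean(2 * (pred - y) * X)  # gradient of the error with respect to w
    w -= 0.01 * grad                    # 3. optimization: adjust the parameter

print(round(w, 2))  # approaches ~2.0, the slope that best fits the data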

Also see: AI vs. ML: Artificial Intelligence and Machine Learning

What Types of Machine Learning Frameworks are Used?

Machine learning revolves around several core algorithmic frameworks to achieve results and produce models that are useful. These include:

Neural Networks

These systems consist of artificial intelligence algorithms designed to simulate the way the human brain thinks. They use training data to spot patterns, and they typically learn rapidly using thousands or even millions of processing nodes. They're ideal for recognizing patterns, and they are widely used for speech recognition, natural language processing, image recognition, consumer behavior and financial predictions.

Linear Regression

The technique identifies relationships between independent input variables and at least one target variable. It is valuable for predicting numerical values, such as prices for airline flights or real estate values, usually over a period of weeks or months. It can display predicted price or value increases and decreases across a complex data set.
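
For instance, a minimal scikit-learn sketch, with illustrative square-footage and price values (not real market data), might look like this:

from sklearn.linear_model import LinearRegression

X = [[500], [750], [1000], [1250]]        # e.g., home size in square feet
y = [150_000, 210_000, 275_000, 340_000]  # observed prices

model = LinearRegression().fit(X, y)      # learns the input-to-price relationship
print(model.predict([[900]]))             # roughly $250,000 for an unseen 900 sq ft home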

Logistic regression

This method typically uses a binary classification model (such as “yes/no”) to tag or categorize whether an event is likely to occur. It sorts through a dataset to find weights and biases that can be built into or excluded from the model. For instance, a common use for this technology is identifying spam in email and blacklisting unwanted software code or malware.
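
A minimal sketch with scikit-learn, assuming toy spam-related feature counts (for example, links and all-caps words per email) and yes/no labels:

from sklearn.linear_model import LogisticRegression

X = [[8, 5], [7, 6], [0, 0], [1, 1]]  # e.g., link count, all-caps word count
y = [1, 1, 0, 0]                      # 1 = spam, 0 = not spam

clf = LogisticRegression().fit(X, y)
print(clf.predict([[6, 4]]))          # [1]: flagged as likely spam
print(clf.predict_proba([[6, 4]]))    # the probabilities behind the yes/no call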

Clustering

This ML tool uses unsupervised learning to spot patterns and relationships that humans may overlook. An example of clustering is grouping how a supplier performs for the same product at different facilities. This approach might be used in healthcare, for instance, to understand how different lifestyle conditions impact health and longevity. It can also be used for trend detection at websites and in social media, such as deciding what text, images and video to display.
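
A brief sketch of clustering with scikit-learn's k-means, using illustrative points; note that no labels are supplied—the algorithm groups the data on its own:

from sklearn.cluster import KMeans
import numpy as np

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # two groups emerge without any human labeling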

Decision Tree

The supervised learning approach builds a data structure with nodes that test an idea or concept against a set of input data. A Decision Tree delivers numerical values but also performs some classification functions. It helps users visually understand data. Unlike other forms of ML, it makes it possible to review and audit results. In the business world, decision trees are often used to develop insights and predictions about downsizing or expanding, changing a pricing model or succession planning.
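
That auditability is easy to see in code. In this hedged scikit-learn sketch with toy applicant data, export_text prints the learned decision rules in plain language:

from sklearn.tree import DecisionTreeClassifier, export_text

X = [[25, 40_000], [35, 90_000], [45, 60_000], [50, 120_000]]  # age, income
y = ["decline", "approve", "decline", "approve"]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["age", "income"]))  # human-readable rules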

Random Forest

A Random Forest model incorporates multiple decision tree models simultaneously. Combining decision trees makes it possible to classify categorical variables or perform regression on continuous variables—forming what's called an ensemble. Different trees produce their own predictions, which are then combined into a single ensemble, or overall model. A random forest algorithm might be used for a recommendation system, for example.
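
A short scikit-learn sketch of such an ensemble, with toy recommendation features; here 100 trees vote and the majority prediction wins:

from sklearn.ensemble import RandomForestClassifier

X = [[5, 1], [4, 1], [1, 0], [2, 0]]  # e.g., past rating, clicked-similar flag
y = [1, 1, 0, 0]                      # 1 = user liked the recommendation

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(forest.predict([[4, 0]]))       # the combined vote across 100 trees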

What is Deep Learning?

Neural nets serve as the foundation for deep learning models—which in turn feed many of today’s AI systems. Deep learning systems rely on interconnected layers of machine learning algorithms—typically through graphic processing units (GPUs)—to develop and continuously evolve a model. While a more basic neural net incorporates one or two hidden layers, a DL model may include dozens, hundreds or even thousands of layers.

For instance, if a deep learning system is trained on birds, it learns how to distinguish eagles from hawks, crows from ravens, and chickadees from hummingbirds. Deep learning frameworks are fast and tend to deliver sophisticated models that improve automation systems and tools such as Apple’s Siri or Amazon’s Alexa, TV remotes, credit card fraud detection, captions for YouTube videos and autonomous vehicle behavior.

Also see: The Future of Artificial Intelligence

How do ML and AI Differ?

Unfortunately, ML and AI are often used synonymously. Attempting to distinguish between the two fields can be difficult, partly because they overlap. However, a starting point is recognizing that ML is always a subset of AI.

In a broad sense, artificial intelligence attempts to simulate human thinking and behavior. Machine learning specifically relates to systems that learn about conditions through data, without direct human intervention, and then apply what they learn to decision-making and other events, such as automation.

Also see: AI vs. ML: Artificial Intelligence and Machine Learning

How is ML Evolving in the Enterprise?

As Gartner noted, ML adoption is growing rapidly. The technology is increasingly incorporated into enterprise software applications and smartphone apps, and it is available as a discrete service through cloud platforms from the likes of AWS, Google, Microsoft and others.

Tools are becoming easier to use—in many cases they’re now available in low-code and no-code platforms—thus expanding their availability to line of business users as well as data scientists. Yet, no matter how sophisticated ML platforms become, they require human oversight, including a strategic focus on how to use them effectively. If an ML tool is poorly constructed or an organization feeds it with low quality data, the results can be useless and even damaging.

Also see: The History of Artificial Intelligence 

What Ethical and Legal Concerns Exist?

Bias is a growing concern with ML systems. Depending on the underlying data used for training, they can generate discriminatory and biased results.

In recent years, these systems have been associated with hiring bias and overall gender and racial bias—including among law enforcement and government agencies. The data can also be misused when it is fed into broader AI systems that generate false news and touch on areas such as surveillance, robotics, marketing and advertising.

Since there’s virtually no regulation of AI, organizations should have a team overseeing ML and AI ethics policies and data privacy standards internally. It’s also important to tune into broader Ethical AI trends in the business world.

What is the Future of Machine Learning?

Rapid advancement of ML technology ensures that it will play an increasingly prominent role in defining business in the years to come. It will impact agriculture, finance, manufacturing, transportation, marketing, customer support, cybersecurity and many other areas. Machine learning will also help drive corporate Environmental, Social, and Governance (ESG) programs and sustainability initiatives. These initiatives will affect sourcing, supply chains and Scope 3 emissions that extend back to raw materials and component providers.

Machine learning systems are becoming easier to use and manage. As a result, they are extending deeper into organizations and moving beyond the realm of data scientists. As organizations look to trim costs, boost productivity, oversee ESG programs, build smart factories, better manage supply chains and fuel innovation at scale, ML emerges as an essential tool.

Savvy business and IT leaders now look for ways to adopt and expand the use of machine learning while exploring test cases that could unlock transformative gains in the future.

Also see: What is Artificial Intelligence 

The post What Is Machine Learning? appeared first on eWEEK.

]]>
Top Data Analytics Tools & Software 2022 https://www.eweek.com/big-data-and-analytics/data-analytics-tools/ Mon, 17 Oct 2022 18:38:36 +0000 https://www.eweek.com/?p=219984 Data analytics tools and software deliver deep insights into wide-ranging business events. What data analytics and big data are used effectively, they can fuel faster and better decision-making. This offers significant competitive advantage and boosts digital transformation. Clearly, data mining using data analytics software is at the center of business success; and of course, using […]

The post Top Data Analytics Tools & Software 2022 appeared first on eWEEK.

]]>
Data analytics tools and software deliver deep insights into wide-ranging business events. When data analytics and big data are used effectively, they can fuel faster and better decision-making. This offers significant competitive advantage and boosts digital transformation.

Clearly, data mining using data analytics software is at the center of business success, and using optimal data mining techniques makes all the difference. Yet the volume, variety and velocity of big data keep on growing—making the task more challenging than ever. Some companies hire data analysts and business intelligence professionals; others have a team of data scientists helping them decode various data sources. In sum, having the right data analytics software tools is part of the solution; having skilled big data experts is just as important.

Also see: Top Business Intelligence Software 

As with most software, selecting the right data analytics platform is critical. Ensure that a data analytics platform connects and interacts with your other data sources, including your edge computing deployment and cloud provider. Likewise, monitoring data in motion across an enterprise and out to a supply chain is increasingly important.

What Are Features and Benefits of Data Analytics Software?

Best-in-class big data analytics solutions offer numerous features and capabilities for making sense of data at all levels. These include real-time visualizations, machine learning and AI capabilities and, in some cases, digital twins. They are powerful tools used by data scientists, data analysts, and other business intelligence pros. Understanding what precisely a data solution delivers is vital as organizations look to build out more expansive and complex data frameworks.

Also see: Top Data Mining Tools 

How Do You Choose Data Analytics Software?

Selecting a platform for data analysis isn’t a job for the faint of heart. There are numerous factors, features and frameworks to consider. Here are three critical steps that will guide you to the right decision:

    • Analyze your needs – with an eye toward your particular staff. The process starts with an evaluation of your organization's data requirements and objectives. It's important to look at who will be using the data – line of business users versus data scientists, for example. How widely used do you expect the platform to be? Does it need to be robust enough for a team of data analysts? Additionally, understand what data sources are required to build models and insights, and what type of analytics is required: data visualizations, statistical analysis, predictive analytics or other specialized needs.
    • Review vendors – think scalability. It's essential to know whether a big data analytics solution can manage data effectively and consistently deliver the desired results. There's also the issue of scalability. As supply chains and business partnerships expand, data analytics tools and business intelligence applications must be equipped to ingest data from new sources, process this data effectively and produce actionable information and results. Do you need a data warehouse? A data warehouse helps organize, prepare and mine your data sources – it's an integral part of an advanced business intelligence solution.
    • Select a solution – with the understanding that switching is hard. Changing platforms is expensive and extraordinarily complicated. Consequently, it's important to match your organization's needs with the right solution provider on the first try; it's worth doing more homework upfront. Key factors in selecting a vendor include speed, performance, user interface (UI), usability (UX), flexibility, scalability, security, the vendor's roadmap, and the vendor's commitment to support. Pay particular attention to the vendor's service level agreement (SLA). In the end, the upfront price isn't as important as total cost of ownership (TCO).

Also see: Top Data Visualization Tools 

Top Data Analytics Tools & Software

Here are 10 of the top big data analytics software providers:

IBM

Key Insight: Big Blue offers a wide array of data analytics solutions and tools. However, Cognos Analytics with Watson is a leader in delivering insights through data visualizations. It taps the Watson AI and machine learning engine to blend data and deliver broad and deep insights. The platform offers natural language processing and contextual forecasting, including predictive analytics. It also includes integration with social media platforms.

Pros

      • Powerful ad hoc reporting tools.
      • Advanced AI through the Watson platform.
      • Suitable for line of business users as well as data scientists.
      • Strong compliance and security features, including single sign on and object level security.
      • On premises and cloud options available.

Cons

      • Better suited to existing IBM customers; the platform can be difficult to integrate with outside data tools.
      • The analytics dashboard and reporting functions are geared for pro users.
      • Large footprint that consumes significant resources.

Microsoft

Key Insight: Power BI is an analytics software platform optimized for Azure cloud. It delivers rich data visualizations through a highly scalable self-service model. The platform supports end-to-end business solutions by connecting Power BI with other Microsoft Power Platform Products—and to Microsoft 365, Dynamics 365, Azure, and hundreds of other apps. It is ranked as a Leader by both Gartner and Forrester.

Pros

      • A top performing platform for AI and ML.
      • Strong data ingestion engine and data management functions.
      • Superior data visualizations.
      • Enormous user base translates into frequent updates and strong community support.

Cons

      • Difficult to use with non-Microsoft tools and applications.
      • Can have a steep learning curve.
      • Not a good fit for some mobile platforms and devices.
      • Premium tier is expensive.

MicroStrategy

Key Insight: The vendor bills its BI and Analytics platform as a way to embed “intelligence everywhere.” It connects more than 200 data sources—including top platforms like Snowflake—en route to real-time visualizations for both PCs and Macs. It supports location-based analysis and delivers self-service dashboards that can be used for sophisticated drill-down analysis.

Pros

      • Powerful engine integrates with most major data platforms through a robust set of APIs.
      • Strong support for mobile devices.
      • Solid security features are built into the platform.
      • Specialized templates and tools for vertical industries such as finance, healthcare, retail, tech and government.

Cons

      • Interface can be challenging.
      • Unstructured data can be difficult to integrate.
      • User base isn’t as large as other vendors.

Qlik

Key Insight: A longtime vendor in the BI and data analytics space, Qlik offers a moderately priced solution, Qlik Sense, that delivers robust functionality on-premises or in the cloud. It ties together existing databases and data sources and provides self-service visualizations and reporting that can be used across different groups, departments and functions. The platform incorporates AI and ML to deliver active intelligence.

Pros

      • The platform offers a strong dashboard and easy-to-use tools.
      • Highly scalable and flexible analytics capabilities.
      • Capable of handling large volumes of data.
      • Supports multi-cloud infrastructures; includes strong governance features.
      • Integration with numerous other data tools, including Tableau and Power BI.

Cons

      • May require customization and third-party extensions.
      • Lacks some key reporting and exporting capabilities.
      • Lower vendor profile and a smaller user base means less community-based support. 

SAP

Key Insight: SAP’s presence in the enterprise application space makes it a good choice for organizations already on the vendor’s platform. SAP Analytics Cloud delivers a streamlined solution with advanced predictive analytics and planning functions. It delivers powerful self-service visualization and simulation tools, real-time insights and integration with numerous outside data sources.

Pros

      • Delivers a cloud native platform.
      • Powerful dashboard delivers broad and deep insights into data.
      • Supports numerous types of analysis, including visualizations, predictive analytics, augmented analytics and statistical analysis.
      • Offers strong AI and ML capabilities.

Cons

      • Can be complex and difficult to set up.
      • No on-premises solution.
      • Expensive, particularly for small- and medium-size organizations.
      • Limited support for off-premises applications running on desktops and mobile devices.

SAS

Key Insight: A pioneer in the data analytics software space, SAS offers a sophisticated framework for data analytics. This includes numerous applications that address different requirements. Its visual analytics solution is among the most advanced available, offering a sophisticated dashboard, a low code framework and AI/ML. It connects to numerous data sources, performs interactive data discovery and accommodates augmented analytics, chat-enabled analytics, location analytics and much more.

Pros

      • Fast and efficient data processing, including strong AI and ML capabilities.
      • Flexible low-code environment for building mobile apps.
      • Powerful security, administration and governance features.
      • Drag-and-drop interface is easy to use.
      • Flexible and highly scalable.
      • Large user base.

Cons

      • Potentially expensive and difficult to learn.
      • Some users desire expanded customization capabilities.
      • Installation and initial setup can be difficult.

Sisense

Key Insight: The vendor’s data analytics capabilities are among the most sophisticated, and the solution is designed primarily for data scientists, analysts and power business users. The self-service cloud platform connects cloud and on-premises data and includes advanced functionality such as AI and ML. It incorporates low-code and no-code tools and supports numerous types of output, including predictive analytics and visualizations.

Pros

      • Robust APIs and strong data discovery capabilities.
      • Fast performance and intuitive interface with drag and drop capabilities.
      • Highly customizable.
      • Highly rated customer support.

Cons

      • Better for power users. Can be difficult to set up, learn and use.
      • Expensive.

Tableau

Key Insight: The widely popular data analytics solution, now part of Salesforce, delivers excellent and highly interactive visual dashboards in real time. It connects to a wide range of data sources, handles discovery and data ingestion deftly, and taps AI and ML to deliver an easy-to-use solution that’s ideal for line of business users but sophisticated enough for data scientists. Not surprisingly, there’s a strong focus on CRM, though the solution is suitable for different tasks across a wide range of industries.

Pros

      • A powerful and highly flexible framework produces outstanding dashboards and visualizations.
      • Extremely intuitive UI.
      • Large user base translates into strong community support.
      • Excellent integration with Salesforce CRM.

Cons

      • Expensive, particularly for smaller organizations.
      • Some user complaints about customer service and support.
      • Lack of direct integration with AWS S3.

ThoughtSpot

Key Insight: The vendor focuses on an approach it calls “search and AI-driven analytics.” The cloud-based solution delivers an appealing front end for data. It offers powerful tools for discovering, ingesting, connecting and managing data—through APIs and AI/ML. ThoughtSpot embeds search and insight-driven actions into apps using a low-code developer-friendly platform. It supports non-technical users and delivers a single source of truth, with robust security and governance.

Pros

      • Supports numerous data types and provides numerous and flexible report templates.
      • A powerful Google-like search engine and accompanying AI/ML supports complex natural language queries and questions.
      • Delivers rich and flexible visualizations.
      • Ideal for non-technical users.

Cons

      • Performance may lag on extremely large data sets.
      • Some users complain about the lack of tutorials and customer support.
      • Some multi-tenant/multi-cloud features and support are lacking. 

TIBCO

Key Insight: Tibco has a solid reputation in the BI and analytics arena. Spotfire delivers real-time data visualization through NLQ powered search, AI-driven recommendations, and direct manipulation. It supports both on-premises and cloud frameworks, with a powerful and highly scalable analytics engine. The result is immersive dashboards, predictive analytics, geolocation analytics, and streaming analytics. Spotfire Mods allows organizations to build custom analytics apps.

Pros

      • Includes more than 60 built-in connectors and support for almost every data type through custom APIs.
      • Strong AI engine generates recommended visualizations on the fly.
      • Handles extremely large data sets well.
      • Delivers tight coding integration through Python and R.

Cons

      • User interface isn’t particularly intuitive and drag-and-drop features are sometimes absent.
      • Customizations can be difficult.
      • User community is smaller than competitors.

Apache Software Foundation

Key Insight: Apache Spark is a software framework designed specifically for big data. Often deployed alongside Hadoop, Spark similarly distributes heavy analytics workloads across many computers. However, Spark operates in-memory rather than saving to disk the way Hadoop's MapReduce does, so performance is considerably faster, and Spark is often used for real-time analytics. It has a rich library of machine learning algorithms called MLlib.

Pros:

      • Free.
      • Fast, in-memory real-time analytics.
      • ML library.
      • Easy to use.

Cons:

      • No file management system.
      • Requires Hadoop and other Apache apps.
      • Rigid user interface.
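
For a sense of what working with Spark looks like, here is a hedged PySpark sketch of a distributed, in-memory aggregation. It assumes a local Spark installation, and the column names and values are illustrative.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("demo").getOrCreate()
df = spark.createDataFrame(
    [("east", 120.0), ("west", 95.5), ("east", 80.0)],
    ["region", "sales"],
)
df.groupBy("region").sum("sales").show()  # aggregation runs distributed, in memory
spark.stop()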

Google

Key Insight: Google Data Studio offers considerable simplicity of use, even in the free version. In addition to its ease of use, it offers connectivity with other Google cloud applications and makes it easy to collaborate with both internal and external colleagues. It comes with data templates, a drag-and-drop interface for rapidly building analytics applications, and a wide array of data source connections.

Pros:

      • Enables collaboration.
      • Easy to use interface.
      • Many templates for building apps.
      • Rich reporting.

Con:

      • Can be buggy.
      • Not as scalable as some competing products.

Looker

Key Insight: Technically, Looker Data Sciences is a Google product and is a part of Google Cloud Platform, but still operates as its own data analytics platform. Looker can load data from any SQL source and has unique data modeling layers to make data available to anyone in real-time. The Looker API provides a secure, “RESTful” interface to create custom applications and data-driven workflows. Looker offers developers the option of adding embedded analytics to their applications, websites, and portals. This lets them put analytics in external-facing applications to share insight with partners and customers.

Pro

      • Great for cleaning and manipulating data.
      • Google owned but not tied to Google Cloud.
      • Supports a wide range of data sources.

Con

      • No on-prem support.
      • Can get slow with large data sets.

Data Analytics Tools: Additional Market Leaders

Google Charts

Google offers a free data visualization tool that works with JavaScript to generate presentations and reports.

OpenText

OpenText provides a variety of tools for generating data insights across a variety of vertical industries, including finance, automotive, healthcare and energy.

Birst

The vendor aims to deliver meaningful data insights from the boardroom to the shop floor. It focuses on pre-built industry and role-specific content and metrics. 

Domo

A “BI for All” framework is at the center of Domo’s solutions. It supports strong data integration, BI and analytics, intelligent apps and embedded analytics.

Zoho

The self-service tool delivers robust visualizations via intuitive dashboards. Powerful connectors pull together a multitude of data types and formats.

Xplenty

The platform handles ETL and reverse-ETL functionality within a highly scalable platform. It delivers strong compliance and security features.

KNIME

The free, open-source data analytics solution delivers data integration, modeling and visualization capabilities.

HubSpot

The CRM data platform focuses on marketing and customer insights. It features an appealing interface and robust analytics tools.

RapidMiner

The advanced analytics platform taps machine learning and AI to generate a wide variety of data insights, including predictive analytics.

Yellowfin

The vendor focuses on appealing dashboards to promote digital storytelling. The solution incorporates powerful natural language capabilities.

Data Analytics Software Tools: Vendor Comparison Chart

Data Analytics Tool | Pros | Cons
IBM Cognos Analytics | Advanced visualization tools suitable for IBM customers | Geared for IBM environments
Microsoft Power BI | Extensive capabilities tied to Microsoft products, including Azure | Not ideal for use with non-Microsoft applications and products
MicroStrategy Platform | Excellent choice for connecting data | Expensive; interface can be challenging
Qlik Sense | Powerful and versatile platform with strong AI and ML | Lacks some advanced functionality found in other leading solutions
SAP Analytics Cloud | Powerful BI and analytics capabilities for SAP users | Expensive
SAS Visual Analytics | Sophisticated BI and analytics, with excellent AI and ML | Can be challenging for non-technical users
Sisense Platform | Advanced features and capabilities with robust APIs and top-tier performance | Expensive; better suited to power users
Tableau | Outstanding UI and UX, with deep Salesforce/CRM integration | Can be pricey
ThoughtSpot | Advanced AI and natural language search deliver powerful analytics capabilities | Performance can lag on extremely large data sets
TIBCO Spotfire | Highly flexible platform that's ideal for data scientists and power users | Interface can prove challenging, particularly for non-technical users
Apache Spark | Software framework designed specifically for big data | No file management system
Google Data Studio | Offers considerable simplicity of use, even in the free version | Can be buggy
Looker | Unique data modeling layers make data available to anyone in real time | No on-prem support

The post Top Data Analytics Tools & Software 2022 appeared first on eWEEK.

]]>
AI vs. ML: Artificial Intelligence and Machine Learning Overview https://www.eweek.com/enterprise-apps/ai-vs-ml/ Wed, 17 Aug 2022 20:28:16 +0000 https://www.eweek.com/?p=221314 The idea that machines can replicate or even exceed human thinking has served as the inspiration for advanced computing frameworks – and is now seeing vast investment by countless companies. At the center of this concept are artificial intelligence (AI) and machine learning (ML). These terms are often used synonymously and interchangeably. In reality, AI […]

The post AI vs. ML: Artificial Intelligence and Machine Learning Overview appeared first on eWEEK.

]]>
The idea that machines can replicate or even exceed human thinking has served as the inspiration for advanced computing frameworks – and is now seeing vast investment by countless companies. At the center of this concept are artificial intelligence (AI) and machine learning (ML).

These terms are often used synonymously and interchangeably. In reality, AI and ML represent two different things—though they are related. In essence:

Artificial intelligence can be defined as a computing system’s ability to imitate or mimic human thinking and behavior.

Machine learning, a subset of AI, refers to a system that learns without being explicitly programmed or directly managed by humans.

Today, both AI and ML play a prominent role in virtually every industry and business. They drive business systems and consumer devices. Natural language processing, machine vision, robotics, predictive analytics and many other digital frameworks rely on one or both of these technologies to operate effectively.

Also see: What is Artificial Intelligence 

Brief History of AI and ML

The idea of building machines that think like humans has long fascinated society. During the 1940s and 1950s, researchers and scientists, including Alan Turing, began to explore the idea of creating an “artificial brain.” In 1956, a group of researchers at Dartmouth College began to explore the idea more thoroughly. At a workshop held at the university, the term “artificial intelligence” was born.

Over the following few decades, the field advanced. In 1964, Joseph Weizenbaum of the MIT Artificial Intelligence Laboratory invented a program called ELIZA. It demonstrated the viability of natural language and conversation on a machine. ELIZA relied on a basic pattern-matching algorithm to simulate a real-world conversation.

During the 1980s, as more powerful computers appeared, AI research began to accelerate. In 1982, John Hopfield showed that a neural network could process information in far more advanced ways. Various forms of AI began to take shape, and artificial neural network (ANN) research gained renewed momentum around 1980.

During the last two decades, the field has advanced remarkably, thanks to enormous gains in computing power and software. AI and ML are now widely used in a wide array of enterprise deployments. These technologies power natural language systems like Siri and Alexa, autonomous vehicles and robotics, automated decision-making systems in computer games, recommendation engines like Netflix's, and extended reality (XR) tools, such as virtual reality (VR) and augmented reality (AR).

Machine learning in particular has flourished. It is increasingly used by government entities, businesses and others to identify complex and often elusive patterns involving statistics and other forms of structured and unstructured data. This includes areas as diverse as epidemiology and healthcare, financial modeling and predictive analytics, cybersecurity, chatbots and other tools used for customer sales and support. In fact, many vendors offer ML as part of cloud and analytics applications.

Also see: Best Machine Learning Platforms 

What Is the Impact of Artificial Intelligence?

A machine’s ability to emulate human thinking and behavior profoundly changes the relationship between these two entities. AI unleashes automation at scale and enables an array of more advanced digital technologies and tools, including VR, AR, digital twins, image and facial recognition, connected devices and systems, robotics, personal assistants and a variety of highly interactive systems.

This includes self-driving cars that navigate real-world conditions, smart assistants that answer questions and switch lights on and off, automated financial investing systems, and airport cameras and facial recognition. The latter includes biometric boarding passes airlines use at departure gates and the Global Entry system that requires only a face scan to pass through security checkpoints.

Indeed, businesses are putting AI to work in new and innovative ways. For example, dynamic pricing models used by the travel industry gauge supply and demand in real time and adjust pricing for flights and hotels to reflect changing conditions.

AI technology is used to better understand supply chain dynamics and adapt sourcing models and forecasts. In warehouses, machine vision technology (which is supported by AI) can spot things like missing pallets and manufacturing defects too small for the human eye to detect. Meanwhile, chatbots analyze customer input and provide contextually relevant answers on a live basis.

Not surprisingly, these capabilities are advancing rapidly—especially as connected systems are added to the mix. Smart buildings, smart traffic grids and even smart cities are taking shape. As data streams in, AI systems determine the next optimal step or adjustment.

Similarly, digital twins are increasingly used by airlines, energy firms, manufacturers and others to simulate actual systems and equipment and explore various options virtually. These advanced simulators predict maintenance and failures but also provide insight into less expensive and more sophisticated ways to approach business.

Also see: How AI is Altering Software Development with AI-Augmentation 

What Is the Impact of Machine Learning?

Machine learning has also advanced remarkably in recent years. Using statistical algorithms, machine learning unlocks insights that have traditionally been associated with data mining and human analysis.

Using sample data, referred to as training data, it identifies patterns and applies them to an algorithm, which may change over time. Deep learning, a type of machine learning, uses artificial neural networks to simulate the way the human brain works.

These are the primary ways to use ML:

Supervised learning, which requires a person to identify the desirable signals and outputs.

Unsupervised learning, which allows the system to operate independent of humans and find valuable output.

Semi-supervised learning, which blends the two approaches, and reinforcement learning, which involves a computer program that interacts with a dynamic environment to achieve identified goals and outcomes. An example of the latter is a computer chess game. In some cases, data scientists use a hybrid approach that combines elements of more than one of these methods.

Also see: The Future of Artificial Intelligence

A Variety of Algorithms

Several types of machine learning algorithms play a key role:

Neural Networks: Neural networks simulate the way the human brain thinks. They’re ideal for recognizing patterns and they are widely used for natural language processing, image recognition and speech recognition.

Linear Regression: The technique is valuable for predicting numerical values, such as predicting prices for flights or real estate.

Logistic regression: This method typically uses a binary classification model (such as “yes/no”) to tag or categorize something. A common use for this technology is identifying spam in email and blacklisting unwanted code or malware.

Clustering: This ML tool uses unsupervised learning to spot patterns that humans may overlook. An example of clustering is how a supplier performs for the same product at different facilities. This approach might be used in healthcare, for instance, to understand how different lifestyle conditions impact health and longevity.

Decision Tree: The approach predicts numerical values but also performs classification functions. It delivers a clear way to audit results, unlike other forms of ML. This method also works with Random Forests, which combine Decision Trees.

Regardless of the exact method, ML is increasingly used by companies to better understand data and make decisions. This, in turn, feeds more sophisticated AI and automation. For example, a forecasting model enriched with sentiment analysis can plug in historical sales data, social media data and even weather conditions to adapt manufacturing, marketing, pricing and sales tactics dynamically. Other ML applications deliver recommendation engines, fraud detection and the image classification used for medical diagnostics.

One of the strengths of machine learning is that it can adapt dynamically as conditions and data change, or an organization adds more data. As a result, it’s possible to build an ML model and then adapt it on the fly. For example, a marketer might develop an algorithm based on a customer’s behavior and interests and then adapt messages and content as the customer changes his or her behavior, interests or purchasing patterns.
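
One hedged way to sketch this on-the-fly adaptation is with scikit-learn's incremental-learning API, where partial_fit updates an existing model as new behavioral data arrives rather than retraining from scratch. The features and labels below are illustrative assumptions.

from sklearn.linear_model import SGDClassifier
import numpy as np

clf = SGDClassifier(loss="log_loss", random_state=0)  # "log" in older scikit-learn versions

X_first = np.array([[0.2, 1.0], [0.9, 0.1]])  # e.g., early interest signals
y_first = [0, 1]
clf.partial_fit(X_first, y_first, classes=[0, 1])  # initial model

X_new = np.array([[0.8, 0.2]])  # the customer's behavior shifts
y_new = [1]
clf.partial_fit(X_new, y_new)   # the model adapts incrementally, on the fly
print(clf.predict([[0.85, 0.15]]))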

Also see: Digital Transformation Guide: Definition, Types & Strategy

How are AI and ML Evolving in the Enterprise?

As mentioned, most software vendors—across a wide spectrum of enterprise applications—offer AI and ML within their products. These systems make it increasingly simple to put powerful tools to work without extensive knowledge of data science.

Yet, there are some caveats. For customers to get the most out of AI and ML systems, an understanding of AI and some expertise is often necessary. It's also vital to avoid vendor hype when selecting products. AI and ML can't fix underlying business problems—and in some instances, they can produce new challenges, concerns and problems.

What are the Ethical and Legal Concerns?

AI and ML are at the center of a growing controversy—and they should be used wisely—and carefully. They have been associated with hiring and insurance bias, racial discrimination and a variety of other problems, including misuse of data, inappropriate surveillance and things like deep fakes and false news and information.

There’s growing evidence that facial recognition systems are considerably less accurate when identifying people of color—and they can lead to racial profiling. Moreover, there are growing concerns about governments and other entities using facial recognition for mass surveillance. So far, there’s very little regulation of AI practices. Yet Ethical AI is emerging as a key consideration.

What is the Future of ML and AI?

AI technologies are advancing rapidly, and they will play an increasingly prominent role in the enterprise—and our lives. AI and ML tools can trim costs, improve productivity, facilitate automation and fuel innovation and business transformation in remarkable ways.

As digital transformation advances, AI will serve as the sun around which other digital technologies orbit. It will spawn far more advanced natural speech systems, machine vision tools, autonomous technologies, and much more.

Also see: Top Digital Transformation Companies

What is IoT? Guide to the Internet of Things https://www.eweek.com/networking/what-is-iot-guide-to-the-internet-of-things/ Fri, 22 Jul 2022 18:55:37 +0000

The Internet of Things (IoT) shifts human and computer interaction to a broad and widely distributed framework. By connecting various “things” and “objects”—smartphones, lights, industrial machines, wearables, remote sensors and physical objects that have been equipped with RFID tags—it’s possible to drive advances that would have seemed unimaginable only a couple of decades ago.

The IoT—which serves as a broad term for a vast network of connected devices—has moved into the mainstream of business and life. It now serves as a fabric for far more advanced human-machine interaction. It encompasses everything from home thermostats and wearables to tracking systems and smart systems for agriculture, buildings and even cities.

Today, virtually no technology lies outside the realm of the IoT. Self-driving vehicles, manufacturing robots, environmental monitoring, supply chain tracking, transportation systems, and remote medical devices are just a few of the areas undergoing radical change due to the IoT.

Telecommunications equipment maker Ericsson reports that there are currently about 29 billion IoT devices in use worldwide. Businesses are increasingly turning to the IoT to drive innovation, trim costs, improve safety and security, and promote greater sustainability.

Not surprisingly, the global COVID-19 pandemic has further accelerated adoption of IoT technologies. The need for automation and touchless systems has multiplied. These include frameworks that support remote work as well as various health scanners, contact tracing systems, crowdsourcing and digital payments. Used effectively, the IoT introduces systems and frameworks that match or exceed human capabilities.

A Brief History of IoT

From the earliest days of computing, it has been apparent that connected devices and systems deliver value. They allow machines and humans to interact in more useful and productive ways. The roots of the IoT extend back to 1969, when a group of prominent researchers developed ARPAnet (the precursor to today's Internet). Over the ensuing three decades, various computing, networking and wireless protocols began to take shape—and serve as a foundation for the IoT.

In 1999, Kevin Ashton from MIT introduced the term "Internet of Things." At the time, the concept focused heavily on RFID tags that were added to physical objects. However, after the turn of the century, two events proved transformative. In 2006, Amazon introduced Amazon Web Services (AWS), which provided an agile and flexible framework for collecting, managing and sharing data. Soon thereafter, Apple released the first iPhone, which supported sophisticated apps and enabled always-on, real-time connectivity.

By 2008, the IoT reached a milestone. There were more connected devices than people in the world. Over the ensuing years, further developments in sensors, connectivity and artificial intelligence (AI) fueled increasingly sophisticated and easy-to-use IoT solutions. Meanwhile, consortiums and standards appeared, making it easier to maximize the value of a connected world.

Over the last few years, the IoT has matured at a remarkable speed. Sensors and microchips designed specifically for the IoT have appeared, software and AI/machine learning have advanced, and vendors have introduced an array of solutions that deliver more robust capabilities.

Also see: What is Edge Computing

How the IoT Works

Connecting myriad devices, while a seemingly simple concept, is incredibly challenging. IoT devices must function in many different situations and scenarios. They must interact with other digital technologies as well as legacy systems while relying on different communications standards and protocols. In many cases, IoT devices must support multiple standards and protocols.

The foundation for the IoT is Internet Protocol (IP) and Transmission Control Protocol (TCP). These standards—which grew out of ARPAnet and are now part of the Internet and Web—serve as the underlying protocols for establishing a virtual connection between various sensors, devices and systems.
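As a minimal illustration of this TCP/IP foundation, the sketch below sends a single sensor reading to a gateway over a raw TCP socket using Python's standard library; the gateway hostname, port and JSON payload format are assumptions invented for the example.

```python
# A sensor reading sent over plain TCP/IP with Python's standard library.
# The gateway hostname, port, and JSON payload format are invented for the example.
import json
import socket

reading = {"device_id": "sensor-42", "temperature_c": 21.7}

# Open a TCP connection to a (hypothetical) IoT gateway and send one reading.
with socket.create_connection(("gateway.example.local", 9000), timeout=5) as sock:
    sock.sendall(json.dumps(reading).encode("utf-8"))
```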

IoT devices connect to IoT gateways or edge devices that collect data. IoT communication is often described in terms of the seven layers of the Open Systems Interconnection (OSI) model:

  • Physical layer
  • Data link layer
  • Network layer
  • Transport layer
  • Session layer
  • Presentation layer
  • Application layer

The physical and data link layers determine how devices connect to the IoT. This can include cables, Bluetooth or Wi-Fi. The data link layer identifies connected devices through a media access control (MAC) address.

The network layer, also referred to as the Internet layer, routes packets of data to an Internet Protocol (IP) address. Today, Internet Protocol version 6 (IPv6) offers advanced network identification and controls that keep the IoT running.

The transport layer accommodates end-to-end communication across a group of IoT devices and other systems. It boosts reliability, alleviates congestion and ensures that packets arrive intact and in the correct sequence.

The session, presentation and application layers address cross-application messaging and the exchange of data.

Vendors and others use a wide range of IoT protocols to address the specific needs of each of these layers. These include various communication technologies such as 5G and LTE, Bluetooth, Near Field Communication (NFC), LAN, RFID, ZigBee, Wi-Fi, Low Power Wide Area Network (LPWAN), Wide Area Networks (WAN) and Metropolitan Area Network (MAN).

Each of these technologies enables different capabilities. For example, RFID makes it possible to attach a tracking chip to a physical item—anything from a medical device in a hospital to a pallet that contains food products or medicine. ZigBee and a similar protocol called Z-Wave use a mesh network to transmit data—even when a cellular or wired connection isn’t available.

However, it’s often possible to link and combine different technologies such as Bluetooth Low Energy, Z-Wave and IEEE 802.15.4 through IPv6 over Low-Power Wireless Personal Area Networks (6LoWPAN), an open standard that supports low-power radio communication to the internet.

Much of the data collected by IoT devices is streamed to the cloud or managed in Edge and Fog systems, which can store and sometimes process data away from a central server or cloud. This model makes it possible to introduce far more advanced real-time capabilities that are needed for systems such as digital twins, smart manufacturing and smart cities.

Also see: Best Data Analytics Tools 

Data Drives IoT

At its heart, the IoT is a sensing system. The “eyes, ears, nose and fingers” of this connected world reside in various devices, sensors and chips. They collect the data that feeds insight, automation, AI and other functions. Today’s IoT sensors can detect movement and motion, temperature, pressure, gas and chemical concentrations, magnetic and electrical fields, light, sound and much more.

This makes it possible, for example, to determine when a bridge or tunnel requires repairs, how to optimize performance across a subway or train network, and when a specific event has occurred. In the latter case, a motion sensor might switch on a security camera if someone enters an unauthorized space.
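A sketch of that kind of event-driven logic follows; read_motion_level() and activate_camera() are hypothetical stand-ins for real device APIs, simulated here so the example runs, and the threshold is arbitrary.

```python
# Threshold-based event detection: switch on a camera when motion is sensed.
# read_motion_level() and activate_camera() are hypothetical stand-ins for
# real sensor and camera APIs; here they are simulated so the sketch runs.
import random
import time

MOTION_THRESHOLD = 0.6  # arbitrary illustrative threshold

def read_motion_level() -> float:
    return random.random()  # would poll a real motion sensor

def activate_camera() -> None:
    print("Camera activated")  # would call a real camera control API

for _ in range(10):  # bounded loop so the example terminates
    if read_motion_level() > MOTION_THRESHOLD:
        activate_camera()
    time.sleep(0.1)
```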

The IoT can also aid in keeping machinery running, support connected healthcare, improve supply chain visibility, personalize media content, and aid in managing infrastructure and equipment, such as storage tanks and transport vehicles.

All the data collected by sensors can also reveal patterns and trends that escape human eyes and minds. For instance, the IoT can help a credit card company identify fraud, an ice cream manufacturer know which flavors are most popular in different locations, and a healthcare provider spot factors that contribute to sickness and disease.

Weather forecasting has become far more accurate in recent years due to connected weather stations that feed data on a block-by-block level. In many cases, homeowners and businesses install connected weather stations and the data is aggregated by the IoT and used to generate more granular predictions. This data benefits farmers, importers, logistics companies and many others.

Geolocation is a key factor in connecting all the data dots. It makes it possible to understand data in deeper ways—and in context—but it also enables more advanced capabilities that allow an E-ZPass toll system or a company such as Lyft to perform its magic. The possibilities are nearly endless. Identifying cars, scooters or pollutants, and overlaying services, payments and solutions suddenly becomes possible with IoT geolocation data.

Of course, a key consideration for organizations looking to extract the maximum value from the IoT is to ensure that the data is accurate. One of the biggest challenges revolves around how data is combined—and what data to exclude. Data scientists must ensure that software and other tools pull the right data at the right time, sequence it correctly, and combine it in a way that generates meaningful results.

When data is assembled and sequenced correctly, it’s possible to gain highly granular snapshots of actions, movements, events, behaviors and conditions. This can help businesses develop more elastic and dynamic pricing models, highly accurate predictive maintenance frameworks, smarter sourcing and supply chains and much more. As organizations slice, dice, crunch and analyze all this data—often through machine learning (ML) and deep learning (DL) techniques—valuable insight, information and knowledge follow.
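One common sequencing task is aligning readings from different sensors by timestamp before combining them. A minimal sketch with pandas follows; the streams, column names and values are invented for illustration.

```python
# Aligning two sensor streams by timestamp so readings are sequenced and
# combined correctly before analysis. The streams and columns are invented.
import pandas as pd

temps = pd.DataFrame({
    "ts": pd.to_datetime(["2024-01-01 10:00:00", "2024-01-01 10:05:00"]),
    "temperature_c": [21.5, 22.1],
})
vibration = pd.DataFrame({
    "ts": pd.to_datetime(["2024-01-01 10:00:02", "2024-01-01 10:04:58"]),
    "vibration_mm_s": [0.8, 1.4],
})

# merge_asof pairs each vibration reading with the most recent temperature
# reading at or before it; both frames must be sorted by the key column.
combined = pd.merge_asof(vibration.sort_values("ts"), temps.sort_values("ts"), on="ts")
print(combined)
```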

Also see: Top Data Visualization Tools 

IoT Means Business

According to the World Economic Forum, the IoT is part of the Fourth Industrial Revolution. A convergence of various digital technologies enables things like precision agriculture, smart factories, digital twins, fully autonomous vehicles and robotics, automated warehouses and grocery stores, and entirely new business models, which weren’t possible before the IoT existed.

Digital twins are a particularly compelling solution. These systems typically extract data from buildings, roadways, storage tanks, engines, manufacturing equipment, products and numerous other sources to create virtual models or “twins” of actual products, services, and systems. In this simulated space, it’s possible to explore various options and scenarios, and understand performance and possible outcomes at a far deeper level.

Plugging into a growing collection of standards, protocols and APIs, IoT systems can connect and interconnect in increasingly sophisticated ways. It’s possible to tap into broader technology platforms that intersect with AI and automation. Increasingly, vendors such as AWS, Google, Cisco, Microsoft, IBM, SAP and others offer IoT and Industrial IoT (IIoT) platforms that combine various functions and deliver plug-and-play capabilities.

These vendor platforms now include highly integrated tools for auto-provisioning, data collection, cloud and edge storage, remote device management, analytics and machine learning, policy enforcement and security. Many of these systems are highly scalable and many include templates and functions designed for specific industry verticals, such as manufacturing, healthcare, finance and transportation.

This leads to potential gains in several areas: worker safety, production uptime, product quality, regulatory compliance, operational efficiency, physical security and much more. Together, the IoT and IIoT can deliver visibility into conditions that extend outside the enterprise.

Today, IoT platforms serve as a valuable resource for both small and large enterprises. They reduce—and in some cases eliminate—the need to assemble an IoT framework from scratch and devote staffing and financial resources to the task of maintaining and updating a connected technology and business framework.

Also see: Top Digital Transformation Companies

How Secure is IoT?

An inconvenient truth is that the IoT is not inherently secure. Because it comprises a mishmash of protocols, standards and vendors, any enterprise venturing into a highly connected world must take abundant precautions and enact strong safeguards. In addition, as organizations accumulate large numbers of IoT devices—sometimes reaching into the millions—the security challenges multiply.

It’s no secret that data breaches and ransomware attacks have reached epidemic proportions. During the pandemic, the problem worsened. As more and more connected devices appear, potential gaps and breakdowns grow. Making matters worse, conventional security tools don’t necessarily protect IoT devices and data.

Part of the security challenge also centers on the way IoT devices are built—and what operating system and software they use. Unfortunately, equipment vendors frequently use legacy BIOS and OS standards that are not equipped for today’s environment. Many also don’t provide regular patches and updates to address security bugs and other problems.

Already, criminals have taken over Internet connected baby monitors, commandeered smart refrigerators and television sets, and gained access to automobiles and medical devices.

Ultimately, IoT security must address three primary areas:

Authentication

Device authentication increasingly revolves around standards such as X.509, which verify devices, gateways, users, services, and applications. The X.509 cryptographic standard uses self-signed or authority-signed public key certificates that validate identities over a network.
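For illustration, the following sketch generates a self-signed X.509 device certificate with the third-party Python cryptography package; the device name and one-year validity period are arbitrary assumptions.

```python
# Generating a self-signed X.509 device certificate with the third-party
# "cryptography" package. The device name and validity period are arbitrary.
import datetime

from cryptography import x509
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.x509.oid import NameOID

key = ec.generate_private_key(ec.SECP256R1())
name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "iot-device-001")])

cert = (
    x509.CertificateBuilder()
    .subject_name(name)           # self-signed: subject and issuer are the same
    .issuer_name(name)
    .public_key(key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(datetime.datetime.utcnow())
    .not_valid_after(datetime.datetime.utcnow() + datetime.timedelta(days=365))
    .sign(key, hashes.SHA256())
)

print(cert.public_bytes(serialization.Encoding.PEM).decode())
```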

Encryption

Encryption typically encompasses the Wi-Fi Protected Access 2 (WPA2) standard of network encryption.

Port protection

Port protection techniques revolve around disabling ports that aren’t required to operate an IoT device or ensuring that they are protected by a firewall.

Fortunately, a growing array of vendors specialize in IoT security. They are introducing more integrated and streamlined security for this highly connected business world.

Also see: Data Mining Techniques 

Ethical and Privacy Concerns Related to IoT

There’s also a growing focus on privacy and ethics. As data becomes more connected—and interconnected—both the challenges and risks grow. The European Union’s General Data Protection Regulation (GDPR), which took effect in 2018, introduced strict rules, regulations and penalties for organizations handling data that touches European citizens.

In 2020, two new California laws took effect: the California Consumer Privacy Act (CCPA), which sets data-handling standards for companies doing business in California, and an IoT security law (SB-327) that requires manufacturers to include “reasonable” security features in connected devices. Both carry penalties for violations and data breaches; a major CCPA violation could result in fines of US $2,500 to US $7,500 per violation as well as action from the California attorney general’s office.

The concerns don’t stop with laws and regulations, however. Privacy experts and a growing segment of the public are increasingly vocal about how facial recognition data, healthcare data and other forms of IoT data are collected and used. As a result, businesses relying on the IoT must take a close look at several key areas and tools, including:

  • Data personalization
  • Data de-identification, and re-identification
  • Data persistence
  • How IoT data is stored and retained

IoT: A Connected World is Taking Shape

Sitting on the sidelines and waiting for the IoT to mature is no longer a viable option for most organizations. The IoT and IIoT are here now, and they are valuable tools for businesses of all shapes and sizes. Industries as diverse as finance, manufacturing, agriculture, construction, energy, transportation and healthcare are witnessing enormous changes as a result of connected technologies.

Consulting firm McKinsey & Co. reports that the worldwide number of IoT-connected devices is projected to increase to 43 billion by 2023, an almost threefold increase from 2018. The firm also notes that the IoT is an increasingly critical factor in determining which companies excel and generate growth and new sources of revenues.

What’s more, sensor technology and other IoT components are becoming cheaper and far more powerful. When organizations combine these components with powerful 5G, cloud, edge and fog technologies, the power of the IoT further multiplies—and the value to organizations grows. Adding augmented reality, virtual reality, robotics and various forms of AI often unleashes new and compelling business models.

The opportunities don’t end at the four walls of the enterprise. The World Economic Forum found that 84% of existing IoT deployments address, or have the power to advance, the UN’s Sustainable Development Goals. As organizations develop and advance sustainability programs and aim for aggressive carbon reduction targets, the IoT will play a crucial role in measuring progress, identifying gaps and achieving objectives.

To be sure, the IoT is at the center of everything as businesses look to modernize and gain advantages through innovation, cost cutting, new features and services, and improved interactions with business partners and customers. The IoT is ultimately about dollars—and good business sense. Enterprises that put it to work effectively witness innovation and transformation on a scale that hasn’t been possible in the past. These enterprises achieve the full promise of digital transformation.

Also see: Real Time Data Management Trends

BigQuery vs. Snowflake: Data Warehouse Comparison 2022 https://www.eweek.com/big-data-and-analytics/bigquery-vs-snowflake/ Wed, 06 Jul 2022 23:30:42 +0000

Google BigQuery and Snowflake are both leading data platforms. Both offer a wealth of data analytics features, capabilities and tools designed to take enterprise data services to a higher level.

Data warehouses have served as valuable tools for organizations for more than three decades. These repositories – now cloud-based – help organizations pull together and consolidate data from disparate sources. They typically support a variety of functions, including artificial intelligence, data mining, data analytics, machine learning and decision support functions.

Data warehouses are fast, flexible and powerful – particularly as organizations look to expand digital transformation and incorporate robotics, IoT, deep integration and API support and other functions.

There are crucial differences between Google BigQuery and Snowflake. This article offers an in-depth comparison of these two leading data warehouse platforms: how they match up, along with some of their key differences.

Also see: Best Data Analytics Tools 

BigQuery vs. Snowflake: Feature Comparison

BigQuery: Google’s reputation for providing powerful data frameworks and tools extends to BigQuery. It delivers a fast, highly flexible and scalable data warehousing solution that deftly handles both structured and unstructured data.

This serverless multi-cloud environment is designed to “democratize insights with a secure and scalable platform with built-in machine learning,” according to Google. BigQuery is a multicloud analytics solution that can accommodate a data warehouse ranging from only a few bytes to petabytes. The platform supports predictive modeling and machine learning, multicloud data analysis, interactive data analysis and geospatial analysis, along with numerous other data capabilities.

Snowflake: What makes Snowflake appealing is its focus on flexibility and scalability for huge quantities of data. The platform, which is delivered as a service, can automatically scale up and down without any impact on performance. The multi-cloud shared data architecture handles a vast array of workloads and tasks that revolve around data engineering, data warehousing, data lakes, data science and more.

Snowflake delivers ultra-high resiliency and an architecture that supports modern standards, including security and data governance. Organizations can run the platform on AWS, Azure and Google Cloud—or any combination. Snowflake also delivers strong collaboration and data sharing features. It is ideal for modern integrated data applications, and it has strategic alliances and partnerships with Salesforce, Alation, Cognizant, Collibra, Dataiku, Informatica, Qlik, Talend and many others.

Also see: Top Data Mining Tools 

BigQuery vs. Snowflake: Architecture Comparison

BigQuery: The platform relies on a serverless multi-cluster framework that keeps compute and storage layers separate. Google handles all resource provisioning behind the scenes and supports clustering on both partitioned and non-partitioned tables. These tables are durable, persistent, optimized and compressed for power and speed.

This massively parallel environment relies on thousands of CPUs to read data from storage. It supports almost all major data ingestion methods, including Avro, CSV, JSON and Parquet/ORC. One of the big advantages to BigQuery is its auto-replication across global data centers. This greatly minimizes the risk of service interruptions and downtime.

Snowflake: The platform offers a hybrid system that combines traits from traditional shared-disk and shared-nothing architectures. It uses a multi-cluster approach that auto-scales based on demand.

Because Snowflake has a built-in separation layer between storage and compute, it’s extremely fast and flexible. For instance, micro-partitioning accommodates structured, semi-structured and unstructured data, and the platform delivers an extensive set of connectors and drivers, including Spark, Python, .NET and Node.js. It supports most SQL commands, including DDL and DML. It’s possible to isolate data and groups, and even run different applications from a single source of data.

BigQuery vs. Snowflake: Comparing Key Tools

BigQuery: The data platform delivers a wealth of features and integrates with other Google data tools, including Vertex AI and Data Studio. BigQuery ML helps data scientists and data analysts build and run machine learning models on structured and semi-structured data using SQL. It imports and ingests most major file types using connectors and plugins, including data from SAP, Informatica and Confluent.

BigQuery Omni delivers multicloud analytics and connects seamlessly to AWS and Azure. BigQuery BI Engine delivers analytics on complex databases with sub-second response times. And BigQuery GIS supports geospatial data analysis, with support for most mapping and charting formats. In addition, the platform provides AutoML Tables, a codeless GUI that automates tasks and guides users to the best model, and ML features that support various approaches, including Logistic Regression, K-means and Naïve Bayes. It is ANSI SQL compliant.
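As a hedged example of how BigQuery ML models are built with SQL, the sketch below creates a logistic regression model from Python using the google-cloud-bigquery client; the dataset, table and column names are hypothetical.

```python
# Training a BigQuery ML logistic regression model from Python. Assumes the
# google-cloud-bigquery client, default credentials, and a hypothetical
# `mydataset.purchases` table with the columns shown.
from google.cloud import bigquery

client = bigquery.Client()

query = """
CREATE OR REPLACE MODEL `mydataset.churn_model`
OPTIONS (model_type = 'logistic_reg') AS
SELECT recency_days, purchase_count, churned AS label
FROM `mydataset.purchases`
"""

client.query(query).result()  # blocks until the model finishes training
```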

Snowflake: The platform handles just about every data science challenge an organization can throw at it. Common workloads include application building, collaboration, cybersecurity, data engineering, data lakes, data science and data warehousing. It is equipped to handle requirements across a wide swath of industries, offering a rich set of tools to handle every aspect of data ingestion, transformation and analytics, including unstructured data. A schema-on-read feature allows data scientists to build pipelines without the need to define a schema ahead of time.

Snowflake supports BI, analytics and machine learning at scale. The ML solution allows users to plug in a tool of choice, with native connectors and robust integrations from a broad ecosystem of partners. The platform also provides powerful tools for building data applications with autoscaling and native support for data structures.

Snowflake’s developer framework, Snowpark, supports a variety of programming languages and functions, including Scala, Python, Java and JavaScript. This code runs directly inside Snowflake and leverages its processing engine, with no separate system or data movement required.
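A minimal Snowpark for Python session might look like the sketch below, assuming the snowflake-snowpark-python package; the connection values are placeholders and the table and columns are invented.

```python
# A Snowpark for Python session. Assumes the snowflake-snowpark-python package;
# connection values are placeholders, and the table/columns are invented.
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col

connection_parameters = {
    "account": "<account_identifier>",
    "user": "<user>",
    "password": "<password>",
    "warehouse": "<warehouse>",
    "database": "<database>",
    "schema": "<schema>",
}
session = Session.builder.configs(connection_parameters).create()

# The filter and aggregation below are pushed down and executed inside
# Snowflake's processing engine rather than on the client.
orders = session.table("orders")
summary = orders.filter(col("amount") > 100).group_by("region").count()
summary.show()
```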

Recent Snowflake enhancements include a tool for ARM customers that makes it easier to leverage and manage the lifecycle of their data in a single location, using a single data set; and a data-driven framework for decision making that delivers applications directly to data, thus eliminating the need to move sensitive data between systems.

A new Snowflake Native Application Framework allows developers to build, monetize, and deploy applications on Snowflake Marketplace. Consumers can securely install and run these applications directly on their data inside Snowflake.

Also see: Real Time Data Management Trends

BigQuery vs. Snowflake: Interface Comparison

BigQuery: As part of Google Cloud, BigQuery offers a cloud console with a graphical user interface (GUI) that’s used to create and manage resources and run SQL queries. The console also offers visibility into various resources, including cloud storage.

Snowflake: The web interface is accessible through Chrome, Firefox, Safari, Opera and Edge browsers (though the company recommends Chrome). The platform delivers a single view into resources and functions. Snowsight, the vendor’s web interface, delivers SQL and other functionality.

BigQuery vs. Snowflake: Comparing Backup and Recovery

BigQuery: With data centers located all over the world and auto-replication always on, the risk of losing data is minimal. Google relies on a data backup and recovery framework that lets users query point-in-time snapshots of seven days of data changes.
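For example, a point-in-time snapshot can be queried with BigQuery’s time travel syntax. A minimal sketch follows; the table name is hypothetical and default credentials are assumed.

```python
# Querying a point-in-time snapshot with BigQuery time travel. The table name
# is hypothetical; default credentials are assumed.
from google.cloud import bigquery

client = bigquery.Client()

query = """
SELECT *
FROM `mydataset.orders`
  FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
"""

for row in client.query(query).result():
    print(row)
```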

Snowflake: The vendor doesn’t operate a dedicated backup system. Instead, it relies on a Fail-safe feature that can recover data from system failures for the prior seven days.

Also see: What is Data Visualization

BigQuery vs. Snowflake: Security and Compliance Comparison

BigQuery: The platform integrates with various Google security and privacy services, including Identity and Access Management (IAM) to handle roles and permissions. In addition, BigQuery offers both column level and row level security with controls over key functions, along with default encryption at rest and in motion. It includes strong governance and compliance features. Part of Google Cloud, it supports HIPAA, FedRAMP, PCI DSS, ISO/IEC, SOC 1, 2, 3, and others.

Snowflake: The company offers comprehensive security features, including private network access to all three clouds it uses, dynamic data masking and end-to-end encryption for data at rest and in motion. Snowflake also provides strong identity and access controls built on OAuth and SAML, along with fine-grained governance. Its Enterprise + tier offers HIPAA support, and it is PCI compliant. In addition, a Virtual Private Snowflake (VPS) option offers customer-dedicated virtual servers. It also supports FedRAMP, DSS, ISO/IEC, SOC 1, 2, 3 and others.

Also see: Data Analytics Trends 

BigQuery vs. Snowflake: Comparing Support

BigQuery: Google offers basic, standard, enhanced and premium support. Basic is included for all customers; it includes community support and online documentation. Other tiers are available with varying features and prices. Google’s knowledge base is extensive and there is a large and active online community.

Snowflake: The vendor offers professional service in the form of Service Engagements, which pair Snowflake domain experts with an organization’s IT staff. Support comes in two categories: Premier and Priority. Both offer an unlimited number of cases and tickets across AWS, Azure and Google Cloud, but the Priority level prioritizes responses and includes several features that aren’t available in the Premier tier. There’s also an extensive online knowledge base and a large and active online community.

Also see: Top Business Intelligence Software 

BigQuery vs. Snowflake: Price Comparison

BigQuery: Google charges for data storage, streaming inserts, and data queries. However, there’s no charge for loading and exporting data. Storage costs $0.02 per gigabyte per month, or $0.01 per gigabyte per month for long-term storage.

Streaming inserts cost $0.01 per 200 megabytes. Users have a choice of two data analysis pricing models: on-demand pricing and flat-rate pricing. The former runs $5 per terabyte, with the first terabyte per month free. Flat-rate pricing starts at $1,700 per month for a dedicated reservation of 100 slots. Google charges $4 per hour for 100 Flex slots.

Snowflake: The company has a fairly complex pricing model that depends on the platform (AWS, Azure or Google Cloud) and region. For instance, pricing for AWS in US West (Oregon) varies across four tiers. The Standard tier offers a complete SQL data warehouse, always-on encryption, federated authentication and customer-dedicated virtual warehouses at $40 per terabyte per month for on-demand storage, plus $2 per credit (a unit of resource measure) once an organization has used up its purchased capacity.

The Enterprise plan also costs $40 per terabyte per month for on-demand storage, plus $3 per credit, and includes numerous other features. A Business Critical (Enterprise Plus) plan runs $23 per terabyte per month for capacity storage, with a $4 cost per credit, and includes other advanced features, such as database failover and failback.

BigQuery vs. Snowflake: Conclusion

Both platforms deliver state-of-the-art data warehousing and science features, and they are both exceptionally powerful, flexible and scalable. Much of the decision depends on what vendors and platforms a business already relies on, and which of these two vendors is a better fit for storage and compute, including pricing.

BigQuery may have a slight edge for data mining and organizations that have variable workloads, while Snowflake has a slight advantage for organizations that require nearly unlimited automatic scaling.

Also see: Top AI Software 

Heroku vs. AWS: 2022 Cloud Platform Comparison https://www.eweek.com/cloud/heroku-vs-aws/ Wed, 25 May 2022 22:02:11 +0000

Today, choosing the right cloud provider – or group of cloud vendors – is critical. Yet performance and costs are merely starting points. Organizations also require agility and flexibility to expand into digital initiatives, ranging from IoT and robotics to digital twins and machine learning. The ideal cloud platform serves your unique business needs.

Two cloud platforms garnering attention are AWS and Heroku. Of course, the former is a household name. Amazon Web Services dates back to 2006 and now controls 41% of the cloud market. The cloud giant offers a vast array of capabilities and features. In contrast, Heroku, founded in 2007 and now part of Salesforce, slants heavily toward developer-centric cloud services.

It’s best to think of AWS as a general-purpose platform for the cloud – although it has many tools and capabilities that are appealing to developers and data scientists. Heroku’s focus is on harnessing and monetizing Salesforce data – although the cloud platform is suited for numerous other development purposes. Its strength lies in built-in support for a wide range of development languages.

Here’s a close look at how these two cloud providers stack up and what you need to know if you’re in the market for enterprise clouds that may involve some software development and DevOps support.

Also see: Top Cloud Companies

Heroku vs. AWS: Overall Comparison

AWS: Amazon Web Services is an end-to-end platform with a global presence. It offers more than 200 products, services, tools and resources for building and managing computing frameworks. This includes services designed for computing, storage, networking, database, analytics, machine learning, IoT and mobile. AWS supports .NET, Docker, Ruby, NodeJS, Go, PHP and Python, among others. It is ideal for companies spread across a diverse array of industries and spaces, including financial services, media and entertainment, retail, marketing and advertising, game tech and others.

On the back end, AWS includes web services such as clusters available through Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (Amazon S3), GPU availability, and various operating systems and configurations that optimize tasks ranging from software development to customer relationship management (CRM).

Amazon uses a robust set of APIs to deliver functionality to developers, though it’s usually necessary to manually configure settings. Typically, fees are based on a pay-as-you-go (consumption-based) model.
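As a small illustration of working with those APIs, the sketch below launches a single EC2 instance with the boto3 Python SDK; the AMI ID is a placeholder, and credentials are assumed to come from standard AWS configuration.

```python
# Launching a compute instance through the AWS API with the boto3 SDK.
# The AMI ID is a placeholder; credentials and region come from standard
# AWS configuration (environment variables, config files, or an IAM role).
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
)

print(response["Instances"][0]["InstanceId"])
```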

Heroku: The cloud provider delivers a cloud platform that tilts heavily toward a software development platform. It offers deep integration with Salesforce data, but it also gives developers a rich set of tools to build and deploy web applications. This includes language support for Ruby, Java, PHP, Python, Scala and Node.js. Heroku offers more than 200 third-party add-ons, 7,800 plus open-source Buildpacks, and upwards of 7,200 ready-to-deploy Heroku Buttons. The company’s focus is on building data-driven apps with fully managed data services.

As a result, Heroku is ideal for organizations looking to build apps without regard to the underlying infrastructure. Heroku automatically adapts itself to the needs and requirements of the software an organization builds.

Applications that run on the Heroku platform typically use a unique domain that routes HTTP requests to a specific application container. These containers, in turn, distribute the compute load across multiple servers that the company operates. Heroku services are hosted on Amazon’s EC2 platform.
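For a sense of the developer workflow, here is a minimal sketch of a Python web app of the kind Heroku deploys; the app itself is an invented example, and the Procfile line noted in the comments reflects the platform’s convention for declaring how to run it.

```python
# app.py: a minimal web app of the kind Heroku deploys. Heroku injects the
# listening port through the PORT environment variable; by convention a
# one-line Procfile (e.g., "web: gunicorn app:app") tells Heroku how to run it.
import os

from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "Hello from Heroku!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 5000)))
```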

Also see: Why Cloud Means Cloud Native

Heroku vs. AWS: Comparing Usability and Performance

AWS: Part of the appeal of AWS lies in its breadth of services and capabilities. It offers applications and tools for almost any task—something that Heroku cannot boast. But this comes at a potential cost. There’s a far greater reliance on manual configurations, which can tax organizations.

On a practical level, AWS offers numerous services, but the three most widely used are AWS EC2, AWS Elastic Beanstalk, and AWS Lambda. The first delivers building blocks that teams must assemble into functional infrastructure for a project; the second is a platform-as-a-service (PaaS) offering that deploys apps through AWS Cloud commands (it’s the service that most closely rivals Heroku); and the third is a serverless platform that runs code on demand but offers limited customization and control because the underlying environment is fully managed. Among user communities, there are some complaints about murky error codes and an inability to resolve issues. Highly technical aspects can also prove daunting.

Heroku: The vendor’s interface and framework have a distinct advantage over AWS: they are far more user friendly for accomplishing a wide array of tasks. The platform includes a powerful dashboard and command line interface (CLI). Yet one of Heroku’s greatest strengths is its built-in smart containers, called dynos, and its elastic runtime capability. Dynos support powerful orchestration, load balancing, security and logging through the elastic runtime environment.

Heroku also makes it easy to connect from Git, GitHub, or Docker, or through an API, in order to establish automated application delivery. It boasts single-click scalability with no downtime, and it is highly configurable and flexible through open-source Buildpacks and Buttons, which are available through the firm’s marketplace. Its broad language support is also a plus. On the downside, user communities report that dynos can be difficult to reach at times, and heavy computing projects don’t always run well on the platform.

Heroku vs. AWS: Comparing Flexibility 

AWS: The strength of AWS lies in its powerful infrastructure, fast provisioning and deployment and vast array of resources and tools. The cloud platform is ranked number one in market share for a reason: if you want to build a service at AWS—blockchain, content delivery, machine learning, high performance computing, you name it—there’s almost certainly a way to do it.

However, there’s a caveat. AWS tilts heavily toward its own tools and resources. While it’s possible to connect outside services and use various development languages, the manual nature of building out components can serve as an impediment. If your organization requires a highly flexible infrastructure, AWS is probably an excellent choice—provided that you have the resources to get the job done.

Heroku: The cloud development platform’s ease of use makes it ideal for organizations looking for simplicity along with flexibility. While it doesn’t match the infrastructure capabilities of AWS and doesn’t offer as broad a set of features and capabilities, it shines in areas such as collaboration, container support, open-source connectivity, language support and controls.

What’s more, the high vertical and horizontal scalability of the platform means that organizations can adapt and adjust to changes rapidly. An added bonus is that Heroku integrates well with AWS products and tools, meaning that it’s possible to extend its capabilities. It’s also extremely easy to set up, deploy and change.

Heroku vs. AWS: Security, Privacy and Compliance

AWS: Strong security controls are embedded in the platform, and developers and others have access to many other powerful tools, features and capabilities. AWS holds third-party validations for thousands of global compliance requirements. Encryption at rest and in motion is in place across facilities and geographies. AWS has a team of experts who continually monitor and respond to issues. Additional protections—everything from identity and access controls and application security to host and endpoint security—are available through the AWS Marketplace.

Heroku: With validated compliance built into the majority of the stack, Heroku delivers strong controls over data and privacy. The company performs regular audits for PCI, HIPAA, ISO, and SOC1, SOC2 and SOC3, though different products and tiers offer differing levels of compliance. The company offers robust security controls at every layer, from physical to application; it isolates customer apps and data; and it relies on strong authentication and encrypted connections along with numerous other protections to deliver an ultra-high level of security and data protection.

Also see: Cloud Native Winners and Losers 

Heroku vs. AWS: Comparing Support

AWS: Resources include e-books and technical documents, help from third-party experts, an extensive knowledge center and a variety of free and paid support tiers and options. Not surprisingly, AWS also has an enormous community that can provide input and assistance.

Heroku: The vendor offers a help center with an extensive knowledge base. It also provides a current status monitor on its website, along with other community resources and direct online support.

Heroku vs. AWS: Price Comparison

AWS: The pay-as-you-go model offers a high level of flexibility, and AWS includes a pricing calculator to help determine pricing for various components and solutions. Costs vary greatly depending on the configuration. However, an EC2 a1.medium instance with 1 vCPU and 2 GiB of memory costs about $0.0255 per hour on demand, or approximately $18 per month. A t3.2xlarge instance with 8 vCPUs and 32 GiB of memory runs about $243 per month.

Heroku: The company offers a Free and Hobby package at $0 per month. It has limited capabilities. Other plans vary from $25 per month to $250 per month or more, depending on numerous factors. Containers (Dynos) vary greatly depending on the configuration. For example, the Dyno standard-2x plan with 1 GB RAM runs about $50 per month. The Dyno performance-L with 14 GB RAM costs about $500 per month. Data service and various other offerings are also available.

Also see: How Database Virtualization Helps Migrate a Data Warehouse to the Cloud

Heroku vs. AWS: Conclusion

Both cloud platforms deliver remarkable capabilities and features. Overall, AWS offers a more powerful and flexible infrastructure, and, in some cases, superior automation capabilities coupled with greater control over resources. It’s ideal for organizations that have a DevOps team or developers available to customize and tweak features and settings. As a result, it’s often a bullseye for medium and large organizations.

On the other hand, Heroku takes a narrower focus on code rather than infrastructure. It supports a broad array of development languages, including modern open-source languages. This makes it more appealing to small and medium-sized companies with limited development teams. Heroku offers a wealth of tools and a user-friendly environment that makes it easy to provision and deploy services, sometimes within seconds. It also integrates well with AWS products and offers superior connectivity to Salesforce.
