As Big Data Booms, SQL Makes A Comeback

Oracle

The death of SQL has been greatly exaggerated.

With all apologies to Mark Twain, the demise of Structured Query Language (SQL), having been forecasted for a while, is far from a done deal. In fact, the opposite is closer to the truth — SQL (pronounced “sequel”) is emerging as a standard data access method for the Big Data and NoSQL technologies that were supposed to represent its Waterloo.

“A couple years ago everybody was like, no, we don’t need SQL,” says Andy Mendelsohn, Executive Vice President, Oracle Database Server Technologies, expounding on current trends in data management during a recent webcast. “Now, after a few years, everyone has figured out, hey, we need SQL, it’s really critical.”

Structured Query Language emerged in the late 1970s as the standard data access method, both de facto and de jure, for relational databases. Because those databases store data logically and consistently in tables of rows and columns, they rapidly proliferated as the most effective enterprise data management systems.

Over the past 40+ years, SQL has undergone continuous improvement through vendor participation, first in the American National Standards Institute (ANSI) and today in the joint technical committee (ISO/IEC JTC 1) of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). Although it is still called Structured Query Language, today’s SQL can also handle fully unstructured information (text) and semi-structured information (documents in XML and JSON formats), and it provides unified access to data and operations as varied as pictures, text, spatial and graph data, and data mining.

Fast-forward to today’s Big Data initiatives: efforts to leverage the large amount of structured, semi-structured, and unstructured data being generated, especially in connection with Internet-oriented endeavors such as e-commerce, social media, and the Internet of Things (IoT). Organizations’ ability to get their arms around the Big Data opportunity was hindered by an explosion of new and different methods and languages for querying that data, and SQL took a back seat. It’s not for nothing that one popular class of Big Data databases is known as NoSQL.

Users soon realized that Big Data technology was something of a blunt instrument. “You can store bits in any kind of database,” says Mendelsohn. “But you can’t always do intelligent things with those bits.”

That’s where SQL shines, having been honed for years by database experts, Oracle chief among them. “SQL is a standard and all the vendors in this space follow that standard,” Mendelsohn points out. “That’s a big thing you find missing in the NoSQL world.”

For example, a popular data format being used in certain NoSQL database platforms is JSON, which stands for JavaScript Object Notation. Oracle has worked to extend the SQL standard to support JSON operations, “so people can do intelligent things with JSON,” Mendelsohn says.
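To make the idea of “doing intelligent things with JSON” concrete, here is a minimal sketch of querying JSON documents with ordinary SQL. It is not Oracle’s implementation: it assumes Python’s built-in sqlite3 module and SQLite’s json_extract function as a stand-in engine, and the orders table and sample documents are hypothetical.

```python
import json
import sqlite3

# Hypothetical sample documents; they are not taken from the article.
orders = [
    {"id": 1, "customer": {"name": "Ada", "country": "UK"}, "total": 120.0},
    {"id": 2, "customer": {"name": "Linus", "country": "FI"}, "total": 80.0},
    {"id": 3, "customer": {"name": "Grace", "country": "UK"}, "total": 45.5},
]

# Assumption: SQLite (with its JSON functions available) stands in for a
# SQL engine that understands JSON. Each document is stored as plain text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (doc TEXT)")
conn.executemany(
    "INSERT INTO orders (doc) VALUES (?)",
    [(json.dumps(order),) for order in orders],
)


def total_by_country(connection):
    """Aggregate order totals per country by reaching into each JSON document."""
    return connection.execute(
        """
        SELECT json_extract(doc, '$.customer.country') AS country,
               SUM(json_extract(doc, '$.total'))       AS total
        FROM orders
        GROUP BY country
        ORDER BY country
        """
    ).fetchall()


totals = total_by_country(conn)
print(totals)
```

The point of the sketch is that filtering, grouping, and aggregation stay in familiar SQL, while path expressions such as '$.customer.country' reach into the nested documents.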

The most popular Big Data computing platform is Hadoop, which is based on a programming framework called MapReduce. “A few years ago the Hadoop community was saying MapReduce is the developer API that’s going to replace relational databases — you don’t need SQL anymore,” Mendelsohn remarks. But a funny thing happened on the way to MapReduce’s dominance: complexity.

It turns out that very few developers have the skill sets required to make effective use of MapReduce. So suddenly the Hadoop community has discovered the benefits of SQL. “Everybody and his brother is trying to build out a SQL engine that works on data stored in HDFS [Hadoop Distributed File System],” Mendelsohn says.
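The complexity gap can be illustrated with a toy word count, the classic MapReduce example. The sketch below is a single-process imitation of the map, shuffle, and reduce phases in plain Python, not Hadoop’s API, set beside the equivalent one-line SQL query run through Python’s built-in sqlite3 module. The sample log lines and all function names are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical input records; they are not taken from the article.
lines = ["error disk full", "warning disk slow", "error network down"]


# --- MapReduce style: the developer writes every phase by hand ---
def map_phase(records):
    """Emit a (word, 1) pair for every word in every record."""
    for record in records:
        for word in record.split():
            yield word, 1


def shuffle(pairs):
    """Group the emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the grouped values for each key."""
    return {key: sum(values) for key, values in groups.items()}


mapreduce_counts = reduce_phase(shuffle(map_phase(lines)))

# --- SQL style: state the result you want and let the engine plan the work ---
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT)")
conn.executemany(
    "INSERT INTO words (word) VALUES (?)",
    [(word,) for line in lines for word in line.split()],
)
sql_counts = dict(conn.execute("SELECT word, COUNT(*) FROM words GROUP BY word"))

print(mapreduce_counts == sql_counts)
```

Both halves produce the same counts. The difference is that the SQL version is declarative, which is a large part of why SQL engines over HDFS became attractive to developers who were not MapReduce specialists.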

Because Oracle is the leader in database technology, it is evolving its SQL technology to incorporate and integrate Big Data structures into enterprise data storage environments. To that end, Oracle recently introduced a product called Big Data SQL, now a part of its Oracle Big Data Appliance – a hardware and software solution for Big Data.

And while SQL for Hadoop, in itself, is an admirable goal, Oracle has something more in mind. “We’re now delivering Oracle SQL, the same SQL our customers have been using for years — full functional Oracle SQL dialect, no compromise, full Oracle query optimization, full Oracle parallel query algorithms — not only against data stored in Oracle database, but now we can integrate data that’s stored in Oracle databases, NoSQL, and Hadoop into a single SQL query,” Mendelsohn says.
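As a loose analogy for integrating several data stores in a single SQL query, the sketch below joins tables that live in two separate databases. It does not use Oracle Big Data SQL: it assumes Python’s built-in sqlite3 module and SQLite’s ATTACH DATABASE statement, and the clickstore database, the tables, and the rows are hypothetical stand-ins for a relational store and an external store.

```python
import sqlite3

# Assumption: two in-memory SQLite databases stand in for two separate stores.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS clickstore")

# "main" plays the role of the relational system of record.
conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO main.customers (id, name) VALUES (?, ?)",
    [(1, "Ada"), (2, "Linus")],
)

# "clickstore" plays the role of the external, high-volume data store.
conn.execute("CREATE TABLE clickstore.clicks (customer_id INTEGER, page TEXT)")
conn.executemany(
    "INSERT INTO clickstore.clicks (customer_id, page) VALUES (?, ?)",
    [(1, "/home"), (1, "/pricing"), (2, "/home")],
)

# One query spans both databases; the caller never moves data between them.
clicks_per_customer = conn.execute(
    """
    SELECT c.name, COUNT(k.page) AS clicks
    FROM main.customers AS c
    JOIN clickstore.clicks AS k ON k.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()

print(clicks_per_customer)
```

A real deployment would also have to push work down to each store and optimize across them, which this toy example does not attempt. It only shows what a single query over more than one store looks like from the user’s side.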

That’s important because most enterprise customers don’t see it as a contest between Big Data, NoSQL, and relational database technology. “It’s not an ‘either/or’ thing, it’s an ‘and’ thing,” Mendelsohn maintains. “They don’t want information silos; they want information integration.”

Other Benefits

In fact, many enterprise customers are running ahead of the market. “When customers use Hadoop and NoSQL in our accounts they’re very often building systems that are combinations of these different technologies working together in interesting ways,” Mendelsohn says. So Oracle is simply following their lead. “We’re building innovations that span these technologies so our customers can build out the systems they want.”

Moving Oracle’s highly developed SQL technology into the Big Data environment brings along with it all the sophisticated features and functions that have been added over the years, not the least of which have to do with one of Big Data’s most glaring limitations: security.

“Another nice trick here is we can extend the very mature database security mechanisms and policies in Oracle databases to now apply to data that’s in these other databases as well,” Mendelsohn points out. So customers can limit users’ access to critical information by enforcing the policies and rules that are available in Oracle databases for security authentication and authorization, extending that to data in Hadoop and NoSQL.

As for the SQL standard, it’s interesting to note that some members of the NoSQL community are hedging their bets by evolving the reference in the name to the more inclusive “Not Only SQL.”

So it’s not so much a return to significance for SQL as recognition of its continuing relevance. “It’s always been important,” Mendelsohn asserts, “but now it’s back in fashion.”

Watch a replay of Andy Mendelsohn’s “Emerging Trends in Database” webcast here.

For more on SQL and Big Data, check out the Oracle OpenWorld 2014 agenda, where there are dozens of sessions on how these technologies can help advance an enterprise IT strategy.

Source: iStockphoto