The Characteristics of NoSQL ~ Future of CIO

Friday, May 1, 2015

11:29 PM Pearl Zhu

NoSQL means NOT ONLY SQL.
Big Data is now well established in the market. As data volumes grow and systems need to scale out, traditional databases run into flexibility problems. Hence NoSQL has emerged as “the next generation databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable.” (NoSqlDatabase.Org). Is the move from SQL to NoSQL an evolutionary or a revolutionary journey? What are the characteristics of NoSQL databases?

Historically, databases have gone through several evolutionary shifts. In the 1990s, there was a trend toward consolidation in the relational direction. At first, relational databases were targeted at query processing for reporting. The usual style was to run transaction systems on hierarchical or network databases and then extract the data into an RDBMS for more flexible reporting. Then the RDBMS vendors drastically improved performance on the transaction side, and many organizations moved everything to relational. Next, businesses tried to manage other content, such as Word documents and other unstructured sources. This caused a problem, and a number of "content management" systems were developed to solve it. Then came the Internet explosion, the drop in the price of storage and processing, and the emerging trend of Big Data with its five “V” characteristics. So you have to be open to using multiple technologies in an organization, driven by the requirements of each application: look at each application and match its requirements to the technology.

Data Distribution: First, the term "distributed database" is closely associated with NoSQL, because most NoSQL databases use partitioning, replication, or both. So, in a certain way, most NoSQL databases are distributed; this is known as data distribution. Some RDBMSs also offer it, but when it comes to Big Data (volume, variety, velocity), relational databases are often not adequate, not only in terms of read speed (because of their relational nature and all the joins), but also in terms of scalability and fault tolerance. Many NoSQL clusters, for example, have no single point of failure: there is no master node, and if one of the machines fails, the cluster automatically shifts its work to another machine until the problem is resolved.
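To make partitioning and replication concrete, here is a minimal Python sketch. It is not how any particular NoSQL product works: `ToyCluster`, the node names, and the "mark a node as down" mechanism are all hypothetical, and the stores are plain in-memory dictionaries. Each key is hashed to pick a starting node (partitioning), the value is copied to the next node as well (replication), and a read simply skips a node that has failed.

```python
import hashlib


class ToyCluster:
    """Hypothetical in-memory cluster: hash partitioning plus replication."""

    def __init__(self, node_names, replicas=2):
        self.nodes = {name: {} for name in node_names}  # node -> local key/value store
        self.order = sorted(node_names)                 # fixed ordering of the nodes
        self.replicas = replicas                        # copies kept for each key
        self.down = set()                               # nodes currently marked as failed

    def owners(self, key):
        # A stable hash means a key always maps to the same starting node.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        start = int(digest, 16) % len(self.order)
        return [self.order[(start + i) % len(self.order)]
                for i in range(self.replicas)]

    def put(self, key, value):
        # Write the value to every live replica that owns the key.
        for name in self.owners(key):
            if name not in self.down:
                self.nodes[name][key] = value

    def get(self, key):
        # Read from the first live replica that holds the key.
        for name in self.owners(key):
            if name not in self.down and key in self.nodes[name]:
                return self.nodes[name][key]
        raise KeyError(key)


cluster = ToyCluster(["node-a", "node-b", "node-c"], replicas=2)
cluster.put("user:42", {"name": "Ada"})

# Simulate the failure of the first node that owns the key.
failed_node = cluster.owners("user:42")[0]
cluster.down.add(failed_node)

value_after_failure = cluster.get("user:42")  # served by the surviving replica
print(value_after_failure)
```

Real systems add much more (rebalancing, repairing a node when it comes back, consistency levels), but the read still succeeding after a node failure is the point the paragraph above makes.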

Scalability: Put simply, scalability is the ability of a system to accommodate growth in demand (such as storage and processing). Most NoSQL databases are storage-scalable by nature, because they provide partitioning, replication, and fault tolerance with minimal human intervention. Besides the data distribution that gives read scalability (for example, dividing requests among multiple nodes in the cluster), most NoSQL databases offer another very important type of distribution: processing parallelism. This means the processing of a query can be distributed across multiple nodes, with the partial results merged at the end.
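The "distribute the query, then merge" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows, `partial_totals`, `merge`, and `total_by_city` are made-up names, and threads on one machine stand in for separate nodes in a cluster.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands for the rows held by one node.
partitions = [
    [{"city": "Oslo", "amount": 10}, {"city": "Rome", "amount": 5}],
    [{"city": "Oslo", "amount": 7}],
    [{"city": "Rome", "amount": 3}, {"city": "Lima", "amount": 8}],
]


def partial_totals(rows):
    """Work done locally on one node: sum the amounts per city."""
    totals = {}
    for row in rows:
        totals[row["city"]] = totals.get(row["city"], 0) + row["amount"]
    return totals


def merge(partials):
    """Final step: combine the per-node results into one answer."""
    merged = {}
    for partial in partials:
        for city, amount in partial.items():
            merged[city] = merged.get(city, 0) + amount
    return merged


def total_by_city(parts):
    # Each partition is processed in parallel; threads simulate the nodes.
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        partials = list(pool.map(partial_totals, parts))
    return merge(partials)


result = total_by_city(partitions)
print(result)
```

Adding a node here just means adding another partition to the list, which is why this style of processing scales out rather than up.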

Different types of NoSQL databases: NoSQL isn't here to replace RDBMS; there is a place for each of them. However, there are certain applications where an RDBMS is not adequate: its relational nature can make reading data difficult and slow, infrastructure and licensing costs can be high, and distribution and scalability are hard to achieve. NoSQL databases and Hadoop are designed to operate on a big cluster of commodity hardware, so they are more cost-efficient. NoSQL’s model-less architecture is much more flexible than a relational one, meaning you can make changes to the structure without having to rethink the whole model. That’s the main "selling point," because it makes application development and analysis much easier. There are four major types of NoSQL databases commonly used in the industry these days. These are:
(1) Document-oriented databases
(2) Column-family databases
(3) Key-value databases
(4) Graph databases
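As a rough illustration of how these four models differ, the sketch below shapes some made-up data in each style using plain Python structures. No real database is involved, and all keys, field names, and the `neighbors` helper are hypothetical. It also shows the flexibility mentioned above: one document gains a new field without any change to the others.

```python
# (1) Document-oriented: each record is a self-contained, possibly nested document.
documents = {
    "user:1": {"name": "Ada", "emails": ["ada@example.com"]},
    "user:2": {"name": "Bob"},
}
# Flexible structure: add a field to one document only, with no migration step.
documents["user:2"]["twitter"] = "@bob"

# (2) Column family: row key -> column family -> columns.
column_families = {
    "user:1": {
        "profile": {"name": "Ada"},
        "contact": {"email": "ada@example.com"},
    },
}

# (3) Key-value: the store only sees a key and an opaque value.
key_value = {"session:9f3": b"serialized-session-bytes"}

# (4) Graph: relationships are first-class data, stored here as labeled edges.
edges = [("Ada", "FOLLOWS", "Bob"), ("Bob", "FOLLOWS", "Cy")]


def neighbors(person, relation):
    """Follow one kind of relationship from a given node."""
    return [dst for src, rel, dst in edges if src == person and rel == relation]


print(neighbors("Ada", "FOLLOWS"))
print(sorted(documents["user:2"]))
```

Which shape fits best depends on how the application reads its data: whole records at once, a few columns over many rows, simple lookups by key, or traversals along relationships.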

DOMBA (Distributed Objects Management Based Articulation) is a cluster-oriented NoSQL database that combines graph and document-oriented database features. DOMBA allows users to distill smaller units of related data without using complex structured query languages. It is a human-oriented NoSQL database: it lets you write queries that distill the relationships in a graph structure. You can use DOMBA to manage Big Data, because your data will grow on a daily basis, and DOMBA is scalable and provides eventual consistency to work with.
– DOMBA is a schemaless NoSQL database. In reality, all objects have a certain identity and a relationship with another object.
– DOMBA allows you to explore these relationships in a human-oriented way.
– DOMBA adds ACID features within the CAP theorem.
– DOMBA uses Data Recharging.

NoSQL means NOT ONLY SQL. NoSQL emerged to solve the data processing problems caused by Big Data. Organizations are now much more willing to work with a multitude of DBMSs that each meet a particular set of requirements. This requires more skilled people to develop and maintain the systems. For transaction data, you can still use an RDBMS. With all the variants of NoSQL, you just have to look at them in detail to see which might work for you.
