Big Data is neither a set of fancy new technologies nor a huge database; it is a full, step-by-step process pipeline.
Big Data is a big phenomenon that draws big attention. However, people are getting confused about the difference between the tools and the knowledge behind what everyone now calls Big Data. Data analysis, and Big Data in general, is not Hadoop or SQL or Python or R. These are just tools that help you analyze data. Getting Big Data right is about more than the size of your database or the fancy tools; it is the full step-by-step process pipeline, from collecting and storing data to analyzing and visualizing it, in order to extract business insight and foresight.
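To make the pipeline idea concrete, here is a minimal sketch of the four stages (collect, store, analyze, visualize) using only the Python standard library. Everything in it is an assumption made for illustration: the `collect` function returns hypothetical hard-coded records in place of a real crawler or data provider, the `sales` table and its columns are invented, and an in-memory SQLite database stands in for whatever storage you actually use.

```python
import sqlite3


def collect():
    """Stand-in for a crawler or data provider: returns hypothetical records."""
    return [
        ("north", 120.0),
        ("south", 80.0),
        ("north", 60.0),
        ("east", 40.0),
    ]


def store(records):
    """Load the collected records into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
    conn.commit()
    return conn


def analyze(conn):
    """Aggregate along one dimension (region) rather than counting rows."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
    return dict(rows)


def visualize(totals, unit=20.0):
    """Render a crude text bar chart: one '#' per `unit` of total amount."""
    lines = []
    for region, total in sorted(totals.items()):
        bar = "#" * int(total // unit)
        lines.append(f"{region:<6} {bar} {total:.0f}")
    return "\n".join(lines)


def run_pipeline():
    """Run collect -> store -> analyze and return the per-region totals."""
    conn = store(collect())
    try:
        return analyze(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    print(visualize(run_pipeline()))
```

Notice that no single stage here is "the Big Data part." Swapping SQLite for Hadoop, or the text chart for a dashboard, changes one function and leaves the shape of the pipeline intact.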
Make your Big Data "fit-for-purpose." Data analysis is not Hadoop; it is much more than the technical tools. Data analysis is a process that starts by connecting to databases or a data service provider, or by building the database with a crawler, and then analyzes and visualizes the data. Data storage is only one part of the whole Big Data business logic. Too many people seem to think that if you aren't using Hadoop, you aren't doing Big Data; many ask whether Big Data = Hadoop and vice versa. That is why you have to put the emphasis on "fit-for-purpose." Big Data is about data collection (crawling, for example) + data storage + data analysis and visualization. For the analysis part, an RDBMS can serve as the data warehouse once ETL has been run over the distributed data sources. Data analysis, as the final stage, is about analyzing the dimensions of the data, not its sheer amount. Selecting enough samples reduces random error so that the result is stable.
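The point about sample size and random error can be checked with a small simulation. The sketch below is illustrative only: the population is synthetic (normally distributed values with an assumed mean of 100 and standard deviation of 15), and `spread_of_sample_means` is a hypothetical helper name. It draws many random samples of a given size and measures how much the sample mean varies from one draw to the next.

```python
import random
import statistics


def spread_of_sample_means(population, sample_size, trials=200, seed=0):
    """Standard deviation of the sample mean across repeated random samples."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(trials)
    ]
    return statistics.stdev(means)


def make_population(size=10_000, seed=42):
    """Synthetic population: assumed normal with mean 100 and std dev 15."""
    rng = random.Random(seed)
    return [rng.gauss(100.0, 15.0) for _ in range(size)]


if __name__ == "__main__":
    population = make_population()
    for n in (10, 100, 1000):
        spread = spread_of_sample_means(population, n)
        print(f"sample size {n:>4}: spread of the mean = {spread:.2f}")
```

The spread shrinks roughly in proportion to the square root of the sample size, so a sample that is large enough gives a stable result without needing every record.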
There are pros and cons to the various approaches to Big Data. Big Data is "bigger than the size of your database." Solving a problem and optimizing the algorithms often do require collecting a huge amount of data, and the more data you collect, the better the results tend to be. However, reality does not let you have all the data you want, so there will be times when you need to be selective and choose the best proxy available. You should also recognize that every approach has trade-offs, and that users would rather have an answer that is 80% right than wait for a perfect prediction. The criteria for selecting among platforms and tools are speed and scalability, although it is sometimes difficult to make a real apples-to-apples comparison.
The five Vs of Big Data (volume, volatility, variety, veracity, and value) are its defining characteristics. Still, Big Data is a means to an end, which is achieving business value, not the end itself. The goal of getting Big Data right is to ensure that data, as the lifeblood of modern business, nourishes the whole body of the business and keeps it fit, energetic, and resilient enough to adapt to the accelerating pace of digital dynamics.