Big data refers to collections of data sets so large and complex, spanning both structured and unstructured data, that they are difficult to process with traditional database and software techniques.
The challenges include capturing, curating, storing, searching, sharing, and analyzing the data, as well as protecting privacy.
Types of big data?
Big data falls into three types:
Structured data: Data that can be stored and processed in a fixed format is called structured data. Data stored in a relational database management system (RDBMS) is one example. Structured data is easy to process because it has a fixed schema, and Structured Query Language (SQL) is often used to manage it.
Semi-structured data: Semi-structured data has no formal data model, such as a table definition in a relational DBMS, but it still has organizational properties, like tags and other markers that separate semantic elements, which make it easier to analyze. HTML files and JSON documents are examples of semi-structured data.
Unstructured data: Data that has no predefined form, does not fit neatly into RDBMS tables, and is hard to analyze until it is transformed into a structured format is called unstructured data.
Unstructured data files typically contain text and multimedia content with no particular internal structure. Examples include e-mails, videos, audio files, photos, and many types of business documents.
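To make the three types concrete, here is a minimal Python sketch that handles one small example of each. The table name, JSON fields, and e-mail text are all made up for illustration; it only shows how much structure each type gives you before you can analyze it.

```python
import json
import re
import sqlite3

# Structured: a fixed schema in a relational table, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 20.0), ("bob", 35.5), ("alice", 4.5)],
)
structured_total = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE customer = ?", ("alice",)
).fetchone()[0]

# Semi-structured: no table definition, but keys mark the fields,
# and individual records may omit some of them (here, "qty").
doc = json.loads(
    '{"customer": "alice", "items": [{"sku": "A1", "qty": 2}, {"sku": "B7"}]}'
)
semi_structured_qty = sum(item.get("qty", 1) for item in doc["items"])

# Unstructured: free text; structure has to be extracted before analysis.
email = "Hi, my order 1042 arrived late. Please refund order 1042 or replace it."
unstructured_order_ids = sorted(set(re.findall(r"order (\d+)", email)))

print(structured_total, semi_structured_qty, unstructured_order_ids)
```

The structured case needs only a query, the semi-structured case needs defaults for missing fields, and the unstructured case needs a pattern that guesses where the useful information is.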
Examples of big data?
Every day we upload huge volumes of data, and the total keeps growing rapidly. An often-cited estimate is that 90% of the world's data was created in the last two years.
Walmart handles more than 1 million customer transactions every hour.
Facebook stores, accesses, and analyzes 30+ Petabytes of user generated data.
More than 230 million tweets are created every day.
More than 5 billion people call, text, tweet, and browse on mobile phones worldwide.
YouTube users upload 48 hours of new video every minute of the day.
Amazon processes 15 million customer clickstream records per day to recommend products.
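These figures are easier to compare once they are converted to a common time scale. The short Python sketch below does only that arithmetic, using the numbers quoted in the list above; the variable names are illustrative.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR
MINUTES_PER_DAY = 24 * 60

# 1 million transactions per hour, expressed per second.
walmart_tx_per_second = 1_000_000 / SECONDS_PER_HOUR

# 230 million tweets per day, expressed per second.
tweets_per_second = 230_000_000 / SECONDS_PER_DAY

# 48 hours of video uploaded every minute, expressed per day.
youtube_hours_per_day = 48 * MINUTES_PER_DAY

print(round(walmart_tx_per_second))  # about 278 transactions per second
print(round(tweets_per_second))      # about 2662 tweets per second
print(youtube_hours_per_day)         # 69120 hours of video per day
```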
Applications of big data?
Big data is now used almost everywhere, and new technologies emerge every month or two.
Smarter Healthcare: Using petabytes of patient data, an organization can extract meaningful information and build applications that predict a patient’s deteriorating condition in advance.
Telecom: Telecom companies collect information, analyze it, and use it to solve different problems. With big data applications, they have been able to significantly reduce data packet loss, which occurs when networks are overloaded, and so provide a seamless connection to their customers.
Retail: Retail has some of the tightest margins and is one of the greatest beneficiaries of big data. Its main value in retail is understanding consumer behavior. Amazon’s recommendation engine, for example, suggests products based on a consumer’s browsing history.
Traffic control: Traffic congestion is a major challenge for many cities globally. Effective use of data and sensors will be a key to managing traffic better as cities become increasingly densely populated.
Manufacturing: Analyzing big data in the manufacturing industry can reduce component defects, improve product quality, increase efficiency, and save time and money.
Search Quality: Every time we search for information on Google, we also generate data for it. Google stores this data and uses it to improve its search quality.
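The retail example above, recommending products from browsing history, can be sketched in a few lines of Python. This is only a toy illustration that counts which products are viewed together; it is not Amazon's actual algorithm, and the sessions and the `recommend` function are hypothetical.

```python
from collections import Counter
from itertools import permutations

# Hypothetical browsing sessions: each is the set of products one visitor viewed.
sessions = [
    {"laptop", "mouse", "laptop bag"},
    {"laptop", "mouse"},
    {"phone", "phone case"},
    {"laptop", "laptop bag"},
    {"mouse", "keyboard"},
]

# Count how often each ordered pair of products appears in the same session.
co_views = Counter()
for viewed in sessions:
    for a, b in permutations(sorted(viewed), 2):
        co_views[(a, b)] += 1


def recommend(product, top_n=2):
    """Return up to top_n products most often viewed together with `product`."""
    scores = {b: n for (a, b), n in co_views.items() if a == product}
    # Sort by count (descending), then by name, so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


print(recommend("laptop"))
print(recommend("phone"))
```

With real clickstream data the same idea has to run over millions of sessions, which is where distributed storage and processing come in.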
Challenges of big data?
Here are a few challenges that come with big data:
Data Quality: The problem here is the fourth V, veracity (after volume, velocity, and variety). The data is often messy, inconsistent, and incomplete. Dirty data costs companies in the United States an estimated $600 billion every year.
Discovery: Finding insights in big data is like finding a needle in a haystack. Analyzing petabytes of data to find patterns and insights is very difficult, even with extremely powerful algorithms.
Storage: The more data an organization has, the more complex the problems of managing it become. The question that arises is “Where do we store it?” We need a storage system that can easily scale up or down on demand.
Analytics: With big data, most of the time we do not know in advance what kind of data we are dealing with, which makes analyzing it even more difficult.
Security: Because the data is so large, keeping it secure is another challenge. This includes user authentication, restricting access on a per-user basis, recording data access histories, proper use of data encryption, and so on.
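As a small illustration of the data quality challenge, the Python sketch below scans a handful of made-up records and counts how many are missing values, invalid, or duplicated. The field names and the rules (for example, the allowed age range) are assumptions chosen for the example, not a standard.

```python
# Hypothetical customer records with typical "dirty data" problems.
records = [
    {"id": 1, "email": "a@example.com", "age": 34},
    {"id": 2, "email": "", "age": 29},               # missing e-mail
    {"id": 3, "email": "c@example.com", "age": -5},  # impossible age
    {"id": 1, "email": "a@example.com", "age": 34},  # duplicate id
    {"id": 4, "email": "d@example.com", "age": None},  # missing age
]


def quality_report(rows):
    """Classify each row as duplicate, missing, invalid, or clean."""
    seen_ids = set()
    report = {"missing": 0, "invalid": 0, "duplicate": 0, "clean": 0}
    for row in rows:
        if row["id"] in seen_ids:
            report["duplicate"] += 1
            continue
        seen_ids.add(row["id"])
        if not row["email"] or row["age"] is None:
            report["missing"] += 1
        elif not 0 <= row["age"] <= 120:
            report["invalid"] += 1
        else:
            report["clean"] += 1
    return report


print(quality_report(records))
```

Here only one of the five records is clean, which shows how quickly quality problems can add up before any analysis starts.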