With IoT, information can be gathered faster and more accurately, which leads to better results. Companies such as Facebook, YouTube, Google, and Amazon all collect and analyze user data to improve their platforms, from predicting which users are more likely to make a purchase to building models for new technologies they are developing. Such data has also been used to save lives and reduce damage by predicting storms. It is important to understand that data is pointless to collect unless it is used with purpose and meaning, which leads to the concept of the five V’s.
Big data has five characteristics, Volume, Velocity, Variety, Veracity, and Value, which are often referred to as the five V’s. The first of the V’s, Volume, refers to the sheer amount of information. The amount of data required to count as big data is constantly changing: as new technologies emerge, the definition changes with them. In 1999, one gigabyte (1 GB) of data was considered big data; in modern times, the volume must be large enough that a normal data management tool cannot process it efficiently.
The second V, Velocity, is the speed at which data arrives. All of these figures, features, and statistics must be processed, managed, and eventually put to some purpose. Companies such as Facebook and Twitter gather large quantities of information daily through IoT devices that are linked to the cloud. That data is eventually processed and used to help grow the company or develop new applications.
Variety, the third V, refers to the type of information. There are three types of data: structured, semi-structured, and unstructured. These can come in many forms, ranging from Excel spreadsheets to log files and even images. Structured data is organized according to a fixed format and is generally stored in the tables of a relational database queried with SQL, whereas its counterpart, unstructured data, does not conform to any predefined format and generally cannot be stored in rows and columns. Unstructured data ranges in form from text to audio. The final type, semi-structured data, follows certain rules but does not have a fixed schema. Semi-structured types include, but are not limited to, XML, JSON, data held in non-relational databases, and log files.
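To make the three types more concrete, the following is a minimal Python sketch rather than a definitive classification. The sample records, field names, and variable names are all invented for illustration. It shows a structured table in which every row shares the same columns, a semi-structured JSON document in which each record carries its own keys, and an unstructured piece of free text.

```python
import csv
import io
import json

# Structured: every row follows the same fixed set of columns (hypothetical data).
structured_csv = "user_id,country,purchases\n1,US,3\n2,DE,5\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: JSON has keys and nesting, but records need not share a schema.
semi_structured = json.loads(
    '[{"user_id": 1, "tags": ["sports"]},'
    ' {"user_id": 2, "device": {"type": "sensor", "fw": "1.2"}}]'
)

# Unstructured: free text with no fields, rows, or columns at all.
unstructured = "Loved the product, shipping was slow though."

# Check that the structured rows really do share one set of columns.
columns = set(rows[0].keys())
all_same_columns = all(set(r.keys()) == columns for r in rows)

# List the keys of each JSON record to show that they differ between records.
json_keys = [sorted(rec.keys()) for rec in semi_structured]

# Free text can only be handled with general tools, such as splitting on whitespace.
word_count = len(unstructured.split())

print(all_same_columns, json_keys, word_count)
```

The structured rows can be loaded directly into a table, while the JSON records would each need their keys inspected before use, and the free text would need further processing before any fields could be extracted from it.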
Veracity, the fourth V, is the accuracy and trustworthiness of the data. In big data, removing biased, inconsistent, or duplicate records results in a better data set. Doing so improves the accuracy of the data and therefore makes it more valuable. With IoT applications, less cleanup may be needed because no human data entry is involved. According to an IBM estimate, poor data quality costs businesses in the United States about $3.1 trillion annually.
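As a rough illustration of this kind of cleanup, the sketch below filters a small list of sensor readings. Everything in it is an assumption made for the example: the record layout, the `clean_readings` helper, and the plausible temperature range are hypothetical and not taken from any particular tool.

```python
# Hypothetical plausibility limits for a temperature sensor, in degrees Celsius.
MIN_TEMP_C = -50.0
MAX_TEMP_C = 60.0


def clean_readings(readings):
    """Drop implausible readings, then drop repeats of the same sensor and timestamp."""
    seen = set()
    cleaned = []
    for reading in readings:
        # Inconsistent data: a value outside the plausible range is discarded.
        if not (MIN_TEMP_C <= reading["temp_c"] <= MAX_TEMP_C):
            continue
        # Duplicate data: keep only the first reading per sensor and timestamp.
        key = (reading["sensor_id"], reading["timestamp"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(reading)
    return cleaned


# Invented sample data: one duplicate and one implausible value.
raw = [
    {"sensor_id": "a", "timestamp": 1, "temp_c": 21.5},
    {"sensor_id": "a", "timestamp": 1, "temp_c": 21.5},   # duplicate
    {"sensor_id": "b", "timestamp": 1, "temp_c": 999.0},  # implausible
    {"sensor_id": "b", "timestamp": 2, "temp_c": 19.0},
]

cleaned = clean_readings(raw)
print(len(raw), "readings in,", len(cleaned), "readings kept")
```

Real pipelines usually need domain-specific rules, and detecting bias is much harder than detecting duplicates, so this only covers the simplest cases mentioned above.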
The final V, Value, is the purpose the data serves. Companies may use their data to improve their technology or to reduce costs. A company such as Tesla uses its data to improve its automated driving features, whereas a medical company such as Pfizer may use previous and new data to create a new vaccine. With the continued growth of big data, we gain new insights into otherwise unforeseeable events and new technologies, and we improve the effectiveness of companies and our ways of living.