Big Data: A Classification of Acquisition and Generation Methods

Vijayakumar Nanjappan*, Hai Ning Liang, Wei Wang, Ka L. Man

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding > Chapter > peer-review

6 Citations (Scopus)

Abstract

Traditionally, data have been stored in securely protected databases for special purposes, such as satellite imagery data for earth science research or customer transaction data for business analytics. The usefulness of data lies in the fact that they can be examined and analyzed to unearth correlations among data items and to discover knowledge that reveals deeper trends. Data analytics has been a key research topic in data mining, knowledge discovery, and machine learning for decades. In recent years, the term "data" has experienced a major rejuvenation in many aspects of our lives. The rapid development of the Internet and web technologies allows ordinary users to generate vast amounts of data about their daily lives. On the Internet of Things, the number of connected devices has grown exponentially; each of these devices produces real-time or near real-time streaming data about our physical world. The resulting data, which are extremely difficult, if not impossible, to store, process, and analyze with conventional computing methodologies and resources, are referred to as "big data." In this chapter, we focus on two major subsets of big data: digital data and analog data. Each subset is further divided into environmental and personal sources of data. We also highlight the data types and formats as well as different input mechanisms. These classifications help in understanding the active and passive ways of collecting and producing data, with explicit human involvement or without it (i.e., implicitly). This chapter intends to provide enough information to help the reader understand the role of digital and analog sources, and how data are acquired, transmitted, and preprocessed using today's growing variety of computing devices and sensors.

Original language: English
Title of host publication: Big Data Analytics for Sensor-Network Collected Intelligence
Publisher: Elsevier Inc.
Pages: 3-20
Number of pages: 18
ISBN (Electronic): 9780128096253
ISBN (Print): 9780128093931
Publication status: Published - 8 Feb 2017

Keywords

  • Big data
  • Data acquisition
  • Data generation
  • Data management
  • Data storage
  • Sensing devices
  • User interfaces
