Types of Data

Almost everything we encounter each day is data. Figuring out how it is useful to us, how to capture it, and how to put it to work is crucial. Data is just information about the world that we want to save and use later. More of this information turns out to be useful every year, and the ways we use it keep evolving.

Sources of Data

Data is all around us. Some of it is generated just by us existing and using our senses. Some we collect on purpose by writing down what is around us. And some we produce by interacting with the world and running experiments.

Sensory/Natural Data

The most basic forms of data have been collected for thousands of years. They arose naturally through our actions: vocal recitations, cave paintings, carved hieroglyphic tablets, music boxes, and beyond. People have been recording what's around them to share with others for as long as they have been around.

The most basic forms of data are those that represent what we do. We see beautiful or scary things and want to share them. We tell stories and tales, sing songs, and tell jokes. We have developed ways to record these and play them back, and over time the ways we encode this data have gotten better and smaller.

Just as people see what is around them, photographs and images can preserve that view as data. People often take hundreds of photos to capture what their eyes see: photos of the landscape, of themselves, of their families. All of this is stored visual data that we can look back at later. People also recorded where they went and how they got there. They created maps that encoded safe paths and allowed other people to traverse them as well.

Here are some examples of the earliest types of data and the ways people recorded them:

[Figure: How Sensory Data Was Encoded]

Technology evolved, though, and people got better at storing data. Paintings gave way to photographs, and then to digital cameras with SD cards. Audio was recorded first in mechanical music boxes, then on records, and now as digital recordings. Writing went from stone tablets to ink and paper to automated printing presses, and now to computers and digital documents.

Measurement Data

Data is not just what we do. It is also what is happening around us. People started to measure what was happening around them. They recorded the weather, how hot it felt, how much plants grew, how many animals there were. They measured this data over time and found it was useful to look back and compare to years past.

As people got more value from this, we developed better sensors to collect this data automatically. Earthquakes, for example, were measured by rotating a drum of paper beneath a needle with a writing instrument at its tip. When an earthquake shook the earth, the needle would move and make a mark across the paper. The data was thus recorded on that paper, and we could look back at it later.

Computers improved this again. Instead of manual paper recordings, the data is stored in computers, which can miniaturize it, send it anywhere in the world in seconds, and store millions of readings.

Sensors

Humans created sensors to capture this information automatically. To capture temperature we created thermometers. To capture the amount of humidity we created hygrometers. To capture wind speed we use anemometers. All of these sensors constantly capture information about the world around us.

For every natural cycle or event that happens repeatedly, useful data is given off, and people have learned to keep track of it.

Not Just Weather

Measured data is not just about weather, though. We capture the populations of animals over time, the migration patterns of birds, and the movements of planets and asteroids across the sky. Anything that can be observed, we can capture data on.

[Figure: Measuring the World. Sensors capture atmospheric conditions (temperature, wind, humidity) and store the readings.]

Experimental Data

Data doesn't just come from things that are already in our environment, though. Humans can also create the conditions they want to measure. Instead of observing one apple fall from a tree, a scientist can drop 100 apples from the same height and measure the time each takes to fall. They can purposefully mix chemicals and watch how much they heat up or cool down, how long it takes, and what is produced.

The scientific method gave humans a framework for repeatedly creating conditions that produce useful data. If we have assumptions or hypotheses about what may happen in the world, we can test them and collect that data.

  • Observation: Notice something interesting in the world
  • Question: Ask a question about what you observed
  • Hypothesis: Make an educated guess about the answer
  • Experiment: Design and conduct a test to check your hypothesis
  • Data Collection: Measure and record what happens during the experiment
  • Analysis: Look at the data to see if it supports your hypothesis
  • Conclusion: Decide if your hypothesis was correct or needs revision

We can produce 10 ads of a certain genre and test if those work better or worse. We can make a new product line and release it and compare the sales data to previous products. We can try 5 different fertilizers on plants and see which ones grow the fastest. Experimental data can be created by us to help us understand the world around us.
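As a rough sketch of the analysis step, the fertilizer comparison above could be summarized like this. The fertilizer names and growth numbers are made up for illustration, and comparing plain averages is only one simple way to read the results.

```python
from statistics import mean

# Hypothetical plant growth in centimeters, one list of plants per fertilizer.
growth_cm = {
    "fertilizer_a": [12.1, 11.8, 12.5],
    "fertilizer_b": [14.0, 13.6, 14.3],
    "fertilizer_c": [10.2, 10.9, 10.5],
}


def average_growth(results):
    """Return the mean growth for each fertilizer."""
    return {name: mean(values) for name, values in results.items()}


averages = average_growth(growth_cm)

# The fertilizer with the highest average growth in this made-up data.
best = max(averages, key=averages.get)
print(best, round(averages[best], 2))
```

With only a few plants per fertilizer, a difference in averages may be due to chance, so a real experiment would usually use more plants and a statistical test.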

How Data Is Represented

The actual way that data is represented can differ as well. The same type of data can have very different kinds of datapoints. We could collect information about the weather over time in several different ways: someone could observe the weather outside and write "it was slightly rainy and cool", we could take a picture of how it looked outside, or we could use sensors that measure the temperature and record it as a number. There are many different ways we can record data.

Numerical Data

Storing values as numbers can be very useful when we want to do calculations with them or feed them into computers. We record temperatures, we record dollar amounts, we record a 0 or 1 for whether something happened or not. Many different forms of data can be recorded as numbers, including transactions and dates.
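Here is a small sketch of what these numerical values might look like in a program. All of the values and variable names are invented for illustration.

```python
from datetime import date
from decimal import Decimal

# A temperature reading stored as a number.
temperature_f = 68.0

# Dollar amounts; Decimal avoids rounding surprises with money.
prices = [Decimal("19.99"), Decimal("4.50")]
total = sum(prices, Decimal("0"))

# 0 or 1 flags for whether it rained on each of four days.
rained = [1, 0, 0, 1]
rainy_days = sum(rained)

# Dates can be turned into day numbers, which makes them easy to subtract.
days_between = date(2024, 3, 1).toordinal() - date(2024, 2, 1).toordinal()

print(total, rainy_days, days_between)
```

Once everything is a number, ordinary arithmetic such as sums and differences works directly on the data.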

Textual Data

Data can be stored as text as well. People have been writing text for thousands of years. Text is how we store books, articles, and blog posts. It can also be how we store speeches and meetings, contracts and orders.

Raw Data (Audio, Images, Video, etc.)

Data can be stored as images, audio, and video that can be retrieved and used later. A satellite stores pictures of the Earth's surface. A security camera stores video of its surroundings and of whether people entered or exited a premises. A camera takes images of a chemical reaction over time. Images are taken of breast cancer tissue under a microscope. The sound of a space shuttle taking off is recorded. We use a lidar sensor to take a reading of the shape of a machined part.

This type of data is stored in a raw binary format on modern computers, where we can store and retrieve it later.
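To make "raw binary" a little more concrete, here is a toy sketch of a tiny grayscale image held as bytes. The 2x2 size, the pixel values, and the row-by-row layout are assumptions chosen for illustration; real image formats add headers and compression on top of this idea.

```python
# A hypothetical 2x2 grayscale image: one byte per pixel, stored row by row.
# 0 is black and 255 is white.
WIDTH, HEIGHT = 2, 2
pixels = bytes([0, 255, 128, 64])


def pixel_at(data, width, x, y):
    """Return the brightness of the pixel in column x, row y."""
    return data[y * width + x]


print(len(pixels), pixel_at(pixels, WIDTH, 1, 0), pixel_at(pixels, WIDTH, 0, 1))
```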

[Figure: How Data Is Stored. Numerical data: temperature, measurements, currency, age, quantities.]

How Data Points Relate To One Another

The same type of datapoint can still be collected using different methods, though. We may collect the data one time, we may collect it repeatedly over time, we may collect it for different locations, or we may collect it for different input amounts or for a set number of input values. Each way of collecting data gives us different insights into it and makes it useful in different ways.

Discrete data

Data can be stored as individual elements. Pictures, books, and posts contain information that could be used to compare them to one another, but each one is data that makes sense on its own.

Network graphs

Given certain discrete data, we can compare elements to one another and derive relationships between them. Social networks are a great example of this: each profile on its own is a discrete datapoint, but when we compare profiles we can see how people are related to one another. By looking at people's friends we can see the "distance" from one person to another, since someone may be a friend of a friend of someone else.
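The "distance" between two people can be sketched with a breadth-first search over a friend graph. The names and friendships below are made up, and the dictionary-of-sets layout is just one simple way to hold the graph.

```python
from collections import deque

# Hypothetical friend lists. Each profile is a discrete datapoint on its own.
friends = {
    "ana": {"ben"},
    "ben": {"ana", "cal"},
    "cal": {"ben", "dee"},
    "dee": {"cal"},
    "eli": set(),
}


def friend_distance(graph, start, goal):
    """Return the number of friendship hops from start to goal, or None."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, dist = queue.popleft()
        for friend in graph.get(person, ()):
            if friend == goal:
                return dist + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, dist + 1))
    return None  # no chain of friends connects the two people


print(friend_distance(friends, "ana", "cal"))
```

Here "ana" and "cal" are friends of friends, so the distance is 2, while "eli" has no friends listed and cannot be reached from anyone.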

Time Series Data

We can also collect data over time. Taking a single measurement of the temperature outside wouldn't mean much but if we take the temperature every hour over a full day we get more useful information. By plotting it against the time we get time series data.

If we collect the same measurements at many different times, they form a time series. Each day at 5pm we might record the temperature, the amount of cloud cover, and the wind speed, which gives us a longer series of data over a period of time.
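As a minimal sketch, an hourly temperature series can be stored as pairs of timestamp and value. The start time and the readings are invented for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical hourly temperature readings in Fahrenheit, starting at midnight.
start = datetime(2024, 6, 1, 0, 0)
temps_f = [61, 60, 59, 58, 58, 59, 62, 65, 68, 71, 74, 76]

# Pair every reading with the time it was taken.
series = [(start + timedelta(hours=i), t) for i, t in enumerate(temps_f)]


def warmest(points):
    """Return the (time, temperature) pair with the highest temperature."""
    return max(points, key=lambda point: point[1])


def hourly_changes(points):
    """Return the change in temperature from each reading to the next."""
    return [later[1] - earlier[1] for earlier, later in zip(points, points[1:])]


when, temp = warmest(series)
print(when.hour, temp, hourly_changes(series))
```

Questions like "when was it warmest?" or "how fast did it warm up?" only make sense because each reading is tied to a time.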

Spatial Data

Instead of going across time, we can also collect data across different locations. If, instead of taking the temperature in a single place each day, we take the temperature in many different places at the same time, we get spatial data. This data is spread across space instead of time. We can visualize it on a map and show how many places compare at a single point in time.
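A simple sketch of spatial data attaches a latitude and longitude to each reading. The place names, coordinates, and temperatures below are hypothetical.

```python
# Hypothetical readings taken at the same moment in different places.
readings = [
    {"place": "harbor", "lat": 47.60, "lon": -122.34, "temp_f": 58},
    {"place": "hilltop", "lat": 47.63, "lon": -122.30, "temp_f": 54},
    {"place": "valley", "lat": 47.55, "lon": -122.28, "temp_f": 61},
]


def warmest_place(points):
    """Return the name of the place with the highest temperature."""
    return max(points, key=lambda r: r["temp_f"])["place"]


def within_box(points, south, north, west, east):
    """Return the places whose coordinates fall inside a rectangle on the map."""
    return [
        r["place"]
        for r in points
        if south <= r["lat"] <= north and west <= r["lon"] <= east
    ]


print(warmest_place(readings))
print(within_box(readings, 47.58, 47.65, -122.35, -122.29))
```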

Tabular Data

Instead of going across time or space, we can also move across "amounts". If we are experimenting with two chemicals, we may add more or less of each. Every combination of amounts is a specific "place" in a chart, and we can "travel" through that chart by changing the amounts. We could change the temperature of a fire, the pressure in a boiler, the amount of sugar in a recipe, or the amount of money spent on a contractor. All of these move through a series of values instead of a series of times.
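One way to sketch this is a table with one row per combination of amounts. The chemical amounts, the column names, and the measured value are assumptions for illustration.

```python
from itertools import product

# Hypothetical amounts of two chemicals to try, in milliliters.
amounts_a = [10, 20, 30]
amounts_b = [5, 10]

# One row per combination; the measurement is filled in after each trial.
rows = [
    {"chem_a_ml": a, "chem_b_ml": b, "temp_rise_c": None}
    for a, b in product(amounts_a, amounts_b)
]


def record(table, a, b, temp_rise_c):
    """Store a measured temperature rise in the row for amounts a and b."""
    for row in table:
        if row["chem_a_ml"] == a and row["chem_b_ml"] == b:
            row["temp_rise_c"] = temp_rise_c
            return row
    raise KeyError((a, b))


record(rows, 20, 5, 3.5)  # a made-up measurement
print(len(rows), rows[2])
```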

State Data

Sometimes our data only has a few set values that are neither numeric amounts nor locations. If an item can only be red, green, or blue, then there is a set number of "states". A package can be "ordered", "shipped", "damaged", or "returned". A shirt can be small, medium, or large. A person can be employed full-time, part-time, unemployed, or retired. We can collect data that takes one of a fixed set of states as well.
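In code, a fixed set of states is often modeled with an enumeration. This is a minimal sketch using the package states from the paragraph above; the `parse_state` helper is a hypothetical name introduced here.

```python
from enum import Enum


class PackageState(Enum):
    """The only states a package is allowed to be in."""

    ORDERED = "ordered"
    SHIPPED = "shipped"
    DAMAGED = "damaged"
    RETURNED = "returned"


def parse_state(text):
    """Turn free text into a PackageState; raises ValueError if unknown."""
    return PackageState(text.strip().lower())


print(parse_state(" Shipped "))
```

Anything outside the set, such as "lost", is rejected with a `ValueError` instead of being stored silently.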

[Figure: Data Relationships. Discrete data: individual elements such as photos, books, and posts.]

Who Data Comes From

Individuals are not the only ones that keep track of data, though. Organizations like governments and businesses also keep track of data.

Business Data

Businesses and organizations also produce a lot of data. Record keeping is itself data collection: tax records, employee payments, and KPI goals are all data that businesses collect.

Businesses track how many orders were placed, how much money they collected, how many employees they have, and how much they spend. They also want to keep track of who their customers and suppliers are, so they write down who they bought from and who they sold to. They have key metrics that they try to hit, such as growth in total revenue, average order value, and the number of new clients reached per month. All of this data helps them make decisions, so they record it as well.
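As a small illustration, a few of these metrics can be computed from a list of order records. The customers and totals are made up, and average order value is taken here to mean total revenue divided by the number of orders.

```python
# Hypothetical order records for one month.
orders = [
    {"customer": "acme", "total": 120.0},
    {"customer": "birch", "total": 80.0},
    {"customer": "acme", "total": 100.0},
]

# Total money collected across all orders.
total_revenue = sum(order["total"] for order in orders)

# Average order value: revenue divided by the number of orders.
average_order_value = total_revenue / len(orders)

# Number of distinct customers who placed at least one order.
unique_customers = len({order["customer"] for order in orders})

print(total_revenue, average_order_value, unique_customers)
```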

Government Data

Governments want to keep track of how many people are in their regions, how much tax is collected and from whom, and other information like the ages and types of people.

Surveillance Data

Governments have also started valuing surveillance data. Camera networks have been set up that capture images of cars on the road and of people's faces. This data can be used to create maps of people's locations and whereabouts, to measure traffic flows or traffic jams, and by law enforcement to track criminals or people they dislike.

Satellites also surveil the earth. There are use cases for tracking both individuals and large systems. Satellites can be used to track clouds, the growth or death of plants, the freezing and melting of polar regions, and the cutting of trees in forests. Satellite imagery is now also powerful enough to track individual vehicles' movements, and even people walking.

Visual data is not the only thing that can be surveilled, though. Information traveling in other formats can be tracked as well. Radio signals can be captured using antennas, so we can know what information is being transmitted over radio. Internet records are kept too, and although modern encryption usually hides exactly what is sent, we can still see who is connecting to what address and at what time.

Research Data

Researchers are also a big source of data. As we mentioned before, scientific data is collected very often. Research data is often neither business nor government data exactly, and its funding can come from either or both sources. Because of this, much of the data is publicly available. For nearly every question that humans, and more specifically scientists, have asked, they have run experiments and tried to collect data. They also take other datasets and collate them, rearranging and analyzing them further.

Data on drug effectiveness is collected. Data on learning methods and psychological practices is collected. Data on building methods, structural stability, and efficacy is collected.

Digital Data

Data has also changed with the popularization of the internet and computers. As we learned, data went from being stored on paper to being stored in miniature devices that can hold hundreds of millions of pieces of data. People also began creating systems and applications that run entirely on computers.

User Submitted Data

As more people began using the internet, they started contributing data to it. They write posts on blogs and forums, they upload images to social media, they write tweets and Facebook posts, and they upload images they drew. They ask questions and answer questions and post things that are funny or sad or happy. So much information is submitted to the internet by these users.

Data Obtained From Users

It's not just what people personally submit, though. As they use applications and services they also create data. What they search is recorded, along with how long they use a specific website, how long they look at a video, whether they like something or not, where they click, and how often they click. All of this is data that is created through their use of things.
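Usage data like this is often kept as a log of small event records. The sketch below assumes a made-up event format with `user` and `action` fields and adds up how long each user spent watching videos.

```python
from collections import defaultdict

# Hypothetical usage events recorded while people use an application.
events = [
    {"user": "u1", "action": "search", "value": "hiking boots"},
    {"user": "u1", "action": "watch", "seconds": 42},
    {"user": "u2", "action": "watch", "seconds": 10},
    {"user": "u1", "action": "watch", "seconds": 18},
    {"user": "u2", "action": "click", "target": "buy_button"},
]


def watch_seconds_by_user(log):
    """Add up how many seconds each user spent watching videos."""
    totals = defaultdict(int)
    for event in log:
        if event["action"] == "watch":
            totals[event["user"]] += event["seconds"]
    return dict(totals)


print(watch_seconds_by_user(events))
```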

Combined/Derived Data

This data can be further enhanced using computers, though. We can look at different data that users have created and combine it to see how the pieces compare to one another. From this we can create groupings and networks that, when combined, are more useful than the pieces alone. "Graphs" can be made that interconnect people. For example, we can look at people's friends lists on Facebook, and by comparing their friends to their friends' friends we can see how closely related someone is or isn't. Or we can see what sites someone visits, compare that to what they have bought in the past, and then group people into different types of consumers. Using this data, we can assume that people who fall into the same group may like the same things, so we can suggest that they purchase items that others in the group have bought in the past. Other