The first true form of data storage for machine use was the punched card. Introduced in 1725 and in common use for more than two centuries, punched cards encoded instructions for a machine as patterns of holes. Their earliest common applications were textile looms and player pianos. Punched cards were easily understood by both humans and machines: by reading the documentation for a card, a person could tell what the holes were meant to do. Sharing the data was simply a matter of producing more copies of the same cards, and identical copies made different machines carry out identical tasks. No computer science education was needed to understand how the cards worked. Data storage was at a very elementary level, and no special language was required to extract the data and use it somewhere else.
It was not until 1948 that the first practical form of RAM appeared: Freddie Williams and Tom Kilburn stored 1,024 bits of information digitally on a cathode-ray tube. Intel, founded in 1968, released the 1103 in 1970, the first commercially successful DRAM chip, holding 1,024 bits. Magnetic storage grew alongside it: hard disk drives and, later, floppy disks let data travel outside the machine. However, even though more data could be stored than ever before, it still was not easily transferable. To access the data, the physical medium had to be present, and the machine had to have a compatible drive to read it. There was no cloud where all of the information went. Data storage continued to improve: flash drives and solid-state drives became widespread in the 2000s, packing more storage into ever smaller devices.
In 2006 the term “cloud” was finally popularized, and more data was being produced than ever before. One widely cited estimate held that by 2020, 1.7 megabytes of information would be generated every second for every person on Earth. This is the point at which data becomes difficult to access and to transfer: with the large number of programming languages in use, it becomes harder to write universal programs that manipulate the data. This is where XML files become necessary. Using XPath to access data from XML files is crucial to the sharing of this data.
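To make the XPath idea concrete, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`, which supports a subset of XPath. The `<inventory>` document and its element names are hypothetical, purely for illustration:

```python
import xml.etree.ElementTree as ET

# A small hypothetical XML document; the element and attribute
# names are illustrative, not from any real dataset.
doc = """
<inventory>
  <item sku="A100"><name>Widget</name><qty>4</qty></item>
  <item sku="B200"><name>Gadget</name><qty>7</qty></item>
</inventory>
"""

root = ET.fromstring(doc)

# XPath-style query: select every <name> that sits under an <item>.
names = [e.text for e in root.findall("./item/name")]
print(names)  # ['Widget', 'Gadget']

# Predicates work too: find the quantity of the item whose sku is B200.
qty = root.find("./item[@sku='B200']/qty")
print(qty.text)  # 7
```

The same document could be queried with the same XPath expressions from Java, C#, JavaScript, or nearly any other language, which is the point: the query language travels with the data format, not with the program that produced it.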
Before information was accessible to everyone through a cloud, there was not as strong a need to digitally transfer data between databases. Now that multiple people can access the same data, however, the question of how to manage it efficiently becomes increasingly prominent. Data can be lost in a data lake, making it very difficult for anybody to find when it must be used. That is why it is important to have some way of transferring data into an XML document. Once the data is organized in XML format, the standardized XPath language can be used to access specific pieces of the information. Before data is generated, the software that generates it should have a way of storing that data in an organized store, so that the data can be utilized across different languages and machines. More data is about to be introduced into the world than ever before, and we cannot let it become lost.
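The last point above, that software should write its output into an organized store as it runs, can be sketched as follows. This is a minimal example, again in Python's standard library; the `<readings>` structure and the sensor names are invented for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical program output: instead of an ad-hoc log, the data is
# built as an XML tree that any XPath-capable tool can query later.
root = ET.Element("readings")
for sensor, value in [("temp", 21.5), ("humidity", 40.2)]:
    node = ET.SubElement(root, "reading", sensor=sensor)
    node.text = str(value)

# Serialize to a string (in practice this would be written to a file
# or shipped to a shared store).
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)

# Any consumer, in any language, can now pull out one value by path.
parsed = ET.fromstring(xml_text)
print(parsed.find("./reading[@sensor='temp']").text)  # 21.5
```

The design choice here is that the producer commits to a structure up front; once it does, consumers no longer depend on the producer's language or runtime, only on the agreed-upon XML shape.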
-Nick Bagley