The advent of data in the Digital Age has been a mixed blessing for researchers. Data on just about any topic is easy to find, if you know how to look for it! Big and small, immediate and projected, general and specific; it’s all there for the using when you have the right access to the information.
But sometimes too much information can be just as difficult to deal with as not having enough. This is particularly true if the way you organize what you have leaves something to be desired. Organizing your datasets to serve your greater research objectives is important. How you tag your data as you collect it, where you store it, and how you structure it overall all determine whether you can use your information optimally.
Big Versus Small Data
The type of data you need to collect will depend on the project you’re working on. Whether you’re looking for big or small data makes a huge difference in how you search and organize your datasets.
If your research outcomes will drive decisions that need in-the-moment or near-future answers, you will be using small data. Small data is any information that is obtained in real time or is currently in the process of being collected. It is easily stored in a spreadsheet format, and it quickly becomes obsolete and is archived. Small data is found by applying preset attributes relevant to your research, and it’s dug out through the strategic mining of big piles of data. With small data analysis, labeling the attributes you are trying to analyze is crucial to getting the right information to help you make in-the-moment decisions that drive your next step.
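To make the idea of "preset attributes" concrete, here is a minimal Python sketch of pulling a small, current slice out of a larger pile of records. Everything in it is an assumption made for illustration: the `Record` fields, the `select_recent_by_label` helper, and the one-hour freshness window are hypothetical, not part of any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Record:
    """One observation; the field names are hypothetical."""
    label: str
    value: float
    collected_at: datetime


def select_recent_by_label(records, wanted_labels, now, max_age):
    """Keep records that carry a wanted label and are no older than max_age."""
    cutoff = now - max_age
    return [
        r for r in records
        if r.label in wanted_labels and r.collected_at >= cutoff
    ]


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)
    pile = [
        Record("temperature", 21.5, now - timedelta(minutes=5)),
        Record("humidity", 40.0, now - timedelta(minutes=5)),
        Record("temperature", 19.0, now - timedelta(days=2)),
    ]
    # Only the recent temperature reading survives both filters.
    for record in select_recent_by_label(
        pile, {"temperature"}, now, timedelta(hours=1)
    ):
        print(record)
```

The point of the sketch is that the result is only as good as the labels: if records are tagged inconsistently, the filter silently misses them.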
Big data, on the other hand, is the term that describes all of the structured and unstructured knowledge collected in voluminous storage piles. Mining through all of this information requires a lot of sifting through data that might or might not be relevant and meaningful to your work. Big data is frequently used to attempt to predict future trends and make major business decisions.
The Problem With Too Much Data
Big and small data are both important sources of knowledge for researchers, but sometimes too much data is a problem. Having information at your fingertips is worthless to your study if you can’t sort through it with the right tools or don’t know which attributes to sift by.
If you need immediate, in-the-moment information, small data is your answer, provided you know the right labels to search for. This doesn’t require casting a net quite as large as a big data search would, but both types of data compilation rely on you having the right tools at your disposal and using them in an organized way.
Because so much data can be stored, mined, and accessed for further analysis, researchers become heavily dependent on the tools used to submit their requests and on the labels applied in each search. From there, the challenge becomes determining whether any, all, or none of the collected results are relevant to the decisions they need to make, and then organizing those datasets further.
How to Organize Your Datasets
Organizing your files in a consistent way that makes sense to you and to anyone using your format is important. That way you can quickly put your fingers on the information you need. Your organizational system should be decided before the project starts and agreed upon by everyone on your team.
Make sure you have a system that you can access from anywhere you need it but that is also secure enough to keep unauthorized users out. Avoid labels that are too similar to one another, so you don’t accidentally duplicate data. Use these tips to organize your datasets:
● Start with folders, grouping them into logical orders so your information is split up per topic and easily found.
● Make sure you’re following any requirements set by your institution or department.
● Don’t reinvent the wheel. If procedures already exist that work for your research, piggyback on them!
● Use common-sense names for your folders unless you’re trying to store sensitive data. Each name should relate to the information found in that folder.
● Use a consistent naming method for all of your folders and limit how many main folders you have. You can create subfolders inside your main folders to organize the information in each section, but your key folders should be as limited as possible.
● Keep your inbox organized, too, with folders that separate your to-do work from your completed work.
● Purge your active folders when you no longer need files, storing them in archived locations instead. Always back up your files regularly, even if you have them in the cloud or on a network.
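As one way to put the folder tips into practice, the Python sketch below builds a small, consistently named folder skeleton. It is only an illustration under stated assumptions: the `folder_name` and `create_project_layout` helpers, the lowercase-with-underscores naming rule, and the `raw` and `processed` subfolders are hypothetical choices, so swap in whatever your institution or team has agreed on.

```python
import re
import tempfile
from pathlib import Path


def folder_name(topic: str) -> str:
    """Normalize a topic to a lowercase, underscore-separated folder name."""
    cleaned = re.sub(r"[^a-z0-9]+", "_", topic.strip().lower())
    return cleaned.strip("_")


def create_project_layout(root, main_folders, subfolders=("raw", "processed")):
    """Create one main folder per topic, each with the same subfolders.

    Raises ValueError before creating anything if two topics would end up
    with the same folder name, since near-identical labels invite duplicates.
    """
    names = [folder_name(topic) for topic in main_folders]
    if len(set(names)) != len(names):
        raise ValueError(f"Topics collapse to duplicate folder names: {names}")
    created = []
    for name in names:
        for sub in subfolders:
            path = Path(root) / name / sub
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        for path in create_project_layout(tmp, ["Survey Results", "Interviews"]):
            print(path.relative_to(tmp))
```

Running the same function at the start of every project keeps the number of main folders small and the names predictable for everyone on the team.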
After you’ve finished a project, don’t leave your datasets and folders sitting in your active area. Schedule a time to go through everything and archive whatever you won’t need again right away.
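If you would rather not do that cleanup entirely by hand, a short script can do the first pass. The sketch below moves files that have not been modified for a set number of days out of an active folder and into an archive folder. The `archive_stale_files` helper, the folder names, and the 30-day threshold are assumptions for illustration, and last-modified time is only a rough stand-in for "no longer needed", so review what it moves. It is not a substitute for regular backups.

```python
import os
import shutil
import tempfile
import time
from pathlib import Path


def archive_stale_files(active, archive, max_age_days, now=None):
    """Move files in `active` older than max_age_days into `archive`.

    Only top-level files are considered, and a file is skipped if a file
    with the same name already exists in the archive, so nothing is
    overwritten. Returns the new paths of the files that were moved.
    """
    active, archive = Path(active), Path(archive)
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    archive.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(active.iterdir()):
        if not path.is_file() or path.stat().st_mtime >= cutoff:
            continue
        target = archive / path.name
        if target.exists():
            continue  # leave name clashes for a human to resolve
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved


if __name__ == "__main__":
    # Demonstrate in a throwaway directory with one artificially old file.
    with tempfile.TemporaryDirectory() as tmp:
        active = Path(tmp) / "active"
        active.mkdir()
        old_file = active / "old_results.csv"
        old_file.write_text("id,value\n")
        (active / "current_results.csv").write_text("id,value\n")
        long_ago = time.time() - 90 * 86400
        os.utime(old_file, (long_ago, long_ago))
        for path in archive_stale_files(active, Path(tmp) / "archive", 30):
            print("archived:", path.name)
```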