Hello! I am Baharak Bakhtiari. Since May 2018, I have been working on my Master’s thesis at Utrecht University with Tijmen Altena and Marco Spruit (my current supervisor). We agreed that my thesis would focus on data reusability and the Data Management Paragraph. Impacter offers an environment in which I can learn about text mining techniques and apply them to research paragraphs that I will gather myself.
In grant applications, more and more funders include a Data Management Paragraph. In this paragraph, researchers need to explain, before they start their research, how they plan to store their data.
In my thesis, I analyze the text in this paragraph with text mining techniques, drawing on different families of Natural Language Processing (NLP) techniques. The idea is to measure the quality of the text in this paragraph and provide feedback to researchers. Hopefully, this feedback will help researchers generate more reusable data during the research process. Reusable data are data that are collected or generated in one research project and can be reused in another. Data produced in a reusable manner can save researchers significant time and budget.
Not only NWO in the Netherlands, but also other funding agencies around the world are implementing strategies and policies to promote data sharing and the generation of reusable research data. For example, the European Commission requires a data management plan for Horizon 2020. The outcome of my thesis is expected to be a tool that measures data reusability and supports researchers with more accurate feedback on this specific matter.
My task is therefore to tailor a new set of algorithms based on Data Management Paragraph requirements. These requirements are to make data Findable, Accessible, Interoperable and Reusable (the so-called FAIR principles). The FAIR principles are community standards for producing reusable data, but they are not a measurement of data reusability. In this thesis we will therefore attempt to operationalise these requirements in the text data and measure them, in order to provide accurate feedback on the reusability described in a researcher’s data management paragraph. Mining the content of a Data Management Paragraph and expanding that information into reusability metrics is the agenda of my research.
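To give a rough idea of what operationalising the FAIR requirements in text could look like, here is a minimal sketch in Python. It is not the method of the thesis: the indicator terms, the names `FAIR_KEYWORDS`, `tokenize` and `fair_scores`, and the scoring rule (the share of indicator terms found in the paragraph) are all assumptions made up purely for illustration.

```python
import re

# Hypothetical indicator terms per FAIR principle (illustrative assumption,
# not a validated vocabulary and not taken from the thesis).
FAIR_KEYWORDS = {
    "findable": {"doi", "identifier", "metadata", "indexed"},
    "accessible": {"repository", "open", "access", "download"},
    "interoperable": {"format", "standard", "vocabulary", "ontology"},
    "reusable": {"license", "documentation", "provenance", "reuse"},
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fair_scores(paragraph):
    """Return, per principle, the share of its indicator terms found in the text."""
    tokens = set(tokenize(paragraph))
    return {
        principle: len(tokens & keywords) / len(keywords)
        for principle, keywords in FAIR_KEYWORDS.items()
    }


if __name__ == "__main__":
    sample = (
        "The data will be deposited in an open repository with a DOI and "
        "rich metadata, in CSV format, under a CC-BY license with documentation."
    )
    for principle, score in fair_scores(sample).items():
        print(f"{principle}: {score:.2f}")
```

A keyword count like this only shows whether a topic is mentioned, not whether the plan behind it is sound. A real measurement would most likely need richer NLP features than simple term matching.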
Would you like to help me with my thesis? I’m always looking for Data Management Paragraphs to analyze. Please send me an e-mail if you can contribute.