OpenAI's ChatGPT is a prime example of how research data can yield remarkable outcomes. It builds on knowledge from many openly available machine learning models and was trained on huge text datasets scraped from the internet. However, the recent lawsuits faced by AI companies underscore the need to carefully consider how data is shared and repurposed. A crucial question arises: do you want your research data to be used for purposes unrelated to its original intent?
Enabling Accessibility and Reusability
If your answer to that question is yes, there are several steps you can take to make your research data more accessible and reusable:
1. Publishing Datasets on Dedicated Platforms: Publish your datasets on dedicated platforms such as Zenodo, with clear metadata. Comprehensive information about a dataset makes it easier for the wider research community to discover and access it.
2. Supporting Open Abstracts: Support the Initiative for Open Abstracts through your institution. The initiative promotes freely accessible abstracts and articles beyond the confines of specific publishers' portals.
3. Sharing Research Code: Share your research code on platforms like GitHub to enable transparency, reproducibility, and reusability. Accessible code lets other researchers validate and build upon your work, fostering a culture of collaboration and advancement.
Datasets, abstracts, and code shared in this way are valuable research outputs in their own right, and they can serve purposes beyond the original intentions and scope of the research.
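To make "clear metadata" concrete, here is a minimal Python sketch of a metadata record for a dataset and a check that nothing essential is missing. The field names loosely follow the metadata fields of Zenodo's deposit API; the list of required fields, the helper name `missing_fields`, and all example values are assumptions made for illustration, so check the platform's current documentation before relying on them.

```python
import json

# Fields every record in this sketch must carry. This list is an assumption
# for illustration, loosely based on the metadata fields of Zenodo's deposit API.
REQUIRED_FIELDS = ("title", "upload_type", "description", "creators", "license")


def missing_fields(metadata):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


# Hypothetical example record; every value below is a placeholder.
example_metadata = {
    "title": "Survey responses on data-sharing practices",
    "upload_type": "dataset",
    "description": "Anonymised survey responses, with a codebook describing each column.",
    "creators": [{"name": "Doe, Jane", "affiliation": "Example University"}],
    "license": "cc-by-4.0",  # placeholder licence identifier
    "keywords": ["open data", "research data management"],
}

if __name__ == "__main__":
    problems = missing_fields(example_metadata)
    if problems:
        print("Missing metadata:", ", ".join(problems))
    else:
        # Wrap the record in a "metadata" key before sending it to the platform.
        print(json.dumps({"metadata": example_metadata}, indent=2))
```

Running a check like this before uploading helps catch an empty description or a missing license, two gaps that make a dataset hard to find and risky to reuse.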
Open Data at Impacter
At Impacter, we recognize the value of open datasets: we actively use them and try to contribute back. For example, we use CORDIS to compare research proposals to previous projects. To enhance our search algorithms, we use the Synergy dataset, which comprises fully labeled systematic review datasets. OpenAlex serves as a source of bibliometric information on research output. We try to give back by contributing to the codebase of ASReview, the organisation behind the Synergy dataset, and by reporting bugs and suggesting improvements to OpenAlex.
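For readers who want to try an open bibliometric source themselves, the sketch below shows one way to search OpenAlex's public works endpoint and reduce a response to a few fields. It is an illustration only, not a description of how Impacter uses OpenAlex: the helper names `build_search_url` and `summarise` are hypothetical, and the sample response is hand-written to mimic the shape of a real one.

```python
from urllib.parse import urlencode

# Public OpenAlex endpoint for scholarly works.
OPENALEX_WORKS_URL = "https://api.openalex.org/works"


def build_search_url(query, per_page=5):
    """Build a search URL for the OpenAlex works endpoint."""
    params = {"search": query, "per-page": per_page}
    return f"{OPENALEX_WORKS_URL}?{urlencode(params)}"


def summarise(response):
    """Keep only the title, year, and citation count of each returned work."""
    return [
        {
            "title": work.get("display_name"),
            "year": work.get("publication_year"),
            "citations": work.get("cited_by_count"),
        }
        for work in response.get("results", [])
    ]


# Hand-written sample shaped like an OpenAlex response; the values are placeholders.
sample_response = {
    "results": [
        {
            "display_name": "An example article",
            "publication_year": 2021,
            "cited_by_count": 12,
        }
    ]
}

if __name__ == "__main__":
    print(build_search_url("systematic review"))
    print(summarise(sample_response))
```

To query the live service, fetch the URL returned by `build_search_url` with any HTTP client and pass the decoded JSON to `summarise`.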
The manner in which research data is shared and utilized can have a profound impact on the research community and beyond. It is crucial to carefully consider how data is stored, shared, and licensed. By embracing openness, accessibility, and reproducibility, researchers can amplify the impact of their work.