
Text and Data Mining

A How Do I Guide that covers resources for Text and Data Mining through the University Library

Ethical and responsible research

Responsible research

Researchers at the University of Adelaide are expected to abide by the principles outlined in the Australian Code for the Responsible Conduct of Research (2018), referred to below as the ACRCR. Two principles in this code are particularly important to consider when performing text and data mining (TDM) research:

  • Transparency in declaring interests and reporting research methodology, data and findings. When it comes to TDM, this means clearly stating the methods and tools used, and correctly acknowledging and citing original sources, software, and data.
  • Accountability for the development, undertaking and reporting of research. It is important to “comply with relevant legislation, policies and guidelines” regarding all aspects of research including TDM. As a student or researcher at a university, this involves university policies and the provisions outlined in publisher licenses.

The ACRCR also outlines the responsibilities of researchers in relation to these principles. The following responsibilities are especially important to consider when performing TDM research:

  • Comply with the relevant laws, regulations, disciplinary standards, ethics guidelines and institutional policies related to responsible research conduct. Ensure that appropriate approvals are obtained prior to the commencement of research, and that conditions of any approvals are adhered to during the course of research.
  • Retain clear, accurate, secure and complete records of all research including research data and primary materials. Where possible and appropriate, allow access and reference to these by interested parties.
  • Cite and acknowledge other relevant work appropriately and accurately.

The University's Responsible Conduct of Research Policy emphasises the responsibility of researchers to familiarise themselves with the ACRCR and to comply with the principles and responsibilities within, along with any relevant University procedures or guidelines. Further information on research integrity at the University of Adelaide is also available.


Ethical research

When collecting or reproducing text and data from any source, it is important to practise ethical behaviour. This includes data from the Internet, such as social media content and other forms of online communication. Even if there are no licensing or copyright restrictions, there are still important ethical considerations involved. Consider issues such as unwanted exposure of sensitive and personal information, protection of anonymity, and privacy and confidentiality.

Using text and data gathered from the Internet is considered a secondary use of data or information and falls under human research. When using these kinds of materials for TDM, researchers must abide by ethical guidelines, particularly those outlined in the National Statement on Ethical Conduct in Human Research 2007 (updated 2018). All students and researchers associated with the University of Adelaide are required to determine the level of ethical review that any human research project needs and, where necessary, to obtain approval from the University's Human Research Ethics Committee before commencing research.


Acknowledging and citing TDM research

All resources used in the process of TDM must be acknowledged and cited correctly. This includes original texts, data sets, and statistics, as well as resources used to clean and prepare text and data, such as stop word lists. Additionally, it is good practice to acknowledge and cite software and any other tools used to perform TDM. This improves the transparency and credibility of your own work, ensures that sufficient credit is given to all authors and sources supporting your research, and helps readers locate the resources you used.
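
One way to keep track of what needs citing is to record each resource as it is added to a project. The following is a minimal Python sketch, not a University requirement or an official template. The `Resource` fields, the `build_manifest` helper, and all of the example entries are hypothetical and chosen only for illustration.

```python
import json
import platform
from dataclasses import dataclass, asdict


@dataclass
class Resource:
    """One item to acknowledge: a text, data set, stop word list, or tool."""
    kind: str            # e.g. "corpus", "stop word list", "software"
    name: str
    source: str          # where it was obtained (publisher, repository, ...)
    version: str = ""    # version or edition, if one exists
    accessed: str = ""   # ISO date on which the resource was retrieved


def build_manifest(resources):
    """Return a JSON-serialisable record of everything used in the project."""
    return {
        "python_version": platform.python_version(),
        "resources": [asdict(r) for r in resources],
    }


# Hypothetical entries for illustration only.
resources = [
    Resource("corpus", "Example newspaper archive", "publisher platform",
             accessed="2024-01-15"),
    Resource("stop word list", "Example English stop words",
             "project repository", version="1.0"),
    Resource("software", "Example text analysis package", "package index",
             version="2.3"),
]

manifest = build_manifest(resources)
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Saving a record like this alongside your research data gives you a single list to work from when writing acknowledgements and reference entries in your chosen citation style.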

Referencing support is available to all University students and researchers, including various style guides. If you are unsure of how to cite a particular element of your TDM research, you can contact the library for further assistance.