This guide provides information related to Stellenbosch University's institutional research data repository.

Preparing your data for publication

  • The quality of processed data can be assured through the use of appropriate data entry/capture and data checking techniques.
  • Data checking techniques encompass the editing, cleaning, verification, cross-checking and validation of data.
  • Ideally, file formats should be open, lossless and standard in nature. 
  • Where this is not possible, widely used file formats should be relied upon.
  • Detailed guidance on recommended file formats can be found in the Library of Congress Recommended Formats Statement.

What is data documentation?

Data documentation is any digital supplementary context file that explains the production, provenance, processing or interpretation of research data. Examples include README.txt files, laboratory notebooks, data dictionaries, metadata schemas, standards, administrative materials, codebooks, user manuals, workflows, protocols, technical specifications and methodologies.

Minimum documentation required

Research data must include at least one documentation file which provides a brief description of the data. Although any of the different types of documentation files can be uploaded onto SUNScholarData, it is recommended that a README.txt file accompany the research data. A README file template is provided for use. For more information on README files, please view the README file guideline.

Data can be organised in accordance with specific folder organisation techniques as well as file naming conventions. These techniques and conventions should be outlined in README.txt files.

Folder organisation techniques

Files are organised through the use of folders. These folders can be organised by any of the following:

  • project
  • researcher
  • date 
  • research notebook number
  • sample number 
  • experiment type
  • instrument
  • data type
  • any combination of these aforementioned methods

File naming conventions

Indicate which information you will include in your file names. Good file names represent a combination of the following information:

  • experiment type 
  • experiment number
  • researcher name or initials 
  • sample type
  • sample number
  • date
  • site name
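
The combination of components listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_file_name`, the choice and order of components, the underscore separator and the zero-padding widths are all hypothetical and are not prescribed by SUNScholarData. Whatever convention is chosen should be described in the README.txt file.

```python
from datetime import date


def build_file_name(experiment_type, experiment_number, initials,
                    sample_number, recorded_on, extension):
    """Join the chosen components into one file name.

    The layout used here (date first, underscores between parts) is an
    assumed convention for illustration, not a repository requirement.
    """
    parts = [
        recorded_on.strftime("%Y%m%d"),              # YYYYMMDD sorts chronologically
        experiment_type.lower().replace(" ", "-"),   # avoid spaces in file names
        f"exp{experiment_number:02d}",               # zero-padded experiment number
        initials.upper(),                            # researcher initials
        f"s{sample_number:03d}",                     # zero-padded sample number
    ]
    return "_".join(parts) + "." + extension.lstrip(".")


# Illustrative values only.
print(build_file_name("titration", 3, "jd", 12, date(2024, 5, 17), "csv"))
# prints: 20240517_titration_exp03_JD_s012.csv
```

Placing the date first in YYYYMMDD form and zero-padding the numbers keeps files in a sensible order when they are sorted by name.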

What is metadata?

  • Data that plays the role of documentation for data/resource description, contextualisation and discovery.
  • A metadata record must be created for each dataset. 

What is a metadata record?

  • A structured description of a dataset which enables the identification, discovery, interpretation, use and administration of a dataset or collection. Metadata records can refer to datasets held in SUNScholarData or elsewhere.
  • A mandatory minimum amount of metadata must be supplied in order to create a valid metadata record.
  • The mandatory minimum metadata that must be recorded for each dataset or collection metadata record are:
  1. Title
  2. Author(s)
  3. Categories
  4. Keywords
  5. Description (relating specifically to the research data)
  6. License
  7. URL/DOI (for datasets held in external repositories)
  8. Academic group
  • Research data may be considered sensitive on any one of the following grounds:
  1. Disclosure restrictions imposed by research contracts
  2. Prejudicial information that could cause harm if released publicly
  3. Possibility of copyright infringement
  4. Patentability of the research data
  5. Potential for commercialisation
  6. Ethical considerations
  7. Data privacy requirements (in instances where data contains personal identifiers)
  • Generally speaking, sensitive data would not be published on SUNScholarData.
  • Prospective data publishers should have the legal permission to distribute the research data in their personal capacity or on behalf of all of the relevant rights-holders.
  • Where research data is subject to copyright protection, the use of material must fall within the ambit of the fair dealing legal doctrine as stipulated in the Copyright Act 98 of 1978.
  • If research data collected by a researcher from a third party has already been assigned a license, the researcher must adhere to the requirements of this license when publishing the data.
  • Where applicable, intellectual property clearance must be obtained from Stellenbosch University’s Technology Transfer Office - Innovus.
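
Before creating a metadata record, it can help to check that none of the eight mandatory minimum fields listed earlier has been left empty. The sketch below is one possible way to do this. The dictionary keys (`title`, `authors`, `url_or_doi` and so on) and the function name `missing_fields` are hypothetical names chosen for illustration; they are not the field names of the SUNScholarData schema or of any repository API.

```python
# Hypothetical key names standing in for the mandatory minimum metadata.
MANDATORY_FIELDS = [
    "title",
    "authors",
    "categories",
    "keywords",
    "description",
    "license",
    "academic_group",
]


def missing_fields(record, held_externally=False):
    """Return the mandatory fields that are absent or empty in `record`.

    A URL/DOI is only required for datasets held in external repositories,
    so it is checked only when `held_externally` is True.
    """
    required = list(MANDATORY_FIELDS)
    if held_externally:
        required.append("url_or_doi")
    return [name for name in required if not record.get(name)]


# Illustrative record; all values are invented.
record = {
    "title": "Soil moisture readings",
    "authors": ["A. Researcher"],
    "categories": ["Soil science"],
    "keywords": ["soil", "moisture"],
    "description": "Hourly soil moisture readings from three test plots.",
    "license": "CC BY 4.0",
}

print(missing_fields(record))                        # ['academic_group']
print(missing_fields(record, held_externally=True))  # ['academic_group', 'url_or_doi']
```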

Access settings will have to be assigned to research data. The research data published on SUNScholarData may have the following access settings:

  • Open Access Setting - Openly accessible without restriction.
  • Embargo Setting - Research data held under embargo for a fixed period, and made available on an open basis upon the expiry of the embargo period.

A license (associated with the research data) should be selected in order to stipulate the manner in which others may use the data. Data and software files published on SUNScholarData will be made available under the terms of Creative Commons, Open Data Commons or standard Open Source licenses.

SUNScholarData's recommended citation format is based on the DataCite Metadata Schema. The citation format itself is influenced by the research data's item type (dataset, figure, media, software or data management plan). The recommended citation format for each item type is given below:

  1. Dataset

Author (PublicationYear): Title. SUNScholarData. (Dataset). SUNScholarData DOI

  2. Figure/Image

Author (PublicationYear): Title. SUNScholarData. (Figure). SUNScholarData DOI

  3. Media (Audio, visual or audiovisual)

Author (PublicationYear): Title. SUNScholarData. (Media). SUNScholarData DOI

  4. Software

Author (PublicationYear): Title. Version. SUNScholarData. (Software). SUNScholarData DOI

  5. Data management plan

Author (PublicationYear): Title. SUNScholarData. (Data management plan). SUNScholarData DOI

Please note that in the case of software it may be desirable to include information from optional properties such as the Version.
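
The citation patterns above differ only in the item-type label and the optional Version element, so they can be assembled by one small function. The following is a minimal sketch, assuming the patterns exactly as written in this guide. The function name `format_citation` and its parameters are hypothetical, and the DOI in the example is a placeholder rather than a real identifier.

```python
# Item-type labels as they appear in the recommended citation formats.
ITEM_TYPE_LABELS = {
    "dataset": "Dataset",
    "figure": "Figure",
    "media": "Media",
    "software": "Software",
    "data management plan": "Data management plan",
}


def format_citation(author, year, title, item_type, doi, version=None):
    """Build a citation string following the patterns listed in this guide.

    `version` is optional and is placed directly after the title,
    as in the Software pattern.
    """
    label = ITEM_TYPE_LABELS[item_type.lower()]  # KeyError for unknown item types
    version_part = f" {version}." if version else ""
    return f"{author} ({year}): {title}.{version_part} SUNScholarData. ({label}). {doi}"


# Illustrative values; "10.1234/example" is a placeholder DOI.
print(format_citation("Doe, J.", 2024, "Soil moisture readings",
                      "dataset", "10.1234/example"))
# prints: Doe, J. (2024): Soil moisture readings. SUNScholarData. (Dataset). 10.1234/example

print(format_citation("Doe, J.", 2024, "Soil tools",
                      "software", "10.1234/example", version="1.2"))
# prints: Doe, J. (2024): Soil tools. 1.2. SUNScholarData. (Software). 10.1234/example
```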