MCh NOMAD Oasis
Responsible persons:
- Michal Fečík (fecik@mch.rwth-aachen.de)
- Ondřej Fikar (fikar@mch.rwth-aachen.de)
MCh NOMAD Oasis is a system for archiving and managing calculated data, specifically results produced by DFT codes such as VASP, OpenMX, or Wien2k, as well as by other packages, e.g., LOBSTER or phonopy. The GUI of the MCh NOMAD Oasis is at https://oasis.mch.rwth-aachen.de/nomad-oasis/gui/.
MCh NOMAD Oasis is based on the upstream NOMAD archive project https://nomad-lab.eu/prod/rae/gui/.
Access to the MCh NOMAD Oasis
Access is currently allowed only from the MCh network or through the MCh VPN. Authorization is required as well. Please register a NOMAD account by clicking the LOGIN/REGISTER link in the top-right corner of the GUI. For security reasons, only persons with an MCh email address are allowed to access the MCh NOMAD Oasis, so your registration email must be the MCh one.
Besides being able to upload your own data, you can also search and re-use all the already published data of other users.
Uploading data
To upload to Oasis, please go to https://oasis.mch.rwth-aachen.de/nomad-oasis/gui/uploads and upload an archive of your DFT calculations (in zip or tar format). Besides the GUI, you can use the command line to upload directly from the cluster; see the curl example line on the aforementioned uploads page. For more command line upload examples, click on the “alternative shell commands” button next to the curl line.
Preparing files for the upload
Parsing the upload to extract the metadata depends on a specific mainfile being present. This is usually the main output file of the given software. Your calculations will not be detected properly without it, so it is the minimum that you need to upload. However, it is best to upload all reasonably sized output files as they were produced by the calculation.
The only files which don't need to be uploaded are large binary and temporary files. Such files are usually only useful during the calculations, for restarting or for follow-up calculations (for example, LOBSTER needs VASP's WAVECAR file, but the file is no longer useful after LOBSTER is done). Some examples of large files with limited archiving value for selected codes follow.
If you did any post-processing or analysis based on the data, upload those files as well (Excel sheets, figures based on the data, or presentations) together with the data, preferably in the top-level directory of the archive. This will make life easier for anyone trying to reuse or understand your calculations later.
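As an illustration of the preparation step described above, the following is a minimal sketch of packing a calculation directory into a tar archive while leaving out a large restart file. The helper name pack_upload and the exclude list are assumptions made for this example, not part of Oasis; adapt the patterns to the code you used.

```shell
# Sketch only: pack_upload is a hypothetical helper, not an Oasis tool.
# Packs a directory into a plain tar archive and skips files named WAVECAR.
pack_upload() {
    src_dir=$1    # directory holding the calculations
    archive=$2    # output archive, e.g. upload.tar
    # --exclude drops every member named WAVECAR at any depth (assumed list,
    # extend it with further --exclude options as needed).
    # -C switches to the parent directory so the archive stores relative paths.
    tar -cf "$archive" \
        --exclude='WAVECAR' \
        -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}
```

A call such as pack_upload ./my_calculations upload.tar would then produce an archive that can be uploaded through the GUI; running tar -tf upload.tar afterwards shows what actually went in.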
VASP
Mainfile: vasprun.xml or OUTCAR
WAVECAR and CHGCAR files can be large and are usually not needed after the calculation is finished, so try to delete them before upload. You can, for example, recursively search for and delete all WAVECAR files in a specific directory using: find <directory_name> -name 'WAVECAR' -delete
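Since -delete is irreversible, it can be worth listing the candidates first. The sketch below prints the size and path of every WAVECAR and CHGCAR file under a directory without removing anything. The function name list_large_vasp_files is hypothetical.

```shell
# Sketch only: list_large_vasp_files is a hypothetical helper name.
# Prints size and path of each WAVECAR/CHGCAR file; nothing is deleted.
list_large_vasp_files() {
    # -type f restricts the match to regular files;
    # du -h reports a human-readable size for each match.
    find "$1" -type f \( -name 'WAVECAR' -o -name 'CHGCAR' \) -exec du -h {} +
}
```

After checking the list, the same find expression can be rerun with -delete in place of the -exec part.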
OpenMX
Mainfile: *.out
*.cube files are output files with the grid settings, potential and charge density; they are usually large and not worth archiving. The *_rst/ restart directory is another large cleanup candidate.
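The OpenMX cleanup candidates named above can be located in the same way. This is a minimal sketch that only lists them; the function name list_openmx_cleanup is an assumption for illustration.

```shell
# Sketch only: list_openmx_cleanup is a hypothetical helper name.
# Lists *.cube files and *_rst restart directories; nothing is deleted.
list_openmx_cleanup() {
    # The grouped test matches either a regular *.cube file or a *_rst
    # directory. -prune stops find from descending into a restart
    # directory, so its contents are not listed one by one.
    find "$1" \( -type f -name '*.cube' -o -type d -name '*_rst' -prune \) -print
}
```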
LOBSTER
Mainfile: lobsterout
If you use the saveProjectionToFile keyword, then the resulting projectionData.lobster file can also be quite large and is of little use afterwards.
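Before packing an upload, a quick check that the mainfiles listed for the three codes above are actually present can save a failed parse. The sketch below matches on file names only; it is an assumption that this is a useful proxy, and it does not reproduce how the Oasis parsers themselves recognize files. The function name list_mainfiles is hypothetical.

```shell
# Sketch only: list_mainfiles is a hypothetical helper name.
# Lists files whose names match the mainfiles given for VASP, OpenMX, LOBSTER.
list_mainfiles() {
    # Note: '*.out' also matches unrelated files such as scheduler logs,
    # so the result needs a manual look.
    find "$1" -type f \( -name 'vasprun.xml' -o -name 'OUTCAR' \
        -o -name 'lobsterout' -o -name '*.out' \) -print
}
```

A calculation directory that does not show up in the output is a likely candidate for not being detected after upload.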
Data parsing
After the upload, the whole archive will be scanned for calculations and metadata will be extracted automatically. This includes the system (composition, atomic positions, structure, system type), the method (software used, version, functional, relaxation type), as well as the specific results (calculated energies, forces, stress tensor, DOS, etc.). If some of your calculations are not detected at all, or if an important property that should be parsed is missing, please report this to the responsible persons.
Author (user editable) metadata
While Oasis parsers can extract the metadata of the calculations, it is currently not possible to guess the user's intention. For example, we can detect that you uploaded a few calculations with the same composition and various volumes, and extract the calculation metadata and quantities, but there is no way of knowing that they were used, say, to extract the bulk modulus using the Birch–Murnaghan equation of state. So please try to be verbose in the comment field in this regard. If you are uploading already published data, add a link to the manuscript. If multiple persons contributed to the calculations, mark all of them properly. You can also join multiple uploads together into a dataset.
For examples of what user-edited metadata could look like, see: https://oasis.mch.rwth-aachen.de/nomad-oasis/gui/uploads/entry/id/VFiP3BXJTwyVpAVgpjv31w/kb7mbeDAvAxioVrqvQg4gRKT9hvu or https://oasis.mch.rwth-aachen.de/nomad-oasis/gui/uploads/entry/id/cT4flQfRQtOqZLqks2oIbA/XorY5hGwes7BcVuudPKvC7geBmBY
Publishing
When you have the proper user-edited metadata in place, publishing the upload will make it visible to all Oasis users. Please note that you can still edit the user-edited metadata after publishing.
Uploading to central NOMAD
If you want to share your data with people outside of MCh, our Oasis is not the proper place because of the VPN and account protection. You can, however, publish your uploads to the central NOMAD archive at https://nomad-lab.eu/prod/rae/gui/. This is possible for uploads published in MCh Oasis with one click from the uploads page, and it is the recommended procedure for data used in a publication. In central NOMAD you can generate a DOI for your data and include it in your manuscript (you can also link your manuscript from NOMAD). Please note that when uploading to the central NOMAD archive, your data will be licensed under the Creative Commons Attribution license (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/ (in short, people can share or adapt your data as long as they give you credit).
Getting help and reporting problems
If you encounter any problem at any stage of the process, don't hesitate to contact the responsible persons. There is also extensive documentation of the upstream NOMAD archive project at https://oasis.mch.rwth-aachen.de/nomad-oasis/docs/index.html; however, not everything there is relevant for MCh Oasis.
There are YouTube videos about using NOMAD at https://www.youtube.com/c/TheNOMADLaboratory, for example about searching, which applies to our Oasis as well: https://www.youtube.com/watch?v=nKJ5PrCW61w. Other videos cover NOMAD tools that can be used on top of the MCh NOMAD Oasis, such as the AI-toolkit.
Known bugs
- Downloading the whole upload back is currently not possible. One can download all of the detected calculations, including all other files present in the detected calculation directories, but other files in the upload are accessible only from the administrator interface. This will hopefully be fixed by the upstream project soon: https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/514
- Parsing of the calculations should in general be quite fast, around 1 second per calculation/directory. However, if the specific mainfiles are too big, parsing can need a lot of memory, more than the current server has. The swap file will then be used and everything will be slow. This happens most often with large vasprun.xml files: https://github.com/nomad-coe/nomad-parser-vasp/issues/12.
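To spot uploads that might run into the memory problem described in the last item, one can look for unusually large vasprun.xml files beforehand. The following is a minimal sketch; the function name find_big_vasprun is hypothetical and the default threshold of 500 MiB is an arbitrary assumption, since the actual limit of the server is not stated here.

```shell
# Sketch only: find_big_vasprun is a hypothetical helper name.
# Lists vasprun.xml files larger than a threshold given in MiB.
find_big_vasprun() {
    root=$1
    min_mib=${2:-500}   # assumed default threshold, not a documented limit
    # The 'c' suffix makes find compare the size in bytes.
    find "$root" -type f -name 'vasprun.xml' -size +"$((min_mib * 1048576))"c -print
}
```

For example, find_big_vasprun ./my_calculations 200 would list every vasprun.xml above 200 MiB.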