.. _batch:

Batch Processing - (IFREMER Datarmor HPC)
=========================================

The `ResourceCode`_ hindcast archive has been developed using the `IFREMER`_ `Datarmor`_ HPC services. 
The hindcast data set was validated using the `WaveVal`_ tools implemented on Datarmor. This page 
provides information on how to set up and run the `WaveVal`_ tools on Datarmor, and how to generate 
and run a set of PBS batch jobs to automate the validation process. To use the Datarmor HPC service 
the user requires an IFREMER intranet login account, and an extranet login to remotely access the 
services.

Useful information on using the Datarmor services can be found at 
`<https://m.davidmkaplan.fr/cluster/cluster-use-instructions.html>`_.

The methods described here should port to any HPC system that supports the Anaconda Python environment and 
runs a PBS job submission service.


.. _ResourceCode: https://resourcecode.ifremer.fr/
.. _WaveVal: https://git.ecdf.ed.ac.uk/uoe-ies-open-tools/waveval
.. _IFREMER: https://wwz-ifremer-fr.translate.goog/L-institut?_x_tr_sl=fr&_x_tr_tl=en&_x_tr_hl=en&_x_tr_pto=sc
.. _Datarmor: https://wwz-ifremer-fr.translate.goog/Recherche/Infrastructures-de-recherche/Infrastructures-numeriques/Pole-de-Calcul-et-de-Donnees-pour-la-Mer?_x_tr_sl=fr&_x_tr_tl=en&_x_tr_hl=en&_x_tr_pto=sc

Setup conda environment
#######################

Access to Datarmor is via a login node. On this node the user enters a bash shell by default. The Anaconda 
Python interface is accessed by changing to a conda shell using:

.. code-block:: bash

  $ source /appli/anaconda/latest/etc/profile.d/conda.csh

Then the local conda runtime configuration file (``~/.condarc``) needs to be created, containing the following lines:

.. code-block:: yaml
  :caption: .condarc

  envs_dirs:
  - $DATAWORK/conda-env
  - /appli/conda-env
  - /appli/conda-env/2.7
  - /appli/conda-env/3.6
  pkgs_dirs:
  - $DATAWORK/conda/pkgs

This file needs to be in the root of the user's home directory. It can be created using any suitable text 
editor (*e.g.* vi, nano, *etc.*).

Once the ``~/.condarc`` file exists, a conda environment can be created that includes the Python packages required 
to run the validation tools. For the ResourceCode hindcast validation the conda environment ``buoyvalid`` was created 
using the command:

.. code-block:: bash

  $ conda create --name buoyvalid

where the option ``--name`` sets the name of the conda environment created.

To check that the conda environment was generated use the command:

.. code-block:: bash

  $ conda info --envs

The required python packages can be added to the ``buoyvalid`` conda environment as follows:

.. code-block:: bash

  $ conda install --name buoyvalid numpy
  $ conda install --name buoyvalid netCDF4
  $ conda install --name buoyvalid astropy
  $ conda install --name buoyvalid matplotlib
  $ conda install --name buoyvalid cartopy
  $ conda install --name buoyvalid spyder
  $ conda config  --add  channels  conda-forge
  $ conda install --name buoyvalid cartopy_offlinedata

To use the conda environment it needs to be activated using the command:

.. code-block:: bash

  $ conda activate buoyvalid

Similarly, to close the conda environment use the command:

.. code-block:: bash

  $ conda deactivate

Construct validation process scripts
####################################

.. _CMEMS: https://marine.copernicus.eu/
.. _InSituTAC: http://www.marineinsitu.eu/dashboard/

The processing script needs to be constructed so that it can be called from a PBS batch 
job script, with option parameters passed in to control what the process does. The script 
``datarmor_insitutac_validate.py`` was used to process the ResourceCode hindcast data set against the 
`CMEMS`_ `InSituTAC`_ wavebuoy data archive. The main function is called when this Python script is run, 
and the optional parameters are parsed prior to calling the validation process. The script is designed 
to process a single wavebuoy location. The information about the buoy is taken from a record in a CSV 
file that provides a full set of unique model/wavebuoy data matches for a specific year and month, 
*i.e.* each record in the CSV file corresponds to a unique wavebuoy.

.. literalinclude:: ../examples/datarmor_insitutac_validate.py
  :lines: 1-12
  :caption: datarmor_insitutac_validate.py

The header of the script sets the shebang line to identify the program required to run the script, and 
imports the required modules and module functions.

The first part of the main function parses the user input options used to control the process.
There are three required parameters:

    +---------------+-------------------------------------------------------------------------------------+
    | Option        | Description                                                                         |
    +===============+=====================================================================================+
    | -p, -\-path   | Path for validation results output                                                  |
    +---------------+-------------------------------------------------------------------------------------+
    | -f, -\-file   | CSV file of data match records; **must be in the directory defined by -\-path**     |
    +---------------+-------------------------------------------------------------------------------------+
    | -n, -\-recnum | Record number in the CSV file to process                                            |
    +---------------+-------------------------------------------------------------------------------------+


and three optional parameters:

    +--------------+----------------------------+---------+
    | Option       | Description                | Default |
    +==============+============================+=========+
    | -\-minYear   | Minimum year to process    | 1994    |
    +--------------+----------------------------+---------+
    | -\-maxYear   | Maximum year to process    | 2019    |
    +--------------+----------------------------+---------+
    | -\-genPlots  | Generate plots flag        | False   |
    +--------------+----------------------------+---------+

.. note::
  The minimum year **must not be less than** the first year in the model hindcast archive, and the 
  maximum year **must not be greater than** the last year in the model hindcast archive.

.. literalinclude:: ../examples/datarmor_insitutac_validate.py
  :lines: 13-51
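
An equivalent option interface could be written with the standard ``argparse`` module. The following is a 
minimal stand-alone sketch, not the code of ``datarmor_insitutac_validate.py``: the function name ``parse_args``, 
the treatment of ``--genPlots`` as a simple on/off switch, and the two archive-limit constants (taken here from the 
default values in the table above) are assumptions made for illustration.

.. code-block:: python

  import argparse

  # Assumed archive limits, taken from the default values in the table above.
  HINDCAST_FIRST_YEAR = 1994
  HINDCAST_LAST_YEAR = 2019


  def parse_args(argv=None):
      """Parse the three required and three optional parameters."""
      parser = argparse.ArgumentParser(description="Validate one wavebuoy record")
      parser.add_argument("-p", "--path", required=True,
                          help="path for validation results output")
      parser.add_argument("-f", "--file", required=True,
                          help="CSV file of data match records (inside --path)")
      parser.add_argument("-n", "--recnum", required=True, type=int,
                          help="zero-based record number in the CSV file")
      parser.add_argument("--minYear", type=int, default=HINDCAST_FIRST_YEAR)
      parser.add_argument("--maxYear", type=int, default=HINDCAST_LAST_YEAR)
      parser.add_argument("--genPlots", action="store_true",
                          help="generate plots (off by default)")
      args = parser.parse_args(argv)

      # Enforce the year-range constraint stated in the note above.
      if args.minYear < HINDCAST_FIRST_YEAR or args.maxYear > HINDCAST_LAST_YEAR:
          parser.error("year range lies outside the hindcast archive")
      if args.minYear > args.maxYear:
          parser.error("--minYear must not exceed --maxYear")
      return args


  # Example: the options of a command-line call, passed as a list.
  args = parse_args(["-p", "../data/VALIDATION/RSCD_v3",
                     "-f", "validation_sites.csv", "-n", "9"])
  print(args.recnum, args.minYear, args.maxYear, args.genPlots)

Passing the argument list explicitly, rather than reading ``sys.argv``, makes the parser easy to call from another 
script as well as from the command line.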

Once the input parameters are parsed, the input record list CSV file is converted to a list of tuples 
for processing by the :ref:`Validation` module, the requested year range is set, and the wave data 
format defined.

.. literalinclude:: ../examples/datarmor_insitutac_validate.py
  :lines: 52-66
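
The conversion of the records file to a list of tuples, and the zero-based selection of one record, can be 
illustrated with the standard ``csv`` module. This is a hedged sketch only: the function names ``load_records`` and 
``select_record`` are hypothetical, and no assumption is made about the column layout of the real records file.

.. code-block:: python

  import csv


  def load_records(csv_path, skip_header=False):
      """Return the non-empty rows of a CSV file as a list of tuples."""
      with open(csv_path, newline="") as handle:
          rows = [tuple(row) for row in csv.reader(handle) if row]
      return rows[1:] if skip_header else rows


  def select_record(records, recnum):
      """Return record ``recnum`` (zero-based), failing clearly if out of range."""
      if not 0 <= recnum < len(records):
          raise IndexError(f"record {recnum} not in range 0..{len(records) - 1}")
      return records[recnum]

Checking the record number explicitly avoids Python's negative-index behaviour, where ``-1`` would silently select 
the last record.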

Next the wave parameters to be processed are selected using an index list based on the following values:

    +--------+---------------------------------------+
    | Index  | Integrated Wave Parameter             |
    +========+=======================================+
    | 0      | Hm0 - Significant wave height         |
    +--------+---------------------------------------+
    | 1      | Tp - Peak wave period                 |
    +--------+---------------------------------------+
    | 2      | Tm02 - Mean zero-crossing wave period |
    +--------+---------------------------------------+
    | 3      | Dir - Peak wave direction             |
    +--------+---------------------------------------+
    | 4      | Spr - Peak wave directional spreading |
    +--------+---------------------------------------+

Each integrated wave parameter is processed separately, so the selected parameters are processed in a loop.

.. literalinclude:: ../examples/datarmor_insitutac_validate.py
  :lines: 67-83

The selected wave parameters then need to be mapped to the corresponding netCDF variable names within 
the model and wavebuoy data files. The model variable name is set in the variable ``mVarName`` and takes a 
single string value; the wavebuoy variable names are set in the list variable ``varOptions``. The buoy data 
contain a range of similar wave parameters that could be used in place of the preferred variables, and the 
list allows these to be used if the preferred variable is not available.

.. note::
  It is recommended that the user only take the preferred wavebuoy variables for validation purposes:
  ``VHM0``, ``VTPK``, ``VTM02``, ``VPED``, and ``VPSP``.

  The WaveWatch III model outputs peak and zero-crossing frequencies; these are converted to periods 
  during the record processing stage.

.. literalinclude:: ../examples/datarmor_insitutac_validate.py
  :lines: 84-100
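
The two ideas in this step, falling back through a list of candidate buoy variables and converting a model 
frequency to a period, can be sketched as below. The helper names ``pick_variable`` and ``frequency_to_period`` 
are hypothetical, ``ALT_TP`` is a made-up placeholder for an alternative variable name, and frequencies are 
assumed to be in Hz; the real script may handle these cases differently.

.. code-block:: python

  import math


  def pick_variable(available, var_options):
      """Return the first name in ``var_options`` found in ``available``, else None."""
      for name in var_options:
          if name in available:
              return name
      return None


  def frequency_to_period(freq_hz):
      """Convert a frequency in Hz to a period in s; NaN for missing or non-positive input."""
      if freq_hz is None or math.isnan(freq_hz) or freq_hz <= 0.0:
          return math.nan
      return 1.0 / freq_hz


  # ``VTPK`` is the preferred peak-period variable; ``ALT_TP`` is a made-up
  # placeholder standing in for an alternative buoy variable.
  varOptions = ["VTPK", "ALT_TP"]
  print(pick_variable({"TIME", "VHM0", "VTPK"}, varOptions))  # VTPK
  print(frequency_to_period(0.1))  # 10.0

Placing the preferred variable first in ``varOptions`` means an alternative is only used when the preferred name is 
absent from the buoy file.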

The final step is to calculate the validation statistics by calling :py:func:`Validation.validate_records`. This returns 
the number of validations produced (``n_valid``) and a dictionary (``valid_stats``) containing the 
validation results. If there is at least one validation result, the results are saved to both an ASCII and a binary 
file for post-processing. A separate record is produced for each integrated wave parameter for the wavebuoy being 
processed.

.. literalinclude:: ../examples/datarmor_insitutac_validate.py
  :lines: 101-115
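
The pattern of saving results only when at least one validation exists, in both a text and a binary form, might look 
like the sketch below. The choice of JSON for the text file and ``pickle`` for the binary file, the function name 
``save_stats``, and the file naming are all assumptions made for illustration; the actual formats are defined by 
the script shown above.

.. code-block:: python

  import json
  import pickle
  from pathlib import Path


  def save_stats(valid_stats, n_valid, out_dir, stem):
      """Write results as text (JSON) and binary (pickle) if any exist."""
      if n_valid < 1:
          return []  # nothing to save for this wave parameter
      out_dir = Path(out_dir)
      text_path = out_dir / f"{stem}.json"
      binary_path = out_dir / f"{stem}.pkl"
      text_path.write_text(json.dumps(valid_stats, indent=2, sort_keys=True))
      with open(binary_path, "wb") as handle:
          pickle.dump(valid_stats, handle)
      return [text_path, binary_path]

The text file is convenient for inspection, while the binary file can be reloaded directly by a post-processing script.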

The interface section allows the script to be run from the command line or to be called from within another script. This feature is used to automate the processing of a large number of wavebuoy locations using the methods described in the following section.

.. literalinclude:: ../examples/datarmor_insitutac_validate.py
  :lines: 116-120

To run the script from the command line use:

.. code-block:: bash

  $ python3 datarmor_insitutac_validate.py -p path/to/output/directory -f records_file.csv -n record_number

For example:

.. code-block:: bash

  $ python3 datarmor_insitutac_validate.py -p ../data/VALIDATION/RSCD_v3 -f validation_sites.csv -n 9

This writes the output to the relative directory ``../data/VALIDATION/RSCD_v3``, reads the list of 
unique model/observation matches from ``validation_sites.csv``, and processes record number 9. Remember 
that Python counts from 0, not 1, *i.e.* ``-n 0`` processes the first record in the file and ``-n 9`` processes the 
10\ :superscript:`th` record.

.. note::
  For this script to be callable from another function its permissions need to be set to executable,
  *e.g.* apply ``chmod a+x datarmor_insitutac_validate.py`` to the script.

Generate PBS job scripts and submit to queue
############################################

To run the ``datarmor_insitutac_validate.py`` processing script on a Datarmor processing node it needs to be submitted 
to the job queue using a ``qsub`` call. The script needs to run in an Anaconda shell environment, so this must be 
set up in the job script submitted by ``qsub``. The job script must also request the compute node resources 
(*i.e.* the amount of memory and the walltime needed to run the process). If insufficient memory or walltime is 
requested the job will end without completing, so it pays to over-estimate the requirements rather than have to 
rerun a process.

To automate the processing of a large number of wavebuoys, a separate PBS script is required for each job. The 
approach taken to facilitate this is to define a base PBS script that contains the common components and can be 
modified to provide the specific information for each job.

The following PBS job script (``base_insitutac.pbs``) was used as the basis for generating the individual job submissions:

.. literalinclude:: ../examples/base_insitutac.pbs
  :linenos:
  :caption: base_insitutac.pbs

Line 1 sets the shell environment, lines 3 and 4 request the compute node resources, line 6 changes to a conda shell, 
line 7 activates the conda environment ``buoyvalid`` (as described above), line 8 changes to the location of the Python 
scripts in the user's datawork space, and line 9 gives the common components of the call to run the 
``datarmor_insitutac_validate.py`` script. Line 9 is missing the required ``-n`` input option, so the job will return 
an error if submitted as is. The script used to generate the PBS jobs, described below, adds the required record 
number to the call.
 
.. note::

  The conda environment **buoyvalid** needs to be replaced with the environment you have set up.

  You must replace **USER** with your user name, **pycode** with the location of your Python scripts, 
  **VALIDATION/RSCD_v3** with the location you want the results to be output to, and **validation_sites.csv** with the 
  name of the CSV file used for input.
  
  The output directory must exist, and the records CSV file must exist in the output directory.
  
The script ``generate_validation_batch_jobs.py`` is used to generate and submit a PBS job for each record in the CSV 
matched data records file. The default PBS script ``base_insitutac.pbs``, described above, is modified to match the 
record being processed. It is assumed that each record in the CSV file represents a unique wavebuoy; the data 
processing script that is called generates the set of separate year/month records required for processing a given buoy.

The only line in ``base_insitutac.pbs`` that needs to be modified is the last line, which calls the Python processing 
script; the record number to be processed is added as an input option to the call. A new PBS job script is generated 
with the record number appended, then submitted to the job queue using the ``qsub`` command.

A subset of the records can be processed by setting the ``start_rec`` and ``num_recs`` values in this script before 
running it. The value of ``num_recs`` must not be greater than the number of records in the CSV file. This feature 
can be used to re-run processes that failed.

.. literalinclude:: ../examples/generate_validation_batch_jobs.py
  :linenos:
  :caption: generate_validation_batch_jobs.py
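
As a compact illustration of the same idea, the sketch below appends ``-n <record>`` to the last line of a base PBS 
script and writes one job file per record. It is independent of the listing above: the function names 
``build_job_script`` and ``write_jobs``, the job file naming, and the ``submit`` switch are assumptions, and 
submission is left off by default so that the sketch can be tried on a machine without PBS.

.. code-block:: python

  import subprocess
  from pathlib import Path


  def build_job_script(base_text, recnum):
      """Append ``-n recnum`` to the last line of the base script text."""
      lines = base_text.rstrip("\n").split("\n")
      lines[-1] = f"{lines[-1].rstrip()} -n {recnum}"
      return "\n".join(lines) + "\n"


  def write_jobs(base_path, out_dir, recnums, submit=False):
      """Write one PBS script per record; optionally submit each with qsub."""
      base_text = Path(base_path).read_text()
      out_dir = Path(out_dir)
      written = []
      for recnum in recnums:
          job_path = out_dir / f"{Path(base_path).stem}_{recnum}.pbs"
          job_path.write_text(build_job_script(base_text, recnum))
          if submit:
              # Requires a PBS installation that provides the qsub command.
              subprocess.run(["qsub", str(job_path)], check=True)
          written.append(job_path)
      return written

Because ``recnums`` can be any iterable of record numbers, a list of previously failed records can be passed in to 
regenerate only those jobs.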

Post-process validation results
###############################

.. literalinclude:: ../examples/process_insitutac_stats.py
  :linenos:
  :caption: process_insitutac_stats.py