6. Python

6.1. Available Python Version

Python 3.8.X is installed on this system (available as the python3.8 command).
If you want to use a different version of Python, please use pyenv.

6.2. Installing pyenv

The following example installs pyenv and configures it for Bash.

Installing pyenv
git clone https://github.com/pyenv/pyenv.git ~/.pyenv
cd ~/.pyenv && src/configure && make -C src
Setting up the shell environment (if you are using Bash)
echo 'if [ -z "$PYENV_ROOT" ]; then
  export PYENV_ROOT="$HOME/.pyenv"
  export PATH="$PYENV_ROOT/bin:$PATH"
  eval "$(pyenv init --path)"
  eval "$(pyenv init -)"
fi' >> ~/.bashrc

Note

Put the above settings in .bashrc, not .bash_profile. Settings in .bash_profile are not read when programs are launched by mpirun.

After completing the steps above, log in again so that the new shell settings are loaded.
Then install and configure the Python version you want to use.
Installing and configuring Python
# The install takes a few minutes because Python is built from source
pyenv install 3.9.10
# Specifies the Python version to use under the current directory
pyenv local 3.9.10
python -V  # -> Python 3.9.10
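
Because pyenv local applies only under the directory where it was run, it can be worth confirming at the top of a script that the expected interpreter is actually in use. The following is a minimal sketch; the helper name require_python and the 3.9 minimum are illustrative assumptions, not part of pyenv or of this system.

```python
import sys


def require_python(minimum, actual=None):
    """Return True if the interpreter version is at least `minimum`.

    `minimum` is a (major, minor) tuple. `actual` defaults to the running
    interpreter's version; it is a parameter only to make the check testable.
    """
    if actual is None:
        actual = sys.version_info[:2]
    return tuple(actual[:2]) >= tuple(minimum)


# Hypothetical requirement matching the pyenv example above (3.9.x).
if not require_python((3, 9)):
    print("Unexpected Python version:", sys.version.split()[0])
```

If the message appears, the script is probably being started outside the directory where pyenv local was configured.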

6.3. Overview of MPI Parallel Computing in Python

When you launch a program with mpirun, it runs simultaneously as multiple processes (on multiple compute nodes). Each process is assigned a sequential number, called a rank, starting from 0.
Without MPI-specific control in the program, every rank (every process) runs the same program and behaves the same way. By obtaining the rank number in the program and branching on it, you can make each rank behave differently.
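
As an illustration of branching on the rank, the sketch below obtains the rank and the number of processes through mpi4py (MPI.COMM_WORLD, Get_rank, Get_size). The helper name get_rank_and_size and the fallback to a single process when mpi4py is not installed are assumptions made for this example only.

```python
def get_rank_and_size():
    """Return (rank, size) of this process in MPI.COMM_WORLD.

    Falls back to (0, 1) when mpi4py is not installed, so the script can
    also be tried outside the MPI environment (an assumption of this sketch).
    """
    try:
        from mpi4py import MPI
    except ImportError:
        return 0, 1
    comm = MPI.COMM_WORLD
    return comm.Get_rank(), comm.Get_size()


rank, size = get_rank_and_size()

if rank == 0:
    # Only the first rank takes this branch.
    print(f"rank 0 of {size}: running the rank-0 branch")
else:
    # Every other rank takes this branch.
    print(f"rank {rank} of {size}: running the other branch")
```

When launched with mpirun, each process prints one line with its own rank. Started directly with python, it reports a single process.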

mpiQulacs attempts to speed up quantum circuit simulation by distributing the quantum state vector and the gate operation information required for quantum circuit evaluation across the ranks and computing on them in parallel.

6.3.1. Notes on programming

6.3.1.1. Determinism of program behavior

As a rule, the program must behave deterministically unless you intentionally want each rank (each compute node) to behave differently.

Unless you intentionally use MPI for parallel control, the same program code is executed on every rank. If the program performs a non-deterministic operation, each rank may, for example, generate a different quantum circuit and begin computing the quantum state vector from that inconsistent state. When this happens, data cannot be synchronized and exchanged correctly between ranks, leading to program crashes or incorrect results.

For example, if the rotation angle of the RY gate in the following code is chosen non-deterministically, the angle in the quantum circuit held in the memory of rank 0 differs from the angle in the circuit on rank 1. Because each rank assumes a different circuit, the computation of the state vector, which is distributed across ranks, becomes inconsistent, leading to program crashes or incorrect results.

import numpy as np
import random
from qulacs import QuantumCircuit, QuantumState
from mpi4py import MPI  # required for using mpiQulacs

circ = QuantumCircuit(3)

# N.G. (non-deterministic)
# angle = random.random() * np.pi

# O.K. (deterministic)
random.seed(1234)
angle = random.random() * np.pi

circ.add_RY_gate(0, angle)

state = QuantumState(3, use_multi_cpu=True)
circ.update_quantum_state(state)
samples = state.sampling(1024, 2022)  # n_shots=1024 and seed=2022

If you use random numbers, explicitly set the seed so that the program behaves deterministically. Likewise, configure any other Python packages you use so that they behave deterministically.
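
One way to keep seeded values independent of other code that may also draw random numbers is to use a private generator object instead of the module-level functions. The sketch below uses only the standard library (random.Random); the names SEED and make_angles are hypothetical, and the two calls merely stand in for two ranks executing the same script.

```python
import math
import random

SEED = 1234  # must be identical on every rank


def make_angles(n_gates, seed=SEED):
    """Return n_gates rotation angles drawn from a privately seeded generator."""
    rng = random.Random(seed)  # not affected by other users of the random module
    return [rng.random() * math.pi for _ in range(n_gates)]


# Two independent calls stand in for two ranks running the same code.
angles_on_rank0 = make_angles(4)
angles_on_rank1 = make_angles(4)
print(angles_on_rank0 == angles_on_rank1)
```

Because both calls start from the same seed, they produce the same sequence, so every rank would build the same circuit from these angles.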

6.3.1.2. Data output

When a program writes a string to standard output, for example with print, the output of all ranks is displayed on the compute node where mpirun was executed (rank 0). For example, if you run mpirun on 16 compute nodes, you will see the same message 16 times. It is therefore advisable to restrict which rank writes messages to standard output. (Please refer to the program example above (~/example/example.py).)
Alternatively, you can use the --output-filename <directory name> option of the mpirun command to save the output of each rank to a separate file (see man mpirun for details).

In the quantum simulator system, the home directory is shared between compute nodes. If you write to a file without restricting the rank, multiple processes will write to the same path at the same time. When writing files to the home directory, be sure to restrict the write to a single rank.
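
A minimal sketch of restricting both console output and file writes to rank 0 is shown below. The function name report and the output path are hypothetical. The rank is passed in as a plain integer so that the example runs without MPI; in a real program it would come from MPI.COMM_WORLD.Get_rank() in mpi4py.

```python
import os
import tempfile


def report(rank, message, path):
    """Print and save `message` on rank 0 only; other ranks do nothing.

    Returns True if this rank performed the output.
    """
    if rank != 0:
        return False
    print(message)
    with open(path, "w", encoding="utf-8") as f:
        f.write(message + "\n")
    return True


# Stand-in for: rank = MPI.COMM_WORLD.Get_rank()
rank = 0
# Hypothetical output location, used here only for illustration.
out_path = os.path.join(tempfile.gettempdir(), "result_example.txt")
report(rank, "simulation finished", out_path)
```

With this guard, a run on 16 compute nodes prints the message once and only one process writes to the path.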

Warning

Pay special attention to the determinism of the program and to file writes to the home directory, as described above.

6.3.1.3. Speed up by parallel processing

Each node provides 48 CPU cores and 32 GB of RAM. mpiQulacs performs quantum circuit simulation in parallel using multiple nodes and multiple CPUs.

To speed up the quantum application as a whole, it is important to parallelize, as far as possible, the parts outside the quantum circuit simulation (the user-written code that calls the mpiQulacs API) so that the computational resources (CPU cores) are used as fully as possible. Keep in mind that a large amount of single-threaded or single-process work outside the quantum circuit simulation can slow down the overall application.
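
One possible way to spread classical work over the ranks is a round-robin split of independent tasks, sketched below. The names my_share and expensive_classical_step are hypothetical, and the ranks are emulated in a local loop so that the example runs without MPI. In a real program each rank would process only its own share, and the partial results would then need to be combined (for example with comm.allgather from mpi4py) so that every rank holds identical data before building a quantum circuit from it.

```python
def my_share(items, rank, size):
    """Return the items this rank should process (round-robin split)."""
    return items[rank::size]


def expensive_classical_step(x):
    # Placeholder for user code that does not call mpiQulacs.
    return x * x


items = list(range(10))
size = 4  # stand-in for MPI.COMM_WORLD.Get_size()

# Emulate each rank locally to show that every item is handled exactly once.
partial_results = [
    [expensive_classical_step(x) for x in my_share(items, rank, size)]
    for rank in range(size)
]
print(partial_results)
```

The split assumes the tasks are independent of each other. Work that depends on earlier results would need a different decomposition.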