# In-Sylva Information System
## Table of contents
- [In-Sylva Information System](#in-sylva-information-system)
  * [Table of contents](#table-of-contents)
  * [Short architecture description](#short-architecture-description)
  * [Requirements](#requirements)
  * [Getting source code](#getting-source-code)
  * [Build project](#build-project)
    + [For development](#for-development)
    + [For production](#for-production)
  * [SSL Certificates](#ssl-certificates)
    + [Production and pre-production](#production-and-pre-production)
    + [Local development](#local-development)
  * [Run project](#run-project)
  * [Keycloak configuration](#keycloak-configuration)
  * [Admin user creation](#admin-user-creation)
  * [Upload in-sylva standard](#upload-in-sylva-standard)
  * [Application access](#application-access)
    + [Portal tool](#portal-tool)
    + [Search tool](#search-tool)
  * [Data dump and restore](#data-dump-and-restore)
  * [Health check for elasticsearch](#health-check-for-elasticsearch)
  * [Attention](#attention)
## Short architecture description
* In-Sylva Information System is built on a microservices scheme.
* Each microservice runs independently in its own docker container.
* You will find information about each microservice in its repository's `README.md`.
* You will find schematics explaining the project's architecture and the databases in the `./documentation` folder.
* In-Sylva Information System access is managed with keycloak authentication.

## Requirements
* In-Sylva Information System relies on docker.
* It has been successfully tested on debian (9 and 10) hosts.
## Getting source code
To download this repository, you can use the following command:
```sh
git clone https://forgemia.inra.fr/in-sylva-development/in-sylva.information-system.git
```
You will find the other microservices' repositories in the [in-sylva development GitLab group](https://forgemia.inra.fr/in-sylva-development).
Use `git clone` to download each project's source code if you want to read or modify it.
## Build project
### For development
Execute this command to build docker images for development:
```sh
./build.sh -k id_rsa -e dev
```
### For production
Execute this command to build docker images for production:
```sh
./build.sh -k id_rsa -e prod -d <url> -ip <IP_address> -p <port>
```
where you set:
* `<url>` as the access URL (e.g., http://www.mydomain.world/insylva/)
* `<IP_address>` as the IP address of the server on which the in-sylva applications are running
* `<port>` as the port number of the server on which the in-sylva applications are running
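For example, a production build might look like this (the domain, IP address, and port below are purely illustrative):
```sh
# Illustrative values only; substitute your own domain, IP address, and port
./build.sh -k id_rsa -e prod -d http://www.mydomain.world/insylva/ -ip 192.0.2.10 -p 80
```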
## SSL Certificates
### Production and pre-production
To handle SSL certificates, the reverse-proxy (nginx) service uses the certbot tool to generate and renew the certificates on production servers.
Certificates are stored in the `ssl_certificates/pem` directory.
The certificates are renewed on production and pre-production every day at 2 A.M. using a cron job. The cron job is defined in the `crons/certificate_auto_renewer_installer` file.
On deployment, the `.ansible/playbook.yml` playbook executes the `crons/certificate_auto_renewer_installer` script to install the cron job.
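If you ever need to (re)install the renewal cron job without running the full deployment, it can presumably be done by running the installer script directly (a sketch, assuming the script takes no arguments):
```sh
# Install the certificate renewal cron job by hand
# (the same script the ansible playbook runs on deployment)
./crons/certificate_auto_renewer_installer
```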
### Local development
For local development, the reverse-proxy service uses self-signed certificates. These certificates must be stored in the `ssl_certificates/pem` directory.
To generate them, you can use the following command:
```sh
docker compose -f docker-compose.certs.yml up install-certs-dev
```
It will generate the certificates and store them in the `ssl_certificates/pem` directory.
## Run project
The first time `start-in-sylva.sh` is executed, a `.env` file is created.
The script will then exit, inviting you to edit this file with your own values.
This step is mandatory, as the file contains the necessary configuration for each microservice.
The `.env` file contains an explanation for each value, so take the time
to understand them; otherwise the project will not work properly.
> ⚠️ The project will need to be rebuilt after editing environment variables.
So the first time you want to run this project, you should:
1) Execute `./start-in-sylva.sh` (if it is not executable, run `chmod +x start-in-sylva.sh`)
2) Edit the `.env` configuration file
3) [Build](#build-project) the project
4) Follow the [instructions below](#keycloak-configuration)
After that, you will need to run `./start-in-sylva.sh` to start the project.
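Put together, a typical first run might look like this (a sketch; the development build is used as an example):
```sh
chmod +x start-in-sylva.sh      # only needed if the script is not executable
./start-in-sylva.sh             # first run: creates .env, then exits
nano .env                       # fill in your own values (any editor works)
./build.sh -k id_rsa -e dev     # rebuild after editing environment variables
./start-in-sylva.sh             # start all the microservices' containers
```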
At this point, all microservices' containers should be running, but not fully functional yet.
## Keycloak configuration
* Go to [pgAdmin](http://localhost:5050/) and log in using the credentials from the `.env` file (`PGADMIN_DEFAULT_EMAIL` and `PGADMIN_DEFAULT_PASSWORD`)
* Create access to postgres server:
* Click on `Add New Server`
* Add a name in the `Name` field (e.g., `insylva`)
* In `Connection` tab, add the postgres container's IP address in `Host name/address`. Two ways to find it:
* Go to [portainer containers' list](http://localhost:9000/#!/1/docker/containers), then find `in-sylva.postgres` row and `IP Address` column
* Or by running `ip address` as root inside the postgres container
* In `Username` and `Password` fields, add the corresponding credentials from `.env` file (`POSTGRES_USER` and `POSTGRES_PASSWORD`)
* Click `Save`
* Then open a query tab on the keycloak database (public schema) and execute this SQL query (a command-line alternative is shown after this list):
```sql
update REALM set ssl_required = 'NONE' where id = 'master';
```
* Restart the keycloak container using [portainer](http://localhost:9000)
* Connect to [keycloak](http://localhost:7000/keycloak/auth/) using credentials from `.env` file (`KEYCLOAK_USER` and `KEYCLOAK_PASSWORD`)
* On the page's top-left corner, click on `Master`, select `Add Realm`, and import the `realm-export.json` file located in the `./keycloak/` subfolder.
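As an alternative to the pgAdmin steps above, the container IP lookup and the SQL update can be done from the command line (a sketch; the container name `in-sylva.postgres` is the one shown in portainer, the `keycloak` database name comes from the step above, and `<POSTGRES_USER>` must be replaced with the value from your `.env` file):
```sh
# Find the postgres container's IP address (alternative to the portainer lookup)
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' in-sylva.postgres

# Apply the ssl_required update directly with psql
docker exec -it in-sylva.postgres psql -U <POSTGRES_USER> -d keycloak \
  -c "update realm set ssl_required = 'NONE' where id = 'master';"
```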
## Admin user creation
Create an admin user for the system. This step is mandatory to access the portal.
* In a terminal, execute `curl --location --request POST 'http://localhost:4000/user/create-system-user'`
* Restart the login container using [portainer](http://localhost:9000)
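Both steps can also be done entirely from the command line (a sketch; the login container's name is an assumption, check `docker ps` for the exact name):
```sh
# Create the system admin user, then restart the login container
curl --location --request POST 'http://localhost:4000/user/create-system-user'
docker restart in-sylva.login
```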
## Upload in-sylva standard
* [Connect to the portal](http://localhost:3000/) using credentials given in `.env` (`IN_SYLVA_ADMIN_USERNAME` and `IN_SYLVA_ADMIN_PASSWORD`)
* In the `Fields` tab you can upload a standard in CSV format. Note: a version of this file can be found [here](https://data.inrae.fr/dataset.xhtml?persistentId=doi:10.15454/ELXRGY).
## Application access
### Portal tool
The portal is accessible:
* at `http://localhost:3000/portal` for the development environment
* at the URL set as a build parameter for production (e.g. `http://www.mydomain.world/si/portal`)
This application allows you to:
* Access the in-sylva microservices tools: Portainer, PgAdmin, Kibana, mongo-express, Elasticsearch, Keycloak
* Manage in-sylva administration (users' accounts, roles and groups, sources, policies)
* Upload metadata records to the system
### Search tool
The search tool is accessible:
* at `http://localhost:3001/search` for the development environment
* at the URL set as a build parameter for production (e.g. `http://www.mydomain.world/si/search`)
This application allows you to:
* Search for metadata records in the catalog (basic and advanced search)
* Export metadata records after a specific search
## Data dump and restore
Scripts used to dump and restore data are provided in the `dump_restore_tools` directory.
According to your own backup policy,
you can use `insylva_bdds_dump_all.sh` to dump all data from the SI's microservices
(postgres, mongodb, and elasticsearch).
The result of the dump procedure is an `archive.tar` file stored in the `dump_restore_tools` directory.
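A dump can also be run manually at any time:
```sh
cd dump_restore_tools
./insylva_bdds_dump_all.sh   # dumps postgres, mongodb, and elasticsearch into archive.tar
```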

On the hosting machine, you can install a cron job that runs the `insylva_bdds_dump_all.sh` script each day.
The crontab may contain several lines, and you MUST adapt these with the full path to your in-sylva SI installation:
```
# monthly dump
0 0 1 * * bash -c 'cd dump_restore_tools && ./insylva_bdds_dump_all.sh'
# each Friday, generate a weekly dump (replaced every week)
10 0 * * 5 bash -c 'cd dump_restore_tools && ./insylva_bdds_dump_all.sh'
# each day from Monday to Thursday, generate a daily dump
30 0 * * 1-4 bash -c 'cd dump_restore_tools && ./insylva_bdds_dump_all.sh'
# (optional) synchronise the dump storage to an S3 resource (see below for configuration)
30 1 * * 1-5 bash -c 'cd dump_restore_tools && ./send_dumps_to_s3.sh'
```
### S3 configuration file
For the last line (synchronising dump archives to an S3 storage), you have to create a file `s3config_file` in the `dump_restore_tools` directory.
This file should be generated with the command:
```sh
s3cmd --configure -c dump_restore_tools/s3config_file
```
If you decide to activate this synchronisation, your S3 resource will contain exactly the same dump files as the `dump_restore_tools/DUMPS` directory.
The script `insylva_bdds_restore_all.sh` can be used to restore an archive.
To properly restore data, you have to start from a fresh installation.
For this, you must redo all the installation and setup procedures above.
Then run the restore script and follow the instructions given at the end to restart the microservices' containers.
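On a fresh installation, a restore run might look like this (a sketch; check the script itself for the exact arguments and prompts it expects):
```sh
cd dump_restore_tools
./insylva_bdds_restore_all.sh   # restores postgres, mongodb, and elasticsearch data,
                                # then prints instructions for restarting the containers
```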
## Health check for elasticsearch
When the host reboots, the search-api container needs to be restarted after the elasticsearch container has fully started.
This is done automatically by a script executed after reboot via crontab.
To set this up on a new host, add the following line to your crontab:
```
@reboot /usr/local/insylva/in-sylva.information-system/tools/restart_search_api.sh
```
If you encounter a problem with the search tool (e.g., results are empty), you can also manually run this script.
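To check elasticsearch's state by hand, you can query the cluster health endpoint (a sketch; the port, the https scheme, and the `admin:admin` credentials are Open Distro defaults and may differ in your setup):
```sh
# A "green" or "yellow" status means the cluster is up
curl -k -u admin:admin "https://localhost:9200/_cluster/health?pretty"
```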
## Attention
Unless you have a specific need, you do not have to generate a Certificate Authority (PEM file with openssl) or edit `docker-compose.yml`.
If you want to change those files and settings, please read the instructions below carefully.
For production workloads, make sure the host setting `vm.max_map_count` is set to at least 262144.
On the Open Distro for Elasticsearch Docker image, this setting is the default.
To check this, start a Bash session in the container and run: `cat /proc/sys/vm/max_map_count`
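For example (the container name is an assumption; check `docker ps` for the exact name):
```sh
# Check the current value from inside the elasticsearch container
docker exec -it in-sylva.elasticsearch cat /proc/sys/vm/max_map_count
```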
To increase this value, you have to modify the host operating system.
On an RPM installation, you can add the following line at the end of the host machine's `/etc/sysctl.conf` file:
```
vm.max_map_count=262144
```
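To apply the new value without rebooting, you can also set it at runtime with sysctl (run as root on the host):
```sh
sysctl -w vm.max_map_count=262144   # apply immediately
sysctl vm.max_map_count             # verify the current value
```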
This value is checked when you run the `build.sh` script.
A warning message will be displayed if `vm.max_map_count` is incompatible with the Open Distro for Elasticsearch Docker image.
The `docker-compose.yml` file also contains several key settings:
`bootstrap.memory_lock=true`, `ES_JAVA_OPTS=-Xms512m -Xmx512m`, `nofile 65536`, and port `9600`.
These settings respectively:
* Disable memory swapping (along with `memlock`)
* Set the size of the Java heap (we recommend half of the system RAM)
* Set a limit of 65536 open files for the Elasticsearch user and allow access to Performance Analyzer on port 9600