Generates synthetic genealogical populations at (small) country scale.
Running ValiPop with output_tables=true enables the validation phase. This phase analyses the simulated target population to determine how similar it is to the given input distributions. The result of the validation is written to the terminal output as the Validation score. The lower the score, the more similar the population is to the input distributions; 0 is the best achievable score.
The following text shows sample terminal output from running ValiPop with validation enabled:
Running simulation with /app/src/test/resources/valipop/config/config-1.txt
Writing contingency tables
Writing records
2025/03/26 15:03:54.292 :: Generating birth records
Elapsed time: 00:00:00
2025/03/26 15:03:54.332 :: Generating death records
Elapsed time: 00:00:00
2025/03/26 15:03:54.367 :: Generating marriage records
Elapsed time: 00:00:00
Writing graph
Running validation with command: Rscript /app/results/test/2025-03-26T14-26-12-324/analysis.R /app/results/test/2025-03-26T14-26-12-324 50
Warning message:
In value[[3L]](cond) : Population size too small for partnering analysis
Validation score: 0.0 (good)
The output may include warning messages, as in the sample above, saying that the population is too small for some types of analysis. Those analyses may then be left out of the validation score because there is too little data to draw a meaningful conclusion. Generally, population sizes of 10,000 and above are enough for all types of analysis.
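For scripted runs, the score and any warnings can be pulled out of captured terminal output. The following is a minimal sketch, assuming the score line always has the form shown in the sample (Validation score: a number, optionally followed by a label in parentheses) and that each warning text sits on the line after "Warning message:". The helper names parse_validation_score and find_warnings are hypothetical and not part of ValiPop.

```python
import re

# Assumption: the score line looks like "Validation score: 0.0 (good)",
# as in the sample output. The label in parentheses is treated as optional.
SCORE_PATTERN = re.compile(
    r"^Validation score:\s*([0-9]+(?:\.[0-9]+)?(?:[eE][-+]?[0-9]+)?)"
    r"(?:\s*\((.+)\))?\s*$",
    re.MULTILINE,
)


def parse_validation_score(output):
    """Return (score, label) from captured output, or None if no score line."""
    match = SCORE_PATTERN.search(output)
    if match is None:
        return None
    return float(match.group(1)), match.group(2)


def find_warnings(output):
    """Return the line that follows each 'Warning message:' line."""
    lines = output.splitlines()
    return [
        lines[i + 1].strip()
        for i, line in enumerate(lines[:-1])
        if line.strip() == "Warning message:"
    ]
```

A score of 0.0 with a non-empty warning list may mean that some analyses were skipped rather than passed, so it can be worth checking both values together.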
All results of a ValiPop run are written to a single directory, created at the following path:
<results_save_location>/<run_purpose>/<datetime>/
results_save_location and run_purpose can be specified in the config file, and datetime is the date and time at which ValiPop was executed, in the form yyyy-mm-ddThh-mm-ss-sss.
ValiPop will create the directory structure for the results if it does not exist already.
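Because each run gets its own datetime directory, a script often needs to locate the most recent one. The following is a minimal sketch, assuming directory names follow the yyyy-mm-ddThh-mm-ss-sss form seen in the sample output (for example 2025-03-26T14-26-12-324). The helpers parse_run_datetime and latest_run_dir are hypothetical and not part of ValiPop.

```python
from datetime import datetime
from pathlib import Path

# Format of the <datetime> directory name, e.g. "2025-03-26T14-26-12-324".
# %f reads the trailing millisecond digits as a fraction of a second.
RUN_DIR_FORMAT = "%Y-%m-%dT%H-%M-%S-%f"


def parse_run_datetime(name):
    """Parse a run directory name, returning None if it does not match."""
    try:
        return datetime.strptime(name, RUN_DIR_FORMAT)
    except ValueError:
        return None


def latest_run_dir(results_save_location, run_purpose):
    """Return the most recent run directory for a run purpose, or None."""
    purpose_dir = Path(results_save_location) / run_purpose
    if not purpose_dir.is_dir():
        return None
    runs = [(parse_run_datetime(p.name), p) for p in purpose_dir.iterdir() if p.is_dir()]
    # Ignore directories whose names are not run timestamps.
    runs = [(stamp, p) for stamp, p in runs if stamp is not None]
    if not runs:
        return None
    return max(runs, key=lambda item: item[0])[1]
```

Parsing the names, rather than sorting them as strings, also filters out any unrelated directories that happen to sit next to the run directories.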
The directory structure of the results of running ValiPop looks like the following:
<results_save_location>/<run_purpose>/<datetime>/
├───analysis.R
├───detailed-results-<datetime>.txt
│
├───dump/
│   └───order.csv
│
├───graphs/
│   └───graph.png
│
├───log/
│   └───trace.txt
│
├───records/
│   ├───birth_records.csv
│   ├───death_records.csv
│   └───marriage_records.csv
│
└───tables/
    ├───death-CT.csv
    ├───mb-CT.csv
    ├───ob-CT.csv
    ├───part-CT.csv
    └───sep-CT.csv
analysis.R
This file is the analysis script executed to validate the simulated population against the given statistics.
detailed-results-<datetime>.txt
This file is generated once the model and analysis have completed. It provides additional statistics on the simulated model, such as fertility and death rates, number of remarriages, population sizes, and average children per marriage.
dump/
The dump directory contains bulk information used for debugging.
graphs/
The graphs directory contains any graphs generated once the model and analysis have completed. The type of graph generated can be specified in the configuration.
log/
The log directory contains files that give more detail about the model simulation than standard output does.
records/
The records directory contains any records generated once the model and analysis have completed. The record format can be specified in the configuration. Generally, only birth, death, and marriage records are generated for the simulated population.
tables/
The tables directory contains contingency tables on birth, death, partnership, and separation. They are used by the analysis to validate the simulated population against the given statistics.
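After a run, it can be useful to check that the expected files were written before processing them further. The following is a minimal sketch, assuming a run configured so that every entry in the directory tree above is produced. Graph and record files depend on the configuration, so the list may need adjusting. The names EXPECTED_ENTRIES, missing_entries, and find_detailed_results are hypothetical and not part of ValiPop.

```python
from pathlib import Path

# Relative paths taken from the directory tree above. Assumption: the run was
# configured so that all of them are produced; adjust the list otherwise.
EXPECTED_ENTRIES = [
    "analysis.R",
    "dump/order.csv",
    "log/trace.txt",
    "records/birth_records.csv",
    "records/death_records.csv",
    "records/marriage_records.csv",
    "tables/death-CT.csv",
    "tables/mb-CT.csv",
    "tables/ob-CT.csv",
    "tables/part-CT.csv",
    "tables/sep-CT.csv",
]


def missing_entries(run_dir, expected=EXPECTED_ENTRIES):
    """Return the expected relative paths that do not exist under run_dir."""
    root = Path(run_dir)
    return [rel for rel in expected if not (root / rel).exists()]


def find_detailed_results(run_dir):
    """Return the detailed-results-<datetime>.txt files found in run_dir."""
    return sorted(Path(run_dir).glob("detailed-results-*.txt"))
```

An empty list from missing_entries means every listed file is present; it says nothing about whether the contents are valid.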