Rigel

Server that runs all applications and manages all data. It does not run the router.

regular.py

Python script to coordinate recurring automation tasks. It reads a config file regular.csv and executes commands inside LXC containers at specified times and time intervals.

regular.csv

";"-delimited CSV file, header line is "id;runatstart;interval;start;depends;lxc_name;username;command". Each subsequent line specifies a command and how and when it has to be run. lxc_name specifies the LXC container under which to run, username the user within that container and command the exact command, which cannot have spaces in its arguments (true '1 2' 3 will evaluate to "true" "'1" "2'" "3"):

$ sudo lxc-attach -n {lxc_name} -- sudo -u {username} -- {command.split(' ')}

start is a 4-digit string in %H%M format giving the time at which the command should be executed, and interval is the number of hours after start at which the command should be run again, e.g. 24, 12 or 8 for one, two or three runs a day respectively. The depends column specifies whether the task is a parent task (empty depends) or a child task. A child task is executed exactly after its parent, identified by depends containing the id of the parent, finishes execution. The parent must be on a line above the child's, and the child's runatstart, interval and start values are ignored and can be left empty. If multiple children depend on the same parent, they are run in the order they are listed in the .csv file, from top to bottom. The runatstart column is a boolean (only checked for ==1): if 1, the parent task is also run at program startup; otherwise the next start+n*interval time is awaited.
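A minimal sketch of how such a dispatcher can be structured, assuming the column semantics above; the function names, sleep granularity and lack of error handling are illustrative, not the actual regular.py:

import csv, subprocess, time
from datetime import datetime, timedelta

def run_task(row):
    # The command is split naively on spaces, which is why arguments
    # in regular.csv cannot contain spaces.
    subprocess.run(['sudo', 'lxc-attach', '-n', row['lxc_name'], '--',
                    'sudo', '-u', row['username'], '--'] + row['command'].split(' '))

def run_with_children(row, children):
    run_task(row)  # children run exactly after their parent finishes
    for child in children.get(row['id'], []):
        run_with_children(child, children)

def next_run(row, now):
    # First start + n*interval occurrence that lies in the future.
    t = now.replace(hour=int(row['start'][:2]), minute=int(row['start'][2:]),
                    second=0, microsecond=0)
    while t <= now:
        t += timedelta(hours=int(row['interval']))
    return t

with open('regular.csv', newline='') as f:
    rows = list(csv.DictReader(f, delimiter=';'))
children = {}
for row in rows:
    if row['depends']:
        children.setdefault(row['depends'], []).append(row)  # keeps file order
parents = [r for r in rows if not r['depends']]

for r in parents:
    if r['runatstart'] == '1':  # also run once at program startup
        run_with_children(r, children)
due = {r['id']: next_run(r, datetime.now()) for r in parents}
while True:
    now = datetime.now()
    for r in parents:
        if now >= due[r['id']]:
            run_with_children(r, children)
            due[r['id']] = next_run(r, now)
    time.sleep(30)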

Contents:

sponsorblock

rsyncs the 1GB-sized sponsorTimes.csv through the perun lxc container. Then runs the python script /home/user/addcsv.py to read the .csv file and load it into the RAM-based database. To read sponsorTimes.csv it uses the C program /var/lib/postgresql/mkcsv, source at /var/lib/postgresql/mkcsv.c. mkcsv does most of the preprocessing by outputting the .csv while eliminating unused columns. A benchmark shows it shrinks the 1GB file by about 5x and reaches around 20MB/s:

$ /var/lib/postgresql/mkcsv|dd status=progress of=/dev/null
181457408 bytes (181 MB, 173 MiB) copied, 11 s, 16.5 MB/s
410668+1 records in
410668+1 records out
210262359 bytes (210 MB, 201 MiB) copied, 11.9192 s, 17.6 MB/s
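A sketch of how the loading side can consume that stream directly, assuming psycopg2 and a table named sponsortimes; both are assumptions here rather than details taken from addcsv.py:

import subprocess
import psycopg2

# Stream the preprocessed CSV straight from mkcsv into the database,
# without writing an intermediate file to disk.
mkcsv = subprocess.Popen(['/var/lib/postgresql/mkcsv'], stdout=subprocess.PIPE)
conn = psycopg2.connect(dbname='sponsorblock')  # assumed database name
with conn, conn.cursor() as cur:
    cur.execute('TRUNCATE sponsortimes')  # assumed table name
    # COPY FROM STDIN reads from the pipe until mkcsv closes it.
    cur.copy_expert('COPY sponsortimes FROM STDIN WITH (FORMAT csv)', mkcsv.stdout)
mkcsv.wait()
conn.close()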

repo

rsyncs from archlinux and artix servers every 12 hours. See repository for how to use it.
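The sync itself boils down to one rsync invocation per upstream; a sketch, where the mirror URLs and local paths are placeholders, not the real addresses:

import subprocess

MIRRORS = {
    'archlinux': 'rsync://mirror.example.org/archlinux/',  # placeholder URL
    'artix': 'rsync://mirror.example.org/artix-linux/',    # placeholder URL
}
for name, url in MIRRORS.items():
    # -rtl: recurse, preserve mtimes and symlinks; --delete removes
    # packages that have disappeared upstream.
    subprocess.run(['rsync', '-rtl', '--delete', url, '/mnt/repo/' + name + '/'],
                   check=True)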

lectures

Uses the feed2exec python module to fetch RSS feeds of ETHZ video lectures and extract links to the video files together with their upload dates. The links are then downloaded with wget into files named by date, e.g. "%Y-%m-%d.mp4". All of this is specified in /home/user/updatefeeds.sh and runs in the perun lxc container.
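The same fetch-and-name-by-date idea, sketched with the feedparser module instead of feed2exec; the feed URL is a placeholder and the enclosure handling is an assumption:

import os
import subprocess
import time
import feedparser

FEED_URL = 'https://example.org/lectures.rss'  # placeholder, not the real ETHZ feed

for entry in feedparser.parse(FEED_URL).entries:
    # Name each file after the entry's publication date, e.g. 2022-04-13.mp4.
    fname = time.strftime('%Y-%m-%d', entry.published_parsed) + '.mp4'
    for link in entry.links:
        if link.get('type') == 'video/mp4' and not os.path.exists(fname):
            subprocess.run(['wget', '-O', fname, link.href], check=True)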

git

Daily synchronization of selected source-code repositories. See /mnt/software/git_clone.sh in the perun lxc container.
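In Python terms such a sync loop might look like the following; the repository list and mirror layout are assumptions, and the actual git_clone.sh is a shell script that may work differently:

import os
import subprocess

REPOS = ['https://github.com/example/project.git']  # placeholder list
BASE = '/mnt/software'  # assumed destination

for url in REPOS:
    dest = os.path.join(BASE, os.path.basename(url))
    if os.path.isdir(dest):
        # Already cloned: fetch updates into the existing mirror.
        subprocess.run(['git', '-C', dest, 'remote', 'update'], check=True)
    else:
        # First run: create a bare mirror clone.
        subprocess.run(['git', 'clone', '--mirror', url, dest], check=True)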

stats

Hourly synchronization through ssh from nyx. Extracts data as json with:

$ ssh nyx vnstat -i usb0 --json f

The .json file is then modified to make it Javascript, see prepending.
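A sketch of that step; the output path and the variable name are illustrative:

import subprocess

# Fetch the interface statistics from nyx as JSON.
out = subprocess.run(['ssh', 'nyx', 'vnstat', '-i', 'usb0', '--json', 'f'],
                     capture_output=True, text=True, check=True).stdout

# Prepend an assignment so the file can be loaded with a plain <script> tag.
with open('/srv/www/stats.js', 'w') as f:  # illustrative path
    f.write('var vnstat_data = ' + out + ';')  # illustrative variable name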

osm.py

Python script that manages the rendering daemon, the pre-rendering of cached low-zoom tiles in the osm lxc container, and database data updates. The rendering daemon keeps a lot of data in RAM and manages to crash the server by memory starvation after about 12-24h of rendering, for example of low-zoom tiles for caching. This script monitors available RAM and kills renderd if it gets too low, then resumes pre-rendering tiles for caching. At 00:30 it stops all rendering anyway, runs /home/renderaccount/update.py, see Osm#Data_updates, and resumes rendering when that script exits. This is to allow most resources to be available to osm2pgsql during the import.
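The RAM check itself can be as simple as this sketch; the threshold and the plain pkill (ignoring the lxc-attach indirection into the container) are assumptions:

import subprocess
import time

THRESHOLD_KB = 2 * 1024 * 1024  # illustrative: act below 2 GiB available

def mem_available_kb():
    # MemAvailable is the kernel's estimate of memory usable by new
    # workloads without pushing the system into swap.
    with open('/proc/meminfo') as f:
        for line in f:
            if line.startswith('MemAvailable:'):
                return int(line.split()[1])

while True:
    if mem_available_kb() < THRESHOLD_KB:
        # Kill renderd before memory starvation takes down the server;
        # pre-rendering is resumed afterwards.
        subprocess.run(['pkill', 'renderd'])
    time.sleep(60)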

PLANNED: maybe periodically, once every 2-3 days, also restart the database and take a zfs snapshot of the offline database as a backup to allow restoring. Taking a snapshot is possible, but the database needs to be manually shut down first:

$ sudo /root/createsp.py ps1a dbp

This specific command, which creates the next sp[a-z][a-z] snapshot for exactly the ps1a/dbp dataset, is allowed passwordless in visudo.
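What createsp.py might do can be sketched as follows; the suffix-increment logic is an assumption based on the sp[a-z][a-z] naming, and the manual database shutdown is left out:

import string
import subprocess
import sys

pool, dataset = sys.argv[1], sys.argv[2]  # e.g. ps1a dbp
fs = pool + '/' + dataset

# Find the highest existing sp[a-z][a-z] snapshot of the dataset.
out = subprocess.run(['zfs', 'list', '-H', '-t', 'snapshot', '-o', 'name', fs],
                     capture_output=True, text=True, check=True).stdout
names = [line.split('@')[1] for line in out.splitlines()]
sp = sorted(n for n in names if len(n) == 4 and n.startswith('sp')
            and n[2:].isalpha() and n[2:].islower())

# Increment the two-letter suffix: spaa, spab, ..., spaz, spba, ...
i = 0 if not sp else (string.ascii_lowercase.index(sp[-1][2]) * 26
                      + string.ascii_lowercase.index(sp[-1][3]) + 1)
nxt = 'sp' + string.ascii_lowercase[i // 26] + string.ascii_lowercase[i % 26]

subprocess.run(['zfs', 'snapshot', fs + '@' + nxt], check=True)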