Rigel

From Personal wiki

Revision as of 20:50, 13 April 2022

Server that runs all applications and manages all data. It does not run the router.

regular.py

Python script to coordinate recurring automation tasks. It reads a config file regular.csv and executes commands inside LXC containers at specified times and time intervals.

regular.csv

";"-delimited CSV file, header line is "id;runatstart;interval;start;depends;lxc_name;username;command". Each subsequent line specifies a command and how and when it has to be run. lxc_name specifies the LXC container under which to run, username the user within that container and command the exact command, which cannot have spaces in its arguments (true '1 2' 3 will evaluate to "true" "'1" "2'" "3"):

$ sudo lxc-attach -n {lxc_name} -- sudo -u {username} -- {command.split(' ')}
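The splitting behaviour described above can be reproduced with a minimal sketch. Only the header line and the invocation template come from this page; the sample row, the `build_argv` helper and the comparison with `shlex.split` are illustrative assumptions, not the contents of regular.py.

```python
import csv
import io
import shlex

# Hypothetical two-line regular.csv; only the header comes from the page.
SAMPLE = """id;runatstart;interval;start;depends;lxc_name;username;command
demo;1;24;0300;;perun;user;true '1 2' 3
"""

def build_argv(row):
    """Build the lxc-attach argument list for one CSV row (assumed helper)."""
    return [
        "sudo", "lxc-attach", "-n", row["lxc_name"], "--",
        "sudo", "-u", row["username"], "--",
        *row["command"].split(" "),  # plain split: quotes are not interpreted
    ]

rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter=";"))
argv = build_argv(rows[0])
print(argv[9:])                          # ['true', "'1", "2'", '3']
print(shlex.split(rows[0]["command"]))   # ['true', '1 2', '3']
```

The second print shows what shell-style quoting would produce, which is why an argument containing a space cannot be expressed in the command column as the script is described.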

start is a 4-digit string in %H%M format giving the time at which the command should be executed, and interval is the number of hours after start at which the command is run again, e.g. 24, 12 or 8 for one, two or three runs a day respectively.

The depends column specifies whether the task is a parent task (empty depends) or a child task (depends contains the id of the parent). Child tasks are executed immediately after their parent finishes execution. The parent must be on a line above the child's, and the child's runatstart, interval and start values are ignored and can be empty. If multiple children depend on the same parent, they are run in the order they are listed in the .csv file, from top to bottom.

The runatstart column is a boolean (only checked for ==1): if it is 1, the parent task is also run at program startup; otherwise the script waits for the next start+n*interval time.
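As a rough illustration of the timing and dependency rules, the sketch below computes the next start+n*interval slot and groups children behind their parent. The function names are hypothetical, and two points are assumptions rather than facts about regular.py: that start is anchored to the current day, and that children depend only on parent tasks (one level deep).

```python
from datetime import datetime, timedelta

def next_run(start, interval_hours, now):
    """Return the first start + n*interval time strictly after `now`."""
    first = datetime.strptime(start, "%H%M")
    anchor = now.replace(hour=first.hour, minute=first.minute,
                         second=0, microsecond=0)
    step = timedelta(hours=interval_hours)
    # Step back until the anchor is at or before `now`, then forward again.
    while anchor > now:
        anchor -= step
    while anchor <= now:
        anchor += step
    return anchor

def execution_order(rows):
    """Map each parent id to [parent, child, child, ...] in file order."""
    groups = {}
    for row in rows:
        if row["depends"] == "":
            groups[row["id"]] = [row["id"]]
        else:
            # The parent must already be listed above, as the page requires.
            groups[row["depends"]].append(row["id"])
    return groups

print(next_run("0300", 8, datetime(2022, 4, 13, 20, 50)))  # 2022-04-14 03:00:00
```

With start 0300 and interval 8 the slots are 03:00, 11:00 and 19:00, so a check at 20:50 lands on 03:00 the next day.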

Contents:

sponsorblock

rsyncs the 1GB-sized spnosorTimes.csv through the perun lxc container. Then runs the Python script /home/user/addcsv.py to read the .csv file and load it into the RAM-based database. To read spnosorTimes.csv, it uses the C program /var/lib/postgresql/mkcsv, source at /var/lib/postgresql/mkcsv.c. mkcsv does most of the preprocessing by outputting the .csv while eliminating unused columns. A benchmark shows that it shrinks the 1GB file by about 5x and reaches close to 20MB/s of output (17.6 MB/s in the run below):

$ /var/lib/postgresql/mkcsv|dd status=progress of=/dev/null
181457408 bytes (181 MB, 173 MiB) copied, 11 s, 16.5 MB/s
410668+1 records in
410668+1 records out
210262359 bytes (210 MB, 201 MiB) copied, 11.9192 s, 17.6 MB/s
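The figures in the dd output can be cross-checked with a few lines of arithmetic. The byte count, duration and record count are copied from the output above; the 512-byte record size is dd's default block size, and the 1 GB input size is the approximate figure quoted in the text, so the ratio is only indicative.

```python
# Figures copied from the final dd summary above.
total_bytes = 210262359
seconds = 11.9192
full_records = 410668
block_size = 512            # dd default when bs= is not given

input_bytes = 1_000_000_000  # assumption: "1GB-sized" taken as 10^9 bytes

rate_mb_s = total_bytes / seconds / 1e6
partial_record = total_bytes - full_records * block_size
ratio = input_bytes / total_bytes

print(f"{rate_mb_s:.1f} MB/s")         # 17.6 MB/s
print(f"{partial_record} bytes")       # 343 bytes in the "+1" partial record
print(f"{ratio:.1f}x smaller")         # 4.8x smaller
```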

repo

rsyncs from Arch Linux and Artix servers every 12 hours. See repository for how to use it.

lectures

Uses the feed2exec Python module to fetch RSS feeds of ETHZ video lectures and extract links to the video files and their upload dates. The links are then downloaded with wget into files named by date, e.g. "%Y-%m-%d.mp4". All of this is specified in /home/user/updatefeeds.sh and runs in the perun lxc container.
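The date-based naming can be sketched with the Python standard library alone. This does not use feed2exec and is not the content of updatefeeds.sh; the sample feed item, the use of an `enclosure` element for the video link and the `wget_commands` helper are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical feed item; the real ETHZ feed layout is not shown on this page.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item>
  <pubDate>Wed, 13 Apr 2022 10:15:00 +0200</pubDate>
  <enclosure url="https://example.org/lecture.mp4" type="video/mp4"/>
</item>
</channel></rss>"""

def wget_commands(rss_text):
    """Return one wget argument list per feed item that has an enclosure."""
    commands = []
    for item in ET.fromstring(rss_text).iter("item"):
        enclosure = item.find("enclosure")
        pub_date = item.findtext("pubDate")
        if enclosure is None or pub_date is None:
            continue  # skip items without a video link or upload date
        name = parsedate_to_datetime(pub_date).strftime("%Y-%m-%d.mp4")
        commands.append(["wget", "-O", name, enclosure.get("url")])
    return commands

print(wget_commands(SAMPLE_RSS))
```

Naming files by upload date means two lectures published on the same day would map to the same file name, which is worth keeping in mind if a feed ever carries more than one video per day.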

git

Daily synchronization of select repositories for source code. See /mnt/software/git_clone.sh in perun lxc container.

stats

Hourly synchronization through ssh from nyx. Extracts data as JSON with:

ssh nyx vnstat -i usb0 --json f

The .json file is then modified before use; see prepending to make it Javascript on the Lada page (Stats section).
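A minimal sketch of that modification, assuming the prefix is `alldata=` as the previous revision of this page described. The `to_javascript` helper and the sample input are hypothetical; the actual script is documented on the Lada page.

```python
import json

def to_javascript(json_text, variable="alldata"):
    """Turn a JSON document into a JavaScript assignment statement."""
    json.loads(json_text)  # raises ValueError if the input is not valid JSON
    return f"{variable}={json_text}"

# Hypothetical stand-in for the vnstat output.
print(to_javascript('{"interfaces": []}'))  # alldata={"interfaces": []}
```

Because JSON object syntax is also valid as a JavaScript expression, the prefixed file can be loaded with a plain script tag and read through the `alldata` variable.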