I have been monitoring local meteorological data at the Nagycenk Observatory since 2011, using a La Crosse WS2350 weather station. Monitoring the weather is not among the Observatory's primary interests, but it has proven useful as supplementary information in many cases. Originally, my WS2350 was connected to an ordinary PC running the open source wview monitoring software. Wview is a nice piece of software which provides a complete solution for communicating with the WS2350, but I wanted something more lightweight and easier to customize, so later on I switched to open2300. A desktop PC is also overkill for this task, so I had been planning to replace it with a Raspberry Pi ever since I purchased my first model B board. Due to lack of time, I only got around to this step in the past few days. Here is a description of how I made it work.
The main idea was to install a minimal, console-only Raspbian OS on the SD card of the Pi and have a cron job based on open2300 read out the weather data every 5 minutes. All instructions below were tested on the "first generation" model B Raspberry Pi referred to above.
Preparing a minimal SD card
This step turned out to be fairly easy, as I came across a nicely prepared installer which fitted my purposes very well. The version of the installer I used was 1.0.7, the most recent at the time of installation. Basically, I did nothing but follow the instructions provided in the README file. It gave me just the tools I needed: NTP, cron and OpenSSH (which I replaced with dropbear). The default timezone setting was UTC, which is what I wanted, so I left it untouched as well (otherwise dpkg-reconfigure tzdata would do the job). I did, however, change the root password and the hostname. The latter can be accomplished by editing /etc/hostname and can of course be skipped, but I wanted something more meaningful than the default name pi.
Then I installed the following packages:
apt-get install subversion make gcc patch rsync
- Subversion is needed because open2300 is stored in an SVN repo. It can be removed after checking out the code.
- Make is needed in order to be able to compile open2300.
- GCC is needed because open2300 is written in C. Both GCC and make can be removed after compiling open2300.
- Patch is needed for convenience: I use a slightly modified version of the log2300 tool and it's easier to reproduce the modification via providing a patch. It can be removed after patching the log2300 file.
- rsync is needed in order to provide automated remote access to the data we collect with the Pi. For this reason it should stay on the system permanently. It could be replaced with anything else providing similar functionality, though.
The only remaining step on the system administration side was to create a separate, unprivileged user -- let's name it logger. The data logging software and the scheduled task run as this user. In order to make communication with the WS2350 possible, it is essential to add the newly created user to the dialout group:
usermod -a -G dialout logger
After performing all these steps, the resulting system is still under 500MB.
Reading data from the WS2350 with the Pi
From now on -- if not otherwise stated -- I assume that we are logged in as user logger. The first step is to patch, compile and "install" open2300.
svn co http://www.lavrsen.dk/svn/open2300/trunk open2300
cd open2300
patch log2300.c < /tmp/log2300.patch
make log2300
mv log2300 ~
The Subversion repository used above was at revision 13 at the time of posting. We have to create a configuration file before we can make a test run. Issue
cp open2300-dist.conf /tmp/open2300.conf
then open /tmp/open2300.conf for editing (by default, only vi is available). The only things we need to modify are the SERIAL_DEVICE configuration option, which should be set to /dev/ttyUSB0, and the TIMEZONE configuration option, which should be set to 0 (if I'm not mistaken, log2300 ignores the latter, but let's stay on the safe side). The other configuration options are either ignored by log2300 or are related to measurement units -- in the latter case I'm fine with the defaults. To sum it up, here is the list of configuration options that are relevant from our point of view:

SERIAL_DEVICE /dev/ttyUSB0
TIMEZONE 0
Provided that the WS2350 is turned on and connected to one of the USB ports of the Pi via the RS232 to USB converter that was shipped with it (see picture above), we can now make a test run by issuing
/home/logger/log2300 /tmp /tmp/open2300.conf
If everything goes well, a new file with the extension .ws2350 is created in /tmp with content similar to
20150824175520 27.7 24.6 14.7 48 54 0.0 67.5 ENE 24.6 0.00 0.00 1634.77 1002.500 Steady Cloudy
The columns are timestamp, temperature (in), temperature (out), dewpoint, relative humidity (in), relative humidity (out), wind speed, wind direction (degrees), wind direction (textual representation), windchill, rain (1h), rain (24h), rain (total), relative pressure, tendency and forecast, respectively.
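For quick checks these records are easy to slice with standard tools; for example (assuming the column order above, with the outdoor temperature in field 3):

```shell
# Extract the outdoor temperature (3rd field) from a sample record.
record="20150824175520 27.7 24.6 14.7 48 54 0.0 67.5 ENE 24.6 0.00 0.00 1634.77 1002.500 Steady Cloudy"
echo "$record" | awk '{ print $3 }'   # prints 24.6
```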
It's worth mentioning that the timestamp column comes from the system time, which is synchronized to remote servers by NTP. In theory it is possible that the Pi loses its network connection to the outside world, therefore I would like to install a -- possibly also Raspberry driven -- stratum 1 NTP server on the same local network as the Raspberry. Another advantage of the Raspbian installer I used is that it comes with fake-hwclock installed, which prevents the system time from jumping back to 1970. This is very useful considering that the Raspberry doesn't have a hardware clock. It would also be possible to read a timestamp from the WS2350, but for some reason mine always fails to catch the DCF77 signal, so I rejected this option.
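Since the timestamp comes from the system clock, the format of the first column can be reproduced directly (assuming log2300 writes it as YYYYMMDDhhmmss, as in the sample above):

```shell
# Print the current UTC time in the same YYYYMMDDhhmmss format
# as the timestamp column of the .ws2350 records.
date -u +%Y%m%d%H%M%S
```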
Some people have reported problems getting the discontinued FT232BM serial to USB converter chip to work on the Raspberry Pi, but my Raspbian installation seems to handle it out of the box.
Continuous data logging
I placed the configuration file we've created in the previous section into /usr/local/etc (of course this operation requires root privileges):
cp /tmp/open2300.conf /usr/local/etc
This way log2300 loads it automatically, which eliminates the need to specify it as an argument. This means that the 3 commands below are equivalent to each other.
/home/logger/log2300 /tmp /tmp/open2300.conf
/home/logger/log2300 /tmp /usr/local/etc/open2300.conf
/home/logger/log2300 /tmp
Of course, from now on I will use the shortest, i.e. the last form. Another valid location for the configuration file would be the folder /etc. Considering that only a few configuration options are used -- and even those few are pretty much constant in my use case -- I'd like to note that it would probably be slightly more efficient to compile them directly into the executable.
The final steps are to create a folder where we want to save our data files
mkdir /home/logger/data
and to schedule a data readout every 5 minutes by adding the following line to our logger user's crontab via crontab -e:
*/5 * * * * /home/logger/log2300 /home/logger/data
After this setting takes effect, a new file as described in the previous section is created in /home/logger/data every 5 minutes. Such a file has the extension .ws2350 and is named after the current timestamp, so after a few minutes the data folder will contain a handful of such files.
So the main difference between the original and the patched version of log2300.c is that while the original version appends a new line to the same file for each data readout, the patched version creates a new file for each readout.
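In shell terms, the patched behaviour amounts to something like the following sketch (write_readout and its arguments are illustrative, not part of open2300; the real change is of course made in the C source):

```shell
# One "patched" readout: the record is written to its own
# <timestamp>.ws2350 file instead of being appended to a common log.
write_readout() {
    dir=$1      # destination folder, e.g. /home/logger/data
    record=$2   # one line of measurements read from the station
    ts=$(date -u +%Y%m%d%H%M%S)
    echo "$record" > "$dir/$ts.ws2350"
}
```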
At this point we have basically achieved our goal: we have the desired continuous monitoring. It's not the end of the post yet though, because we haven't answered the question
"What prevents the content of the data folder from growing indefinitely, and how do we access archive data?"
The next section is dedicated to this question.
Sorting the data using a second computer
In the Nagycenk Observatory, the rule of thumb we would like to follow is that PCs carrying out data logging tasks should be kept as simple as possible and should not run more services than strictly necessary. Additionally, they should not be directly visible to the "outside world". Fortunately, we have dedicated data servers which can continuously download the data stored on the data logging PCs and take over any other data processing tasks. These data servers handle multiple data sources and are directly visible to the outside world via some protocols. In the remaining paragraphs I describe the setup required on one of our data servers to further process the data collected by our Raspberry-based data logger.
First we create a new user called ws (as root):
adduser ws
Then we switch to this user and create some folders in its home directory:
mkdir data data/cache data/recent
Then we download the bash script which is responsible for data download:
chmod +x pipe.sh
We create a configuration file for the script; let's call it conf.sh. Its executable bit needs to be set as well, and it needs to contain the following lines
HOST=10.0.0.10 # ip address of the rpi logger
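Besides HOST, the script also refers to the variables $RECENTDIR, $CACHEDIR and $DESTDIR, so conf.sh presumably defines those as well. A sketch matching the folder layout created above (the exact contents of the real file may differ):

```shell
# Hypothetical conf.sh for pipe.sh; the variable names are taken from
# the task description, the values assume the folders created earlier.
HOST=10.0.0.10                   # IP address of the Raspberry Pi logger
RECENTDIR=/home/ws/data/recent   # freshly downloaded .ws2350 files
CACHEDIR=/home/ws/data/cache     # temporary backup of processed files
DESTDIR=/home/ws/data            # final per-year/per-day destination
```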
When this script is invoked it performs the following tasks:
- Downloads all data (if there is any) from the Raspberry logger's data folder into $RECENTDIR and removes it from the logger.
- For each file in $RECENTDIR it performs the following:
  - Based on its timestamp, it appends the content of the file to the appropriate plain text file in $DESTDIR. The script opens a new plain text file for each day and creates a new subdirectory for each year.
  - Moves the file to $CACHEDIR. This serves as a temporary backup.
- Updates the symbolic link pointing to the most recent data file (if needed).
Refer to the content of pipe.sh for further details.
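The per-file sorting step can be sketched in shell as follows (this is an illustration of the logic described above, not the actual contents of pipe.sh; sort_file is a hypothetical helper):

```shell
# Append one readout file to the appropriate daily file in a per-year
# subdirectory, then move the original to the cache as a backup.
sort_file() {
    f=$1; destdir=$2; cachedir=$3
    name=$(basename "$f" .ws2350)   # e.g. 20150824175520
    year=${name%??????????}         # first 4 characters: 2015
    day=${name%??????}              # first 8 characters: 20150824
    mkdir -p "$destdir/$year"
    cat "$f" >> "$destdir/$year/$day.dat"
    mv "$f" "$cachedir"
}
```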
I call this script every 2 minutes using the following crontab entry:
*/2 * * * * flock -n /home/ws/pipe.lock -c '/home/ws/pipe.sh /home/ws/conf.sh' >> /home/ws/pipe.log 2>&1
The use of flock ensures that no two instances of the script can run simultaneously, i.e. if one job doesn't finish within 2 minutes, it won't be started again 2 minutes later. Note that the output of the script is redirected to a log file called pipe.log. Currently it has to be deleted manually once in a while to prevent it from growing too large. I use a second crontab entry for deleting files older than 4 weeks from the cache directory:
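The effect of flock -n is easy to demonstrate on its own (the lock file path here is just for illustration): while the first invocation holds the lock, a second one fails immediately with exit status 1 instead of queueing up behind it.

```shell
# Hold the lock for 2 seconds in the background...
flock -n /tmp/flock-demo.lock -c 'sleep 2' &
sleep 1
# ...then try to grab it again: with -n this fails right away.
flock -n /tmp/flock-demo.lock -c 'echo should not run'
echo "second attempt exit status: $?"   # prints 1
wait
```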
0 * * * * /usr/bin/find /home/ws/data/cache/* -maxdepth 0 -type f -mtime +28 -delete
After all these steps, the most recent data can be checked e.g. via
tail -f data/wsrecent.dat