[Rack] door still is not working - too many baron.py instances!

Jake jake at spaz.org
Tue Jul 30 23:56:28 UTC 2013


this afternoon (tuesday 4:30pm) the payphone was unresponsive.  I logged 
into minotaur and was greeted with

Last login: Mon Jul 29 22:52:24 2013 from 172.30.1.30
-bash: fork: Cannot allocate memory
jake at minotaur:~$

any command i tried failed with the same error, so i told it to reboot.

after the reboot, i see that there are four instances of baron.py - two 
for upstairs and two for downstairs.  that is a problem.
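
a rough way to spot the duplicates, as a sketch only: the function below counts baron.py command lines taken from something like `ps -eo args`. the helper names and the "upstairs"/"downstairs" arguments in the usage note are made up for illustration; the real baron.py command line may look different.

```python
from collections import Counter


def count_baron_instances(ps_lines):
    """Count baron.py processes, grouped by their full command line.

    ps_lines is an iterable of command-line strings, for example the
    output lines of `ps -eo args` (an assumption about how they are gathered).
    """
    counts = Counter()
    for line in ps_lines:
        args = line.split()
        # Match any argument that names the baron.py script.
        if any(arg.endswith("baron.py") for arg in args):
            counts[" ".join(args)] += 1
    return counts


def duplicated(counts):
    """Return only the command lines that are running more than once."""
    return {cmd: n for cmd, n in counts.items() if n > 1}
```

feeding it two identical "python baron.py upstairs" lines and one "python baron.py downstairs" line would report only the upstairs command as duplicated.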

help!


On Tue, 30 Jul 2013, Jonathan Lassoff wrote:

> OK -- I think I've kicked it into shape. pidfiles for baron are now created by the baron process and are in /var/run/baron.
> Hopefully that should fix things a bit.
> 
> Happy hacking,
> jof
> 
> 
> On Tue, Jul 30, 2013 at 12:37 AM, Jonathan Lassoff <jof at thejof.com> wrote:
>       Hrm... looking.
> There should be two baron instances (upstairs and down), but it looks like there are a shitload of downstairs instances running. The /var/log/monit.log tells the story:
> 
> [PDT Jul 30 00:24:53] error    : Alert handler failed, retry scheduled for next cycle
> [PDT Jul 30 00:24:53] error    : 'baron_downstairs' process is not running
> [PDT Jul 30 00:24:53] info     : 'baron_downstairs' trying to restart
> [PDT Jul 30 00:24:53] info     : 'baron_downstairs' start: /bin/su
> [PDT Jul 30 00:25:24] error    : 'baron_downstairs' failed to start
> 
> 
> Looks like monit is targeting the process to monitor by the pidfile, however it looks like the downstairs instance isn't creating a pidfile.
> It should be writing to /var/run/baron_downstairs.pid, but the file doesn't exist, and /var/run is only writable by root, so I would guess that it's just failing to create that file (the daemon runs as "baron").
> 
> I'm going to create a /var/run/baron directory for pidfiles to land, but another cromulent fix would be to use start-stop-daemon as root and have that create the pidfile as it forks the "baron"-owned process.
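
The pidfile approach described above could be sketched roughly as follows. This is not baron's actual code: the function names are hypothetical, it assumes Python 3 (for `os.replace`), and it assumes the directory passed in (for example /var/run/baron) already exists and is writable by the user the daemon runs as.

```python
import os


def write_pidfile(directory, name):
    """Write this process's PID to <directory>/<name>.pid and return the path.

    The file is written to a temporary name first and then renamed, so a
    monitor never reads a half-written pidfile.
    """
    path = os.path.join(directory, name + ".pid")
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write("%d\n" % os.getpid())
    os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    return path


def pidfile_is_live(path):
    """Return True if the pidfile exists and names a running process."""
    try:
        with open(path) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or malformed pidfile.
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

If the pidfile is missing or unwritable, a monitor that checks by pidfile sees the process as not running and starts another copy, which fits the pile-up of downstairs instances in the log above.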
> 
> Update in a minute....
> 
> 
> On Tue, Jul 30, 2013 at 12:25 AM, Jake <jake at spaz.org> wrote:
>       but it only works upstairs, not on the payphone.  i'm sure it's because there are multiple baron.py instances running on it. why does that happen?
>
>       i'm on bart i cant log into minotaur
>
>       On Tue, 30 Jul 2013, Jonathan Lassoff wrote:
>
>       Fuck yeah! Phew. Thanks for bringing this to my attention.
> 
>
>       On Tue, Jul 30, 2013 at 12:06 AM, Jake <jake at spaz.org> wrote:
>             yay it fucking works!
>
>             On Tue, 30 Jul 2013, Jonathan Lassoff wrote:
>
>                   I unloaded and re-probed the device (rebooting the controller on
>                   it), and now it seems to be talking again. Weirdness!
> 
>
>                   On Mon, Jul 29, 2013 at 11:52 PM, Jonathan Lassoff <jof at thejof.com>
>                   wrote:
>                         I fixed it, but now any parallel port-related ioctls are not
>                   working on the device.
>                   Something is really weird with that USB Parallel port adapter right
>                   now. Could we try moving it to another port, making sure the wires
>                   aren't touching anything weird, or try
>                   unplugging/replugging it?
>                    
>                   It'd be really cool if we could make some proper interface hardware
>                   sometime.
>                   What do you think about making a little arduino with some
>                   optoisolators that we can use to talk serial to?
>
>                   --j
> 
>
>                   On Mon, Jul 29, 2013 at 11:49 PM, Jonathan Lassoff <jof at thejof.com>
>                   wrote:
>                         So, I can see that gateman is failing to start because
>                   /dev/parport0 is failing to get created and owned/writable by the
>                   "lp" group. This is the device node gateman opens to reach the parallel port.
>                   That device seems to be getting repeatedly re-probed or something.
>
>                   Manually fixing up...
> 
>
>                   On Mon, Jul 29, 2013 at 11:48 PM, Jake <jake at spaz.org> wrote:
>                         i and sharp are here too, sharp is much sharper than i am
>                   though and he is digging down to the root of the problem.
>
>                         but we have to leave on the last bart.
>
>                         On Mon, 29 Jul 2013, Jonathan Lassoff wrote:
>
>                               Eep! I'm bummed that this is broken.
>                               I'm poking around on minotaur to try and see what's up.
>                   Those errors from api are because it couldn't check the gate ringing
>                   status from gateman.
>
>                               --j
> 
>
>                               On Mon, Jul 29, 2013 at 11:42 PM, Jake <jake at spaz.org>
>                   wrote:
>                                     help!!!!  it's not working!!!
>
>                                     me and sharp are trying to fix it but.. we need
>                   you jof
>
>                                     this gateman shit is not working at all.  we tried
>                   the mknod thing and...
>
>                                     help!  the door can't be opened!
>
>                                     -jake
>                                     415-533-3699
>                                     (leaving by midnight)
>
>                                     On Sat, 6 Jul 2013, Jonathan Lassoff wrote:
>
>                                           Ok, I did some security, packaging, and
>                   general FIXME work on gateman,
>                                           and have it running now.
>
>                                           We may need to run this if the box reboots.
>                                           I'm still not sure about fixing udev the right way.
>
>                                            sudo mknod /dev/parport0 c 99 0
>                                            sudo service gateman status || sudo service gateman start
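
Before running the mknod by hand, the state of the device node can be checked with a short sketch like the one below. The function name is hypothetical; the path /dev/parport0 and the numbers (99, 0) are the ones quoted in this thread.

```python
import os
import stat


def is_char_device(path, major, minor):
    """Return True if path is a character device with the given major/minor."""
    try:
        st = os.stat(path)
    except OSError:
        # The node does not exist or cannot be examined.
        return False
    if not stat.S_ISCHR(st.st_mode):
        return False
    return (os.major(st.st_rdev), os.minor(st.st_rdev)) == (major, minor)
```

Calling `is_char_device("/dev/parport0", 99, 0)` after a reboot would show whether the node needs to be recreated before gateman is started.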
>
>                                           On Sat, Jul 6, 2013 at 3:56 PM, Jonathan
>                   Lassoff <jof at thejof.com> wrote:
>                                                 So, I figured out that /dev/usb/lp0 is
>                   actually a line printer emulation device.
>                                                 /dev/parport0 was still what I wanted
>                   (99, 0), but udev wasn't creating it.
>
>                                                 I manually did a mknod for it, but now
>                   just need to figure out how to
>                                                 get it to survive reboots.
>
>                                                 --j
>
>                                                 On Sat, Jul 6, 2013 at 2:12 PM,
>                   Jonathan Lassoff <jof at thejof.com> wrote:
>                                                       On Sat, Jul 6, 2013 at 3:17 AM,
>                   Jake <jake at spaz.org> wrote:
>                                                             we have been spamming the
>                   discuss list with this stuff.  no more.
> 
>
>                                                       Oh... yeah, I suppose rack@ is
>                   more appropriate.
>                                                       However, I think we're more
>                   on-topic in this thread than most of the
>                                                       other threads :p
>
>                                                             it may be an issue of the
>                   old hardware having a "real" parallel port that we
>                                                             never used, because we
>                   didn't have a connector for it, and the new hardware
>                                                             doesn't have a real
>                   parallel port.
> 
>
>                                                       Certainly possible.
>                                                       However, since it's the USB emulator in both cases,
>                                                       I'm unsure exactly what is breaking.
>
>                                                             make sure you're talking
>                   to the right device, which is a USB parallel port
>                                                             which is tied to the
>                   sprinkler pipe just above the wall-o-tubes.
>
>                                                             i guess you're looking at
>                   /dev/usb... so it should be the right device.
>
>                                                             it must be something about
>                   the new OS install, different drivers for the usb
>                                                             parallel port or something
>                   fucked up like that.  good luck.
> 
>
>                                                       I think I'm talking to the right
>                   device. Both kernels used the uss720
>                                                       driver, however I was getting at
>                   it as /dev/parport0 (hard-coded into
>                                                       gateman :( ), whereas now there
>                   seems to be a device at /dev/usb/lp0.
>
>                                                       It's major number 180, minor 0,
>                   which the device listing
>                                                      
>                   (http://www.mjmwired.net/kernel/Documentation/devices.txt#2531)
>                   shows
>                                                       as 0 = /dev/usb/lp0 First USB
>                   printer.
>
>                                                       I wonder if I'm trying to
>                   bit-bang to something that is supposed to be
>                                                       a printer, and that is getting
>                   in the way.
>
>                                                       I'll keep hacking and
>                   experimenting. Any kernel hackers looking for a
>                                                       short exploration are encouraged
>                   to get in touch.
>
>                                                       Cheers,
>                                                       jof