Update monitoring under Debian Linux

I’ve recently been migrating some of my servers at work from SuSE Linux over to Debian based distributions.  We can’t do this in every instance because enterprise vendor support for Debian doesn’t exist for every product we use.  We like migrating over to Debian for the following reasons:

  • Security patches are freely available and frequent.  One complaint I’ve had with Novell is the fact that they charge a yearly fee for product updates.  Many of those updates are the same updates that are free to everyone not using Novell’s products.
  • Debian, for a community-based distribution, has a very slow release turnover.  As sysadmins, we already spend enough of our time trying to keep up with the product and security patches that are necessary for the day-to-day usage of the systems, so we don’t want to invest additional time in upgrading base operating systems if we can avoid it.

In addition to the above reasons, I’ve also discovered that it’s pretty easy to get Debian to automatically inform me when I need to download/install updates.  For those of you who are running X windows/KDE/Gnome, you already know that there are desktop utilities that will do the same thing.  However, in a server environment,  you don’t necessarily have the luxury of running a full GUI desktop, so command line solutions are the only tools available.

This brings me to the point of my post.  I wanted to have a cron job give me a report of what updates and security patches were available to be installed on my system.  I further wanted the server to give me this information in an email message so that I don’t have to log in to the server on a regular basis just to find out if updates are available (this is what I like to refer to as babysitting the server and I just don’t have the time or patience for it).  As it turns out, this is fairly easy to do with a bit of python code, but the details aren’t widely known.

In Ubuntu based distributions when you open an ssh connection to the server, it will happily display the following:

15 packages can be updated.
18 updates are security updates.

What’s curious here is that the system is obviously running some program in the background to update /etc/motd with the information I want to have in my report, but further searches online failed to identify what mechanism is responsible for the update.  After digging around on the web for an hour trying to figure this out, I eventually downloaded the source code for the update-notifier package which is the code base for the graphical desktop widget that prompts the user to install updates.

The source code for that package allowed me to eventually figure out that /usr/lib/update-notifier/apt-check was being run in the background.  This little python script spits out a very simple pair of numbers.  The first number is the total number of packages that can be upgraded and the second number is the number of security patches available.  Further reading of the script showed that it took another command  line option which would cause it to list each of the packages that needed to be upgraded.
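As a rough sketch of how that pair of numbers could be picked up from another script (written in Python 3, unlike the Python 2 listing further down): the helper names parse_apt_check and run_apt_check are made up for illustration, and the sketch assumes the counts arrive on stderr as a single "upgrades;security" string, which is what the script's final sys.stderr.write() suggests.

```python
import subprocess

APT_CHECK = "/usr/lib/update-notifier/apt-check"  # path named in the post


def parse_apt_check(output):
    """Split the 'upgrades;security' pair into two integers."""
    upgrades, security = output.strip().split(";")
    return int(upgrades), int(security)


def run_apt_check(path=APT_CHECK):
    """Run apt-check and return (upgrades, security_updates).

    Assumption: the counts are written to stderr, as in the script's
    final sys.stderr.write(); stdout is ignored.
    """
    result = subprocess.run([path], capture_output=True, text=True, check=True)
    return parse_apt_check(result.stderr)


if __name__ == "__main__":
    # Parse a sample string rather than calling apt-check, so this runs anywhere
    print(parse_apt_check("15;18"))
```

On a machine that actually has update-notifier installed, run_apt_check() would replace the hard-coded sample string.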

Hmm.  I’ve never really written python before, but how hard could it be?  So after some fiddling with the script, I ended up with the following modified version of apt-check:

#!/usr/bin/python

#nice apt-get -s -o Debug::NoLocking=true upgrade | grep ^Inst

import apt_pkg
import os
import sys
from optparse import OptionParser
from gettext import gettext as _
import gettext
SYNAPTIC_PINFILE = "/var/lib/synaptic/preferences"

def clean(cache,depcache):
    # mvo: looping is too inefficient with the new auto-mark code
    #for pkg in cache.Packages:
    #    depcache.MarkKeep(pkg)
    depcache.Init()

def saveDistUpgrade(cache,depcache):
    """ this functions mimics a upgrade but will never remove anything """
    depcache.Upgrade(True)
    if depcache.DelCount > 0:
        clean(cache,depcache)
    depcache.Upgrade()

def _handleException(type, value, tb):
    sys.stderr.write("E: "+ "Unknown Error: '%s' (%s)" % (type,value))
    sys.exit(-1)

# -------------------- main ---------------------

# be nice
os.nice(19)

# setup a exception handler to make sure that uncaught stuff goes
# to the notifier
sys.excepthook = _handleException

# gettext
APP="update-notifier"
DIR="/usr/share/locale"
gettext.bindtextdomain(APP, DIR)
gettext.textdomain(APP)

# check arguments
parser = OptionParser()
parser.add_option("-p",
                  "--package-names",
                  action="store_true",
                  dest="show_package_names",
                  help="show the packages that are going to be installed/upgraded")
parser.add_option("-r",
                  "--report",
                  action="store_true",
                  dest="create_report",
                  help="Report on packages that need upgrading and summarize results")
(options, args) = parser.parse_args()
#print options.security_only

# init
apt_pkg.init()

# get caches
try:
    cache = apt_pkg.GetCache()
except SystemError, e:
    sys.stderr.write("E: "+ _("Error: Opening the cache (%s)") % e)
    sys.exit(-1)
depcache = apt_pkg.GetDepCache(cache)

# read the pin files
depcache.ReadPinFile()
# read the synaptic pins too
if os.path.exists(SYNAPTIC_PINFILE):
    depcache.ReadPinFile(SYNAPTIC_PINFILE)

# init the depcache
depcache.Init()

if depcache.BrokenCount > 0:
    sys.stderr.write("E: "+ _("Error: BrokenCount > 0"))
    sys.exit(-1)

# do the upgrade (not dist-upgrade!)
try:
    saveDistUpgrade(cache,depcache)
except SystemError, e:
    sys.stderr.write("E: "+ _("Error: Marking the upgrade (%s)") % e)
    sys.exit(-1)

# check for upgrade packages, we need to do it this way
# because of ubuntu #7907
upgrades = 0
security_updates = 0
for pkg in cache.Packages:
    if depcache.MarkedInstall(pkg) or depcache.MarkedUpgrade(pkg):
        # check if this is really a upgrade or a false positive
        # (workaround for ubuntu #7907)
        if depcache.GetCandidateVer(pkg) != pkg.CurrentVer:
                upgrades = upgrades + 1
                ver = depcache.GetCandidateVer(pkg)
                for (file, index) in ver.FileList:
                    if (file.Archive.endswith("-security") and
                        file.Origin == "Ubuntu"):
                        security_updates += 1

# print the number of upgrades
if options.show_package_names:
    pkgs = filter(lambda pkg: depcache.MarkedInstall(pkg) or depcache.MarkedUpgrade(pkg), cache.Packages)
    sys.stderr.write("\n".join(map(lambda p: p.Name, pkgs)))
elif options.create_report:
    sys.stderr.write("\n===========================================\n")
    sys.stderr.write("Packages to be upgraded:\n    ")
    pkgs = filter(lambda pkg: depcache.MarkedInstall(pkg) or depcache.MarkedUpgrade(pkg), cache.Packages)
    sys.stderr.write("\n    ".join(map(lambda p: p.Name, pkgs)))
    sys.stderr.write("\n===========================================\n")
    sys.stderr.write("packages needing upgrades: %s\n" % (upgrades))
    sys.stderr.write("security updates: %s\n" % (security_updates))
else:
    # print the number of regular upgrades and the number of
    # security upgrades
    sys.stderr.write("%s;%s" % (upgrades,security_updates))

sys.exit(0)

The above script isn’t quite finished yet.  The report it gives me has some visual artifacts that could be eliminated from the resulting email.  Here’s a sample report:

Reading package lists... 0%

Reading package lists... 100%

Reading package lists... Done

Building dependency tree... 0%

Building dependency tree... 0%

Building dependency tree... 1%

Building dependency tree... 50%

Building dependency tree... 50%

Building dependency tree

Reading state information... 0%

Reading state information... 0%

Reading state information... Done

===========================================
Packages to be upgraded:
 libssl0.9.8
 openssl
===========================================
packages needing upgrades: 2
security updates: 0

For the moment, I’ll leave the finishing touches as an exercise for the reader.  I’ll make the changes when I have more free time in my schedule, but for the time being the script does the bare minimum that I need.
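For anyone who wants a head start on that exercise, here is a minimal Python 3 sketch of one way to strip the progress chatter out of the report and wrap what is left in an email.  The function names, the subject line and the addresses are placeholders I made up; the only thing taken from the sample above is the set of line prefixes to drop.

```python
from email.message import EmailMessage

# Progress chatter seen in the sample report
NOISE_PREFIXES = (
    "Reading package lists",
    "Building dependency tree",
    "Reading state information",
)


def clean_report(raw):
    """Drop apt progress lines and the blank lines they leave behind."""
    kept = []
    for line in raw.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(NOISE_PREFIXES):
            continue
        kept.append(line.rstrip())
    return "\n".join(kept)


def build_message(report, sender, recipient, hostname):
    """Wrap the cleaned report in an email message (addresses are placeholders)."""
    msg = EmailMessage()
    msg["Subject"] = "Pending updates on %s" % hostname
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(report)
    return msg
```

Sending the result could be as simple as smtplib.SMTP("localhost").send_message(msg), assuming the server runs a local mail transfer agent.  If the script is started from cron with MAILTO set, cron will mail whatever the job prints, so clean_report() alone may be enough.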

Another update…

You may have noticed that I finally got around to porting the site over to WordPress.  It took as long as it did because life has been rather busy for me and spending time on this isn’t really a necessity.

I’m hoping the comment system is a bit more robust than it was with BlogEngine.Net.  Feel free to post comments – I will do my best to keep up with what everyone is saying.  However, I’m still sticking with no links in the comments for the time being.  I would much rather write interesting posts here than act as a gatekeeper of the comments, so let’s hope it works better this time around.  I basically gave up on the old blog because it was time consuming to keep up with the rather ineffective comment system.

For those of you who may think I’ve sold out and gone with the masses on WordPress, I can honestly say that it would be very difficult for me to write an original blog system that has the flexibility and features of WordPress.  Ultimately, there’s a reason it’s so popular and I really can’t complain at all.   Customizing the site is very easy and it just works.

Here’s to hoping I can stop writing meta-posts about blog software and get to something that’s a lot more interesting.

Blog update…

As some of you may have noticed, I haven’t posted in a couple of weeks.  Part of the reason for that is I only want to write things here that are on topic and interesting to you.  A lot of blogs I read have a tendency to get off topic or rant about personal topics which degrades the quality of the site content and I don’t want to do that.  I would rather write one post per month than set a goal to write 7 articles per week that have no value (and yes, some blogs such as the gawker family of blogs do just that – providing quantity rather than quality).

The other piece of it is that it takes time to research hosting providers and implement a blog platform with that provider.  I can’t say I’m terribly impressed with BlogEngine.net (which is what’s used currently on this site), but it works for the time being.  More specifically, there are a lot of problems surrounding IIS’s latest implementation and not all of the blog software out there has been updated to be compatible with the newer, more modular IIS and it can take a fair bit of effort to get things working.  After installing the software, I’ll play around with it for a bit and see if I like it, but in a lot of cases it’s still a bit of work to customize it to my liking.   Even then I might decide it doesn’t measure up and then I end up looking for something else…

I’m also aware that some parts of the site aren’t working right – like the RSS feed and the email functions.  I’ll take a look at that in the near future and see if I can get it figured out, but I can’t say it’s high priority at the moment since this was intended to be temporary.  In the meantime, feel free to post a comment – I do read them from time to time.

Update: I’ve set comments to moderated.  There are a lot of posts coming from one subnet and most of them don’t make a lot of sense to me.  I don’t mean to offend anyone, but either your English is so bad that it doesn’t make sense or your post likely has some ulterior motive.  In any case, the comments aren’t helping the quality of the site content, so I felt I had to put an end to it.  Apologies if it really is just a language barrier, though.

Update 2: Ok.  If you want your comment posted, do not put in a website or URL.  After doing some reading about comment spam and then doing some further research into advertising practices, I think I’ve figured out why the comments on my site are so bizarre.  So – my first feature modification will be to eliminate URL posting and that includes the Website field on the comments.  Any comment posted with a URL in the body of the comment or the website field will not be approved.  Sorry, everyone, but that’s the ridiculous world we live in.  A few bad apples really can ruin it for everyone, unfortunately.

Serial Debugging without a UART

I sometimes have projects using AVR microcontrollers – specifically, I use the ATtiny24 quite a lot since it has a fair number of IO lines and has most of the basic features you would expect from an AVR.   The only problem I have with this chip is that it doesn’t have a full UART on it.  I’ve looked through the datasheet at the Universal Serial Interface (USI) it has, but it doesn’t strike me as being useful for RS-232 communications.  Besides that, if you have the code I’m going to show you in this post, you can do RS-232 output on any unused pin without the need for a UART, which in my opinion is a more flexible solution.

(Yes, you can use JTAG if you have the tools for it, but it’s not ideal when all you really need is simple feedback to know what/how your program is behaving.  I actually have the JTAG ICE mkII and I mostly just use it as a programmer if I’m writing my program in C.)

To do this, you need to know about two basic approaches to structuring a program for an embedded system.  First, we’ll be implementing an interrupt driven scheduler that uses a timer.  The timer helps to ensure that the serial output line is changing at the correct rate so that the remote system can understand the data being sent.  In this example, we’ll be trying to communicate at 9600 baud, so we’ll need to have the timer interrupt occur every 0.104 ms (1/9600 s ≈ 0.104 ms).  Each time the interrupt occurs, we’ll change the state of the output pin to signal the next bit to be transmitted.
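The timing arithmetic can be checked on a PC before touching the chip.  This is only a sketch in Python: it assumes the 8 MHz clock and the un-prescaled 16-bit timer used in the listing below, and overhead_cycles is a hypothetical fudge factor of mine for interrupt latency, not something measured.

```python
F_CPU = 8_000_000   # Hz, internal oscillator speed used in this post
BAUD = 9600


def bit_period_us(baud):
    """Duration of one bit in microseconds."""
    return 1_000_000 / baud


def timer_reload(f_cpu, baud, overhead_cycles=0):
    """Start value for a 16-bit up-counter that overflows once per bit.

    The overflow happens when the counter rolls over from 0xFFFF to 0,
    so it takes (0x10000 - start) ticks.  overhead_cycles shortens the
    count to allow for time lost before the counter is reloaded.
    """
    ticks_per_bit = round(f_cpu / baud)
    return 0x10000 - (ticks_per_bit - overhead_cycles)


print(round(bit_period_us(BAUD), 1))         # 104.2
print(hex(timer_reload(F_CPU, BAUD)))        # 0xfcbf
print(hex(timer_reload(F_CPU, BAUD, 65)))    # 0xfd00
```

The ideal value works out to 0xFCBF.  The 0xFD00 used in the listing would correspond to allowing roughly 65 clock cycles of overhead, which fits with the "some tweaking was needed" remark in the code comments.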

The second concept we’ll need to implement is a state machine.  Basically, we need to keep track of where we are in the current byte to be sent as well as where we are in the output string.  As each bit is sent, we need to determine if the current byte is  just beginning and we need to send a space (start) bit or if the current byte is ended and we need to send a mark (stop) bit.  (See Wikipedia for an in-depth discussion of the RS-232 protocol.)  The state machine is built around the use of a few variables and control logic so that the information sent complies with the protocol.  Here is the code:

/* Headers needed for this module */
#include <stdio.h>
#include <string.h>
#include <inttypes.h>
#include <avr/interrupt.h>
#include <avr/pgmspace.h>
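volatile char msg[32]; // message buffer
// Sample message stored in flash (PROGMEM) - note how the string is null
// terminated so it can be copied as a C string
const char testmsg[] PROGMEM = "Testing Serial IO!\r\n";
void do_serial_comm(void); // forward declaration - called from the ISR below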
// This line is our output for serial data
#define SEROUT PB2
// Scheduler interrupt frequency.  This directly affects the baud rate
//  for serial communications.  Do not change this.
// This was approximately calculated as follows (some tweaking was needed):
//    ((0xFFFF) - ((1 / BAUD) / (1 / F_CPU)))
#define TCNT1_CNT  0xFD00
void init_timer(void)
{
 TCCR1A &= ~(1<<COM1A1) & ~(1<<COM1A0) & ~(1<<COM1B1) & ~(1<<COM1B0) &
    ~(1<<WGM11) & ~(1<<WGM10);
 TCCR1B &= ~(1<<ICNC1) & ~(1<<WGM13) & ~(1<<WGM12) &
    ~(1<<CS12) & ~(1<<CS11);
 TCCR1B |= (1<<CS10);
 // WGM13:0 = 0 for normal mode
 TIMSK1 |= (1<<TOIE1);  // interrupt on overflow
 TCNT1 = TCNT1_CNT;
}
ISR(TIM1_OVF_vect)  // TimerCounter1 overflow interrupt handler
{
 /*  Some of the frequency calculations may need to be tweaked to make
  it so that the routines execute with an acceptable amount of error.
  Since we are bit banging serial communication, the amount of error
  we can have before the serial stream turns to garbage is fairly
  small */
 TCNT1 = TCNT1_CNT; // reset counter for next time
 // This must be done first because the timing is critical
 do_serial_comm();
 // ... do other things if need be....
}
void do_serial_comm(void)
{
 /* State machine to output serial data.  This function must be
  called with the proper frequency to output serial data at the correct
  rate.  If the rate varies slightly, the baud rate won't be stable
  enough to ensure the data can be understood by the remote host.
 */
 static volatile uint8_t bitPos = 8;
 static volatile uint8_t charPos = 0;
 static volatile uint8_t needStopStart = 0;
 static volatile char curChar;
 /* The order of the following blocks is critical.  For this
  state machine to work correctly, we have to do something
  with the transmit line every time this function is called.
  The operation to be performed depends on where we are in
  current character and the message itself.
 */
 // send serial data
 if (needStopStart > 1)
 {
  PORTB |= (1<<SEROUT);    // logical mark output
  needStopStart--;
 }
 else if (needStopStart > 0)
 {
  PORTB &= ~(1<<SEROUT);   // output a start bit
  needStopStart--;
 }
 else if (bitPos < 8)        // LSB first
 {
  if ((curChar & (1<<bitPos++)) == 0) // bit is 0
   PORTB &= ~(1<<SEROUT);
  else       // bit is 1
   PORTB |= (1<<SEROUT);  
 }
 else if (charPos < strlen((const char *)msg))
 {       // move to the next byte
  curChar = msg[charPos++];
  bitPos = 0;
  needStopStart = 3;    // Sends 2 idle bits + 1 start bit
  PORTB |= (1<<SEROUT);   // Idle the line (also the stop bit of the previous byte)
 }
 else  // only get here if bitPos is 8 and charPos == strlen(msg)
 { // end of string
  msg[0] = '\0';          // blank the message
  bitPos = 8;
  charPos = 0;
  needStopStart = 0;
  PORTB |= (1<<SEROUT);   // Idle the line
 }
}
/* Code */
int main(void)
{
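 DDRB |= (1<<SEROUT);  // make the serial pin an output
 PORTB |= (1<<SEROUT); // start with the line idle (mark)
 init_timer();
 strcpy_P((char *)msg, testmsg); // copy the message from flash (PROGMEM) to RAM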
 sei();    // enable interrupts
 
 for (;;) {}  // loop
}

Some of the above code may look confusing, but it works quite well.  Our CPU is configured to run at 8 MHz from the internal oscillator.  Since that timing source is going to vary from chip to chip, some tweaking of the counter overflow value may be needed to get the timing of the serial line changes to work correctly.

In the do_serial_comm() call, the needStopStart value is used to output the necessary stop, idle, and start bits that have to occur in between each byte.  Since the sequence in between each character is always the same – a stop bit, two idle bits, and a start bit – we can simply decrement needStopStart each time the function is called to get the correct output.  When we are sending character bits, we simply test the current bit to see if it’s a 0 or a 1 and then send the proper logic level on the output pin.
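To make the bit order concrete, here is a small Python model of what ends up on the wire for each byte.  It is a sketch rather than a port of the C code: the function names are mine, idle_bits is a parameter I added to model the extra mark periods, and the model appends the stop and idle bits after each byte whereas the state machine emits them just before the next start bit.

```python
def frame_byte(value, idle_bits=2):
    """Return the line levels (1 = mark, 0 = space) for one byte.

    Order: start bit, eight data bits LSB first, stop bit, then
    idle_bits extra mark periods before the next start bit.
    """
    data = [(value >> i) & 1 for i in range(8)]
    return [0] + data + [1] + [1] * idle_bits


def frame_message(text):
    """Concatenate the frames for every character of an ASCII string."""
    levels = []
    for ch in text.encode("ascii"):
        levels.extend(frame_byte(ch))
    return levels


# 'T' is 0x54 = 0b01010100, so the data bits go out as 0,0,1,0,1,0,1,0
print(frame_byte(ord("T"), idle_bits=0))
```

Comparing a list like this against a logic analyzer or scope capture is a quick way to see whether a garbled character is a framing problem or a timing problem.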

Once the whole message has been sent, the message buffer is set to a blank string and the remaining state machine variables are reset.  The rest of the program can then test the length of the message buffer to figure out if the current message has been completely sent before placing a new message into the buffer.  Testing for a blank message buffer has to be done each time the program wants to send a new message.  If a new message is put into the buffer before the previous message has been completely sent, the output may be corrupted.

This technique works best if your program has a lot of time in between status updates.  In my case, I used this code in a NiCd battery charger I built and I wanted the charger to be able to log the battery voltage on the computer once per second.  So long as the message is short and the check for the battery voltage isn’t happening too quickly, this method is very reliable.  If, on the other hand, you’re doing something 1000 times per second, you’ll find this to be very limiting because you simply can’t send messages that fast at this baud rate.  Raising the baud rate only helps so much, since the amount of acceptable timing error is reduced as you try to communicate at faster speeds.
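A quick back-of-the-envelope calculation shows where that limit sits.  The Python sketch below assumes 12 bit periods per character (start bit, 8 data bits, stop bit and 2 idle bits, as the state machine above sends them) and ignores the single extra tick spent at the end of each message; the function names are mine.

```python
def chars_per_second(baud, bits_per_char=12):
    """Upper bound on characters sent per second."""
    return baud / bits_per_char


def message_time_ms(message, baud, bits_per_char=12):
    """Approximate time needed to clock out one message, in milliseconds."""
    return len(message) * bits_per_char * 1000 / baud


print(chars_per_second(9600))                            # 800.0
print(message_time_ms("Testing Serial IO!\r\n", 9600))   # 25.0
```

So the 20 character test message ties up the line for about 25 ms, which is fine once per second but leaves no room for 1000 updates per second.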

Anyway – I hope this information was helpful.  If you spot an error, please let me know in the comments.

Undeleting a recording in mythtv and mysql wizardry

MySql is a registered trademark of Sun Microsystems, Inc.

Disclaimer: This documentation is not official or authorized in any way by any third parties.  Use the information at your own risk.

I’ve been lucky enough to be able to live without an undelete feature on my computer for years.  For the most part, I haven’t had many problems – usually I have a backup copy of the file I can work with.  In this case, I had a backup, but the problem was it wasn’t just a file that needed to be restored.  I was trying to delete a recording in my mythtv system and I managed to hit the arrow key at the same time I hit the menu key.  Without realizing it, I had deleted a recording from the system that I hadn’t watched yet.  If that wasn’t bad enough, I also don’t have the system set up in such a way that I can undelete any recordings.  Since I have the actual video file in a backup location and since I have running backups of the mysql database, I decided to take a quick look at how I might be able to recover the recording.

Putting the recording back into the storage group is by far the easiest part of the process.  The much larger problem I had to deal with is the database and I actually expected the task at hand to be a lot harder than it turned out to be.  What I wasn’t prepared for, however, is just how much data mythtv stores about a single program.  To give an overview, here are some of the more important tables (as of 0.21-fixes):

  • recorded – This table contains the program start time, end time, title, subtitle, description and a few other pieces of information.
  • recordedmarkup – Contains the bookmarks used by the automatic commercial skip feature
  • recordedseek – A huge table containing seek information for each 1/2 second of recorded video.  A one hour program has about 13,000 entries on my particular system.
  • oldrecorded – This table contains much the same information as the recorded table, but it’s only used by the scheduler.
  • recordedprogram – Contains in-depth program details for each recording in the system.
  • recordedrating – Contains rating information for each recorded program
  • recordmatch – A mystery table.  I assume this is used by the scheduler based on what I see of the table’s columns, but the documentation in the wiki doesn’t describe it. (http://www.mythtv.org/wiki/Category:DB_Table)

…and that’s all of the places where data is changed when a program is deleted as far as I could figure out.  The information about the first three tables I got from the backend server source code in mainserver.cpp:DoDeleteInDB().  For my particular situation, I created a new database on another Linux system and restored the database to the state it was in prior to me deleting the program.  I then spent some time comparing the actual database to my restored database to figure out where the changes were.  For each place where I found data had been altered or deleted, I used mysqldump to create a script I could use to bring the data back into my live system.

mysqldump is the reason I’m writing this post.  It’s easy to overlook the command line tools, but they often contain some very powerful features.  In this case, I needed to dump out large amounts of information in a format that could be imported to the other database without having to do a lot of editing of the dump file to get the correct results.  As a rule of thumb, the more you can eliminate human involvement in a process, the fewer mistakes you’ll make along the way – so editing dump files was something to be avoided.

The primary information needed for this is the chanid and the starttime of the program.  Once you have that information, you can use it to select the information you want when you make the dump from the database.  Here is an example:

$ mysqldump --skip-opt --complete-insert --no-create-info --order-by-primary mythconverg --tables recordedseek --where="chanid = 1311 and starttime = '2010-01-14 21:00:00'" -u root -p >>program.sql

In this example, most of the options cause mysqldump to only output insert statements.  I didn’t want or need to drop or create any tables, which could have caused a much larger loss of data.  The --tables parameter tells mysqldump which tables you want in the output file.  The --where parameter allows you to select specific rows from the table so that you don’t get any information you don’t need or want.  This is the key to using mysqldump effectively when you need to do a partial restore of information between two databases.  I repeated the above command for each table in the list above and created a script that could be run on the live system to import the information back into the live database.  Restoring the database information is easy:

$ mysql -u root -p mythconverg <program.sql

After entering a password and waiting a few seconds, I simply restarted mythfrontend and all of the information about the recording showed up in the watch recordings section of mythtv.  Playback is just as I would expect if I hadn’t accidentally deleted the recording, so I’m going to consider this a success.
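If I had to do this again, I would probably script the per-table dumps instead of retyping the command.  Here is a minimal Python sketch of that idea.  It only builds and prints the command lines, the function name is mine, and it assumes every table in the list can be filtered on chanid and starttime, which I have not verified for each of them.

```python
import shlex

# Tables named in the list earlier in this post
TABLES = ["recorded", "recordedmarkup", "recordedseek", "oldrecorded",
          "recordedprogram", "recordedrating", "recordmatch"]


def dump_command(table, chanid, starttime, database="mythconverg", user="root"):
    """Build the mysqldump argument list for one table and one recording."""
    # Assumption: the table has chanid and starttime columns
    where = "chanid = %d and starttime = '%s'" % (chanid, starttime)
    return [
        "mysqldump", "--skip-opt", "--complete-insert", "--no-create-info",
        "--order-by-primary", database, "--tables", table,
        "--where=" + where, "-u", user, "-p",
    ]


for table in TABLES:
    # Print a shell-quoted command; review it before running anything
    print(shlex.join(dump_command(table, 1311, "2010-01-14 21:00:00")),
          ">>program.sql")
```

Printing the commands rather than executing them keeps a human check in the loop, and -p will still prompt for the password on each run.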

SQL Server 2005: Logon Error 18456, State 11

First, I would like to make a request of anyone reading this.  If you post in some forum somewhere about a problem you’ve been experiencing, please take a few moments to also post what the solution was after you’ve gotten your problem figured out.  I spent a good amount of time researching why I was getting the error in the title of this post and while I could easily find posts online describing the issue, it seemed like almost no one took the time to post the solution.  Those of you who do post your solutions, thank you.

This week we’ve been working on installing VMWare VDI/View so that we can do an evaluation of the product.  Part of that installation process involves setting up a SQL Server database for the vSphere and View pieces of the software and though we could have used MSDE or whatever MS is calling it these days, we wanted to mirror a production environment as closely as possible.  This means that we wanted to use SQL Server 2005 as the back end database and use an Active Directory service account to grant access to the database which is the best practice standard in our environment.

After configuring the account in AD and setting group memberships, I proceeded to add the new service account to the database server, create a database for the user, make the service account the database owner, and grant further access to the msdb system database.  All of that seemed to work perfectly until it was time to create a data source so vSphere could access the database.  Suddenly, the service account couldn’t connect and the above error was appearing in the activity log on our SQL Server.

Of course, the above error means that the server was able to verify the account as valid, but had denied the logon attempt.  (Note to the MS SQL Server Product Team: this error is useless without knowing WHY the logon was denied.  Makes me wonder who did the QA on this product…) I spent some time searching online forums for a solution and I actually found one post that mirrored my situation exactly.  The only problem was the original person who posted it didn’t mention if others had helped solve the problem.  Instead, they came to the forum, posted their problem, got some advice, and disappeared into the ether to never be heard from again.  I didn’t want to follow the advice in the thread because it seemed like I’d be going in circles.  After all, I just created the account, added it to the server, and granted rights to the account only moments prior to this.  How could anything be different?

In an attempt to isolate the issue, I added one of my test accounts to the server and made it the owner of the database.  I then went back to the vSphere server, added the test account to the local administrators group, and logged in under the test account.  When I then tested the connection to the database, it worked as expected. Since I couldn’t figure out what was different between the test account and the service account, I ended up deleting the credentials from the SQL Server management console, removing all of the rights I had granted, and then finally adding all of that information back to the server.  Strangely enough, that worked.

I have no explanation as to why that worked as a solution, but next time I run into it, this is the first thing I’m going to try.   Hope it helps.

Web Hosting Providers…

About a week ago, I suddenly discovered that my hosting account was suspended. Apparently at the end of November, my credit card expired and that’s about the time the site was suspended. You might then wonder why it took over a month to get the site back online and the answer to that is my hosting provider didn’t exactly do their due diligence to let me know about the situation. I’m not naming names, but if you’re reading this – you know who you are.  No, it wasn’t GoDaddy – I cancelled my account with my prior hosting provider because we couldn’t come to an agreement.

So – that’s how we got to where we are today. I’m playing around with some different blog software – mostly to see what’s currently out there. I still have my original blog software and all of the information that was on the site so I’m planning on bringing it back at some point. However, in the meantime, it doesn’t hurt to see if there’s something I like better that’s also open source.

One thing to note is that since I really only just started with this software, there are some things that probably won’t work correctly out of the box. I’m sure we’ll get it all ironed out soon enough though…