Convert HTML emails to plain text and trim for SMS text messages

I forward emails from clients to my phone via SMS, i.e. mynumber@txt.att.net. Initially I thought something was wrong on my carrier's side as I was only sporadically receiving them on my phone. However, it turned out these were HTML emails and the SMS gateway will not deliver those. Also, the tendency to reply with very long quotes results in excess texts as the limit is 160 characters per text message. Here is how to fix those two problems using procmail.
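
As a rough illustration of the two transformations described above (drop the HTML, keep one text message's worth of characters), here is a minimal shell filter. It is not the procmail recipe itself: the function name html_to_sms and the sed-based tag stripping are assumptions made for this sketch, and real HTML mail (MIME parts, entities, encoded bodies) needs a proper converter.

```shell
#!/bin/sh
# html_to_sms: hypothetical helper, not the original procmail recipe.
# Reads an HTML fragment on stdin and prints at most 160 characters of text.
html_to_sms() {
    # Join lines first so a tag split across lines is still removed,
    # strip anything that looks like a tag, squeeze runs of spaces,
    # trim the ends, then keep only the first 160 characters (one SMS).
    tr '\n' ' ' \
        | sed -e 's/<[^>]*>//g' -e 's/  */ /g' -e 's/^ //' -e 's/ $//' \
        | cut -c1-160
}

# Example input made up for this sketch
printf '<html><body><p>Server is <b>down</b></p>\n<p>Please call.</p></body></html>\n' | html_to_sms
```

A filter like this could be called from a mail filtering rule before the message is forwarded to the SMS gateway address.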

CPanel PCI Compliance SSL Ciphers

There is a lot of outdated/bad information out there on what to do for proper PCI compliance on CPanel. This is what recently worked for me:
  1. In CPanel go to Main >> Service Configuration >> Apache Configuration >> Global Configuration
  2. Change SSLCipherSuite to ALL:!ADH:RC4+RSA:+HIGH:+MEDIUM:-LOW:-SSLv2:-EXP:!kEDH
  3. For the next 4 settings select the PCI recommended option from the drop down lists, save, rebuild configuration.
If the scan still doesn't pass, CPanel's documentation site lists additional settings to check.

Limit Frontpage shtml.exe cpu usage

One of my clients using CPanel had a customer hogging all the CPU on the server due to his Microsoft Frontpage usage with the shtml.exe process. I wrote this script using cpulimit to limit its cpu usage and ran it via cron every 5 minutes:

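#!/bin/sh
# Start cpulimit for shtml.exe unless a cpulimit process is already running.
PROC='cpulimit'

if ! ps ax | grep -v grep | grep "$PROC" > /dev/null
then
    # ">/dev/null 2>&1" is the portable sh form of bash's "&>/dev/null"
    /usr/local/sbin/cpulimit --exe /usr/local/frontpage/version5.0/exes/_vti_bin/shtml.exe --limit 3 >/dev/null 2>&1
fi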

Using find on Linux

Basic search with find:
find /directory -name 'search term'

Search files in subdirectories of current directory:
find . -name 'search term'

Find large files (-size counts 512-byte blocks by default, so +100000 means roughly 49 MB and up):
find / -type f -size +100000 -exec ls -lh {} \;

Search inside file contents:
find . -type f -print0 | xargs -0 grep --color=auto -i 'search term'
or, to list only the names of matching files:
find . -type f -exec grep -l 'search term' {} +

Find files modified within a certain time period (the last 24 hours in this example):
find . -name 'search term' -mtime -1 -print

Exclude certain directories in search:
find . \( -name 'excluded directory1' -o -name 'excluded directory2' \) -prune -o -type f -name 'search term' -print
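
To see how -prune behaves, here is a small sketch that builds a throwaway directory tree and runs a pruned search against it. The directory names (src, cache, logs), the file name notes.txt and the function name find_excluding are made up for the example.

```shell
#!/bin/sh
# find_excluding DIR: hypothetical helper for this demo.
# Directories named cache or logs are pruned (find does not descend into
# them); everything else is tested against -type f and -name.
find_excluding() {
    find "$1" \( -name cache -o -name logs \) -prune -o -type f -name 'notes.txt' -print
}

# Build a scratch tree with the same file name in three directories
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/cache" "$demo/logs"
touch "$demo/src/notes.txt" "$demo/cache/notes.txt" "$demo/logs/notes.txt"

find_excluding "$demo"   # prints only the copy under src/
rm -r "$demo"
```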

Find files older than a certain age (more than 2 days in this example) & delete:
find . -type f -mtime +2 -exec rm {} \;
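
Since the delete cannot be undone, it can help to list the matches first. A minimal sketch, assuming GNU touch for the backdated test file; list_older_than is a made-up helper name:

```shell
#!/bin/sh
# list_older_than DIR DAYS: hypothetical helper that only prints the files
# a "find DIR -type f -mtime +DAYS -exec rm {} \;" run would remove.
list_older_than() {
    find "$1" -type f -mtime +"$2" -print
}

demo=$(mktemp -d)
touch "$demo/new.log"
touch -d '5 days ago' "$demo/old.log"   # GNU touch syntax (assumption)
list_older_than "$demo" 2               # lists old.log, not new.log
rm -r "$demo"
```

Once the list looks right, swap -print for the -exec rm part.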

Find and replace in files:
find ./* -type f -exec sed -i 's/search term/replacement/g' {} \;

Automated MS SQL database backup using command line

A customer wanted automated backups for their MS SQL database. Unfortunately MS SQL management can be a little tricky since it doesn't have something straightforward like phpMyAdmin for MySQL.
  1. Load SQL Server Management Console or Studio Express
  2. Select New Query
  3. Enter the query:
    backup database databasename to disk='C:\db.bak'
  4. Save query to C:\backupdb.sql
  5. Create a new .bat file by right clicking in Windows Explorer, select New Text Document, then rename to backupdb.bat
  6. Enter the command:
    sqlcmd -S .  -i "C:\backupdb.sql" > "C:\backupdb.log"
  7. Navigate to Control Panel, Scheduled Tasks, Add New Task, and select Command Prompt
  8. Change the task path to backupdb.bat, start path to C:\ and set how often you want to backup

Thanks to Norman for the pointers.

Script to sort & show directory sizes in Linux

Change /www/htdocs/ to the directory whose subdirectory sizes you want to see. The results are sorted by size, largest first, and written to filesize.txt.

#!/bin/bash
du -k /www/htdocs/* | sort -nr | awk '
     BEGIN {
        split("KB,MB,GB,TB", Units, ",");
     }
     {
        u = 1;
        while ($1 >= 1024) {
           $1 = $1 / 1024;
           u += 1
        }
        $1 = sprintf("%.1f %s", $1, Units[u]);
        print $0;
     }
    ' > filesize.txt
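
To check the unit conversion without running du on a real tree, the same awk program can be fed a few hand-written lines. This is only a sketch: the to_human wrapper and the sample sizes and paths are made up for the example.

```shell
#!/bin/sh
# to_human: hypothetical wrapper around the same awk logic. It reads
# "size-in-KB path" lines on stdin (the format du -k prints).
to_human() {
    awk '
        BEGIN { split("KB,MB,GB,TB", Units, ",") }
        {
            u = 1
            # Divide by 1024 until the value fits the current unit
            while ($1 >= 1024) { $1 = $1 / 1024; u += 1 }
            $1 = sprintf("%.1f %s", $1, Units[u])
            print $0
        }'
}

printf '3145728 /www/htdocs/site1\n2048 /www/htdocs/site2\n512 /www/htdocs/site3\n' | to_human
# prints:
# 3.0 GB /www/htdocs/site1
# 2.0 MB /www/htdocs/site2
# 512.0 KB /www/htdocs/site3
```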

Block Advertisements in Gmail Or Anything Else

I highly recommend Gmail if you haven’t checked it out already. The web interface allows you to access your email from anywhere, I rarely get any spam in my Inbox, there’s a ton of free storage space, and email conversations are threaded, making it easier to keep track of messages (i.e. you see a single email rather than 50 “Re: Hello” messages).

Unlike Yahoo Mail, there are no advertisements appended to the emails you send. Instead, Google brings in revenue from their “sponsored ads” displayed on the sidebar of Gmail’s interface. But with a few addons for Firefox, you can completely remove these. The only real downside of Gmail is the security concerns of Google scanning all your emails to display contextual ads. But I think these concerns can apply to anything on the internet. Check out FirePGP for encryption if you are worried.

Google has recently begun to roll out a new version of Gmail that improves speed dramatically, but can break many old addons and scripts, including those that block ads in Gmail. But blocking them yourself is simple, requiring only a few steps.

NOTE: the Better Gmail 2 addon blocks ads and has many other useful features. Only do the below if you don't want to use it.

  1. Install Firefox, a web browser that’s safer and more customizable than M$ Internet Explorer.
  2. Install the Adblock Plus addon. Wait to restart Firefox after step 3.
  3. Install the Adblock Plus Element Hiding Helper addon, which helps select special blocks of content to hide. Restart Firefox after it’s installed.
  4. Subscribe to the EasyList filter, which tells Adblock how to block ads.
  5. Bring up Adblock’s Element Hiding selector by right clicking on the red “ABP” icon and selecting “Select element to hide.” Or press down the shortcut keys CTRL+SHIFT+K.
  6. When the red selector comes up, click the area on the right where the ads are above “Sponsored Links” right below “Print all.”
  7. Click “Add filter rule.”

Simply installing Adblock Plus removes most ads, and the Element Hiding Helper can be used to take out any that remain.

How I Used Drupal to Build Tampa Bay Indymedia

The following is how I used Drupal 5, the popular CMS, to create the website for the Tampa Bay Independent Media Center in 2007. While my goal was a community-based news site, I believe this can be useful for anyone looking to create a dynamic yet simple and clean website. No knowledge of HTML, CSS, PHP or any other markup/programming language is strictly necessary, as I was able to build this entire website by simply configuring Drupal from the graphical administrative interface (besides some small optional hacks for aesthetic purposes).

Installing Drupal is very straightforward. Many hosts even come with Fantastico, which can automatically install Drupal. It often installs an outdated version, however, so I would recommend just installing manually. This can be easily done by checking out this brief how-to (if you don’t want to mess with SSH or don’t have shell access, you can skip step 1 and simply unpack the files on your local computer with WinRAR and upload them with an FTP client to your host).

I use A2hosting.com, which has been around for a while, seems to have decent prices and good support (I have since used the “Walmarts” of hosting, Godaddy and Dreamhost, and found that while A2hosting.com is a few bucks more expensive, it is noticeably better in terms of server response and tech support). I also found this 20% discount coupon, “20PERCENTOFF,” which can be entered at checkout. For Windows, I recommend the free FTP client FileZilla for uploading files to your host and HTML-Kit for editing files once you upload them to your host (to edit files on your host go to Workspace>>Add Folder / FTP Server in HTML-Kit).

A Brief Introduction to Drupal Terminology

Drupal appeared a little overwhelming at first. Not only does it have a myriad of options and potential customizations, but it also uses a lot of proprietary terms. Below I will attempt some clarification, but don’t worry about immediately mastering these:

  • Nodes: a unit of generated content, like a page or post
  • Modules: plugins that are created by the Drupal community and hosted on Drupal’s website
  • Theme: a “skin” that affects the graphical rendering of the site (colors, layout, etc.)
  • Block: a section of the website that is able to contain content; by default there is a navigation sidebar and a top menu
  • Taxonomy: Drupal’s implementation of categories, featuring customizable hierarchy (like sub-categories or mere lists) and symmetry (related categories)

Selecting a Theme

A theme that is simple and well-coded will make future customizations much easier. Since Drupal is continually being updated to fix bugs and add new features, an up to date theme is important as well. Themes are hosted at drupal.org/project/themes and themegarden.org has live previews of many themes. Once you have a selected theme, upload it to /public_html/sites/all/themes/themename on your host.

Recommended Modules

One of the strongest points of Drupal is its customizability. There are hundreds of community-developed modules, extending many features and allowing much flexibility. I would recommend using the latest dev version of modules if available, since they usually include the most recent bug fixes. When you download a module, make sure you check the README.txt so you don’t miss any required installation or configuration instructions (like enabling permissions, see the next section). The modules I am currently using are:

  • Actions: automatically executes an action, like sending an email, when certain conditions are met
  • Administration Menu: rather than using the default navigation menu to access the administrative back end, this creates a small and fast AJAX version of the administration menu located on the top of your site. To enable it with my theme, I had to edit my theme’s page.tpl.php file per the module’s instructions.
  • Block Cache: Drupal has caching built in for anonymous users, but this allows the caching of blocks. I use it for my block that contains javascript to make a collapsible list of all the collectives in the Indymedia network.
  • Calendar: used with the Views and Event module for a calendar of events
  • Content Construction Kit: aka CCK, allows custom content types (like articles and events) and custom fields (like a “source” text box for users to provide the original link from where they got an article)
  • Comment Upload: provides the ability to add attachments in a comment
  • Date: necessary for the Event module
  • Embedded Field: builds off of CCK to allow media like YouTube videos to be embedded
  • Event: creates a custom “event” content type
  • Form Store: used with MyCaptcha to put those annoying tests on custom content types (events and articles) to stop automated spam bots
  • Form Filter: removes unwanted things from forms without hacking code
  • Hidden: rather than deleting unwanted user-submitted content, this “hides” (i.e. moves) content either by filters, user reporting, or manually. Since the content remains accessible via a “Hidden” link in the navigation menu, it’s a good way to ensure transparency and prevent abuse of admin privileges.
  • Javascript Tools: a collection of modules that enable AJAX in various ways, I use it mainly for the AJAX search
  • Location: allows location info to be entered for an event (or any other content type) and then generates a link to a google map
  • Login Menu: adds a Login link on the navigation menu that shows just for users that are not logged in
  • MyCaptcha: see the Form Store module above
  • Override Node Options: creates a permission for a special class of users (defined via roles) to publish a story to the front page, which is disabled for all users except admins by default
  • Pathauto: automatically generates a custom URL alias based on the title of the content, i.e. rather than http://sitename.com/node/234/ a story titled “U.S. Government Crumbles” will be http://sitename.com/article/us-government-crumbles
  • Persistent Login: adds a “Remember me” option
  • Safe HTML: filters posts to make them look nice
  • Search 404: automatically performs a search and redirects to a match rather than display 404 error for pages not found
  • Similar by Terms: creates a “Similar” block below articles
  • Taxonomy Manager: uses AJAX to overhaul the clunky interface for managing Taxonomy
  • TinyMCE: a rich text editor replacement (I now like FCKeditor better for its easier setup and customizations)
  • Update Status: checks for updates for most modules
  • Upload Preview: adds a preview when attaching images and then displays the preview rather than just listing attachments
  • Views: creates all kinds of custom blocks and pages. I use it for the Eventlist and Newswire
  • Vote Up Down Package: adds digg-like voting. I use this to automatically promote a story to the front page if it receives a certain number of yes votes.
  • Voting Actions: used in conjunction with the above
  • XML Sitemap: creates a sitemap for better interaction with search engines

Creating Content Types and Taxonomy

After installing the modules, I created an ‘article’ and ‘event’ content type in Content management>>Content type. I changed “Default options” under “Workflow” to “Published” and “Create new revision” so when an article or event is submitted it will go to the Newswire/Eventlist on the side and create a new revision by default when it is edited, allowing anyone with the “view revisions” permission to see any changes.

In Content management>>Categories, I created ‘topic’ and ‘issue’ categories for articles and events. Using the Taxonomy Manager module in Content management>>Taxonomy Manager (if you can’t see this, see below for adding the proper permissions), I added terms or sub-categories for each. See my list here.

Roles and Permissions

Many modules require the correct permissions to administer or even just work. The first thing I did was create an admin role in User management>>Roles and check all available permissions in User management>>Access control. I also gave anonymous users all the available view/access permissions (except “access administration pages” of course) and create permissions for articles, events, and forum topics, but not permission to edit their own content. This is because Drupal does not distinguish one unauthenticated user from the next (there is a hack somewhere to fix this if you really want to though). I gave authenticated users all the same permissions, plus permission to edit their own content.

Using Views to Create Dynamic Blocks

I created the Newswire, a list of incoming user-submitted stories, by going to Site building>>Views>>Add. In the page view of the Newswire, I set view type to teaser list and used a pager so it would not list all the stories on a single page. For the block, I used a list view with a ‘more link’ as well as fields set to node: title, filters to node: front page equals no, node type is article, and node: published equals yes, and sort criteria set to descending.

For the Eventlist, I just provided a block, setting the page view as a calendar separately. View type is list view with a ‘more link’ as well as fields set to node: title and event: start time as short date, arguments to calendar: year display all values, filters to node: type is event and event: start date is greater than now, and sort criteria set to event: start time ascending. To properly display the calendar of events, I enabled the default calendar view located under “Default Views” in Site building>>Views>>List, adding the field event: start time as short date and node: type is event filter, removing the filter node: published equals yes.

Maintenance

Keeping the site well-oiled can prevent potential interruptions and headaches. Cron must be run periodically to index all the content for Search and Aggregator (used for RSS feeds). If caching is enabled, cron will also refresh this. Through my host’s cPanel, I set cron to run every 4 hours with the command "curl --silent --compressed http://tampabay.indymedia.org/cron.php".

Keeping modules up to date is key to ensure the site is always performing well and necessary to patch bugs. The Update Status module automatically checks for updates after cron runs, which can be viewed in Logs>>Available updates.

When updating a module, simply delete the old version’s folder (ensuring any customizations are backed up) and upload the new version with an FTP client. Be sure to check Logs>>Status report in case Drupal’s database needs to be updated by running update.php.

Conclusion

There are many more minor settings and customizations I did to get Tampa Bay Indymedia up to its current state, but many of these are unique to the site’s purpose and my aesthetic tastes. The above guide should be sufficient to get a nice site up and running. If not, please feel free to comment below. Also, I have found the documentation and many people in the community at drupal.org to be extremely helpful. Good luck and have fun with Drupal!

Rotate Apache (or any) logs on linux

Logs can quickly grow to take up disk space. To alleviate this problem, install logrotate if your distribution does not come with it already. Edit /etc/logrotate.conf, changing the first line to the period at which you want to rotate the logs (daily, weekly, monthly, yearly). Change the rotate directive below it to the number of rotated logs you want to keep, e.g. if rotating logs every month and keeping them for a year, use rotate 12.

Edit (or create if necessary) /etc/logrotate.d/httpd to contain the following:

/usr/local/apache/logs/*_log {
 monthly
 rotate 12
 compress
 missingok
 notifempty
 sharedscripts
 postrotate
 /usr/local/apache/bin/apachectl graceful
 endscript
}

Where:

  • monthly: time period elapsed before the logs are rotated
  • rotate 12: logs are rotated 12 times before being removed (i.e. 1 year); if set to 0, old logs are removed rather than rotated
  • compress: old logs are compressed with gzip to save disk space
  • missingok: if logs are missing skip and do not issue an error
  • notifempty: do not rotate empty logs
  • sharedscripts: only run the postrotate command once for all the logs in the specified directory, rather than once for each log
  • postrotate: command to run after logs are rotated, necessary for apache to start logging again

Be sure the paths /usr/local/apache/logs/*_log and /usr/local/apache/bin/apachectl graceful are correct for your configuration. Graceful will restart apache without ending any current requests being served.

Thanks to nixCraft for the hints.

Disable all catchalls on entire cPanel server to prevent spam

By default cPanel is set to accept catchalls, that is, mail to non-existent users, and bounce it. This can result in much spam being accepted by a cPanel server, as spammers often brute force or randomly address their spam. Further, the bounce is usually sent to an innocent address that was spoofed, creating an increasingly common problem known as backscatter spam.

A few steps are required to completely fix this. First disable this default setting in cPanel WHM by going to Server Configuration > Tweak Settings > Mail > and set Default catch-all/default address to :blackhole:. This will silently drop spam rather than bounce it, preventing more backscatter spam.

Next, disable all catchalls on the server:

mkdir -p /etc/valiasesbak
cp -R /etc/valiases /etc/valiasesbak
sed -i 's/^\*: [^ ]*$/*: :blackhole:/g' /etc/valiases/*
replace ':fail: No Such User Here' ':blackhole:' -- /etc/valiases/*

Check if there are any lingering aliases set to bounce with:

grep '*:' /etc/valiases/* | egrep -v ':blackhole:'

There may be a few other bounce fail phrases like “Invalid e-mail address. Check and re-send.” Simply substitute these phrases in the replace command above, so:

replace ':fail: Invalid e-mail address. Check and re-send.' ':blackhole:' -- /etc/valiases/*
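
Before touching the real /etc/valiases, the rewrite can be tried on scratch copies. The sketch below uses sed for both substitutions (GNU sed -i assumed) instead of the replace utility; the function name blackhole_catchalls and the sample file contents are made up for the example.

```shell
#!/bin/sh
# blackhole_catchalls FILE...: hypothetical helper that points every
# catch-all line ("*: ...") at :blackhole:. The first expression handles
# plain forwarding addresses, the second one example :fail: phrase.
blackhole_catchalls() {
    sed -i \
        -e 's/^\*: [^ ]*$/*: :blackhole:/' \
        -e 's/^\*: :fail: No Such User Here$/*: :blackhole:/' \
        "$@"
}

# Scratch files standing in for /etc/valiases/<domain>
demo=$(mktemp -d)
printf '%s\n' 'info: bob' '*: bob@example.com' > "$demo/example.com"
printf '%s\n' '*: :fail: No Such User Here' > "$demo/example.org"

blackhole_catchalls "$demo"/*
cat "$demo"/*   # both catch-all lines now read "*: :blackhole:"
rm -r "$demo"
```

Ordinary aliases such as "info: bob" are left untouched because both patterns are anchored to lines starting with "*:".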

Ensure users can write with:

chmod 777 /etc/valiases/*
chown nobody:nobody /etc/valiases/*

Lastly, prevent users from re-enabling the catchall. In WHM > Packages > Feature Manager, select Default under Edit a Feature List and then edit. Uncheck Default Address Manager and then save.

Thanks to Fidget for some help.

Simple automatic Drupal upgrade script

UPDATE: Drush has added support for upgrading Drupal core with the 'drush up' command, which I now use. I will keep the method below for those who can't use Drush, for example because their host does not allow SSH.

Here's what I do for a simple Drupal core upgrade (from my root directory):

tar -cvzf drupal20100503.tar.gz drupal/
wget http://ftp.drupal.org/files/projects/drupal-6.16.tar.gz
tar zxvf drupal-6.16.tar.gz
./drupal_upgrade.sh
drush updatedb
drush cache clear

I tweaked Justin’s drupal update script slightly. The drupal_upgrade.sh script I use:

#!/bin/bash
## Drupal Automatic Upgrade Script
## Be sure DRUPALDIR, BACKUPDIR, DBUSER, DBPASS, DBNAME are correct
TIMESTAMP=`date +%y%m%d%H%M`
BACKUPDIR=drupal_backup_$TIMESTAMP
## Drupal directory (relative to script)
DRUPALDIR='drupal'
## Database config
DBUSER='user'
DBPASS='pass'
DBNAME='drupaldb'
## Backup Drupal files
mkdir $BACKUPDIR/
cp -pr $DRUPALDIR/includes/ $BACKUPDIR/
cp -pr $DRUPALDIR/misc/ $BACKUPDIR/
cp -pr $DRUPALDIR/modules/ $BACKUPDIR/
cp -pr $DRUPALDIR/profiles/ $BACKUPDIR/
cp -pr $DRUPALDIR/scripts/ $BACKUPDIR/
cp -pr $DRUPALDIR/sites/ $BACKUPDIR/
cp -pr $DRUPALDIR/themes/ $BACKUPDIR/
cp -p $DRUPALDIR/cron.php $BACKUPDIR/
cp -p $DRUPALDIR/index.php $BACKUPDIR/
cp -p $DRUPALDIR/robots.txt $BACKUPDIR/
cp -p $DRUPALDIR/update.php $BACKUPDIR/
cp -p $DRUPALDIR/xmlrpc.php $BACKUPDIR/
cp -p $DRUPALDIR/.htaccess $BACKUPDIR/
## Backup Drupal database
mysqldump -u $DBUSER -p$DBPASS $DBNAME > $BACKUPDIR/$DBNAME.sql
## Remove old and copy new files
rm -r $DRUPALDIR/includes
cp -pr drupal-6.*/includes/ $DRUPALDIR/
rm -r $DRUPALDIR/misc
cp -pr drupal-6.*/misc/ $DRUPALDIR/
rm -r $DRUPALDIR/modules
cp -pr drupal-6.*/modules/ $DRUPALDIR/
rm -r $DRUPALDIR/profiles
cp -pr drupal-6.*/profiles/ $DRUPALDIR/
rm -r $DRUPALDIR/scripts
cp -pr drupal-6.*/scripts/ $DRUPALDIR/
#chmod -R +w $DRUPALDIR/sites/default
#rm -r $DRUPALDIR/sites
#cp -pr drupal-6.*/sites/ $DRUPALDIR/
rm -r $DRUPALDIR/themes
cp -pr drupal-6.*/themes/ $DRUPALDIR/
rm $DRUPALDIR/cron.php
cp -p drupal-6.*/cron.php $DRUPALDIR/
rm $DRUPALDIR/index.php
cp -p drupal-6.*/index.php $DRUPALDIR/
rm $DRUPALDIR/robots.txt
cp -p drupal-6.*/robots.txt $DRUPALDIR/
rm $DRUPALDIR/update.php
cp -p drupal-6.*/update.php $DRUPALDIR/
rm $DRUPALDIR/xmlrpc.php
cp -p drupal-6.*/xmlrpc.php $DRUPALDIR/
rm $DRUPALDIR/.htaccess
cp -p drupal-6.*/.htaccess $DRUPALDIR/
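
The remove-and-copy part of the script repeats the same two commands for every core item, so it could also be written as a loop. This is only a sketch of that idea, not a tested replacement for the script: replace_core is a made-up function name, and like the script it leaves sites/ alone.

```shell
#!/bin/sh
# replace_core OLD NEW: hypothetical helper. For each core directory and
# file, remove the copy in OLD and copy the fresh one over from NEW.
# sites/ is deliberately not in the list, so site settings are kept.
replace_core() {
    old=$1
    new=$2
    for item in includes misc modules profiles scripts themes \
                cron.php index.php robots.txt update.php xmlrpc.php .htaccess
    do
        rm -rf "$old/$item"
        cp -pr "$new/$item" "$old/"
    done
}

# Usage (directory names assumed to match the steps above):
#   replace_core drupal drupal-6.16
```

Keeping the list of items in one place makes it harder to remove a directory and forget to copy its replacement.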

Get Redhat updates without subscription, fix cPanel “No method to auto repair package system” EasyApache error

A Red Hat 4 server could not rebuild Apache or PHP with cPanel’s EasyApache. It kept getting a no method to auto repair package system error. cPanel support said this was due to the Red Hat subscription expiring.

Instead of paying for another $375 subscription just to get some updates, I migrated the server to the corresponding CentOS 4.x release. I then made sure the proper packages were excluded in the first yum [main] block:

exclude=apache* bind-chroot courier* dovecot* exim* httpd* mod_ssl* mysql* nsd* perl* php* proftpd* pure-ftpd* ruby* spamassassin* squirrelmail*

and then yum list and yum update worked.

Install Ubuntu without CD over network with ISO

There are a few ways to install Ubuntu without a CD. The following worked for me using an ISO:

Plesk error: mail server requires authentication to send to a non-local e-mail address

Funny how most of the odd problems that I run into have to do with Plesk. This error randomly started happening one day when trying to send email to any user on a Plesk windows server:

503 This mail server requires authentication when attempting to send to a non-local e-mail address. Please check your mail client settings or contact your administrator to verify that the domain or address is defined for this server.

Upgrading PHP on Plesk Virtuozzo VPS

The CentOS 5 EZ template released by the Russian mafia at Parallels contains an old version of PHP and is yum-less. The first step to upgrade this is to install yum with this huge number of packages (thanks Sohail).

Next install the great atomicrocketturtle repositories:

wget -q -O - http://www.atomicorp.com/installers/atomic |sh

And then finally update PHP:

yum upgrade php

If upgrading from PHP 4, there may be some additional config files to mess with. See atomicturtle’s wiki page on PHP for troubleshooting.