Clean up content pasted from Microsoft Word in PHP

I found this function elsewhere, but it is something I regularly need, which is why I’m posting it to my blog. If you want to use it or add something to it, feel free.

    function strip_word_html($text, $allowed_tags = '<b><i><sup><sub><em><strong><u><br>')
    {
        //replace MS special characters first
        $search = array('/&lsquo;/u', '/&rsquo;/u', '/&ldquo;/u', '/&rdquo;/u', '/&mdash;/u');
        $replace = array('\'', '\'', '"', '"', '-');
        $text = preg_replace($search, $replace, $text);
        //make sure _all_ html entities are converted to the plain ascii equivalents - it appears
        //in some MS headers, some html entities are encoded and some aren't
        $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
        //try to strip out any C style comments first, since these, embedded in html comments, seem to
        //prevent strip_tags from removing html comments (MS Word introduced combination)
        //(preg_replace is used here because mb_eregi_replace does not take #...# delimiters)
        if (strpos($text, '/*') !== false) {
            $text = preg_replace('#/\*.*?\*/#su', '', $text);
        }
        //introduce a space into any arithmetic expressions that could be caught by strip_tags so that they won't be
        //'<1' becomes '< 1' (note: somewhat application specific)
        $text = preg_replace(array('/<([0-9]+)/'), array('< $1'), $text);
        $text = strip_tags($text, $allowed_tags);
        //eliminate extraneous whitespace from start and end of line, or anywhere there are two or more spaces, convert it to one
        $text = preg_replace(array('/^\s\s+/', '/\s\s+$/', '/\s\s+/u'), array('', '', ' '), $text);
        //strip out inline css and simplify style tags
        //(the \b keeps <b from also matching <br> and <i from matching other tags starting with i)
        $search = array('#<(strong|b)\b[^>]*>(.*?)</(strong|b)>#isu', '#<(em|i)\b[^>]*>(.*?)</(em|i)>#isu', '#<u\b[^>]*>(.*?)</u>#isu');
        $replace = array('<b>$2</b>', '<i>$2</i>', '<u>$1</u>');
        $text = preg_replace($search, $replace, $text);
        //on some of the newer MS Word exports, where you get conditionals of the form 'if gte mso 9', etc., it appears
        //that whatever is in one of the html comments prevents strip_tags from eradicating the html comment that contains
        //some MS Style Definitions - this last bit gets rid of any leftover comments
        $num_matches = preg_match_all('/<!--/u', $text, $matches);
        if ($num_matches) {
            //non-greedy, so text sitting between two separate comments is kept
            $text = preg_replace('/<!--.*?-->/su', '', $text);
        }
        return $text;
    }

JavaScript Calendars

I found a few jQuery calendars that I really liked. I’ve left a few comments below, but the one I ended up using is eventCalendar. I had an issue getting it to react correctly to page changes and to resize correctly on load, so I added width: 100%; to .eventsCalendar-monthWrap in the CSS.

Favorable hosting providers

Here are some hosting providers that I’ve had really good experiences with.

  • An awesome way to ramp your website up and down depending on traffic and cost.
  • I love having root access to my servers, and the VPS from Rackspace is amazing. I can spin up a server, work on it for a while, and then destroy it as soon as I’m done. I can also ramp up my server on demand (with minimal downtime, between 3 and 30 minutes). Also, their support is bar none the best I’ve ever had.
  • Great Ruby on Rails hosting for beginners or low-traffic sites. I like that I still have SSH access to the server and can use git to push my site to production.
  • I’ve only briefly used Amazon and have since left it, but I enjoyed it while I used it. Their free account was alright, but as soon as I wanted to upgrade to a production server I was able to get a better price at Rackspace.
  • Just found this one and have never used it, but I really like the idea of an always load-balanced hosting provider.

JavaScript Uploaders

Every once in a while, using a JavaScript file uploader will make your life much simpler. Here are a few that I’ve found. Some are free and some you have to pay for, but this is at least a good starting place.

Web sliders

Here is a mashup of all the sliders that I’ve found on the web. I end up looking for these over and over again, so I figured I’d just post them and help spread the love! Feel free to suggest any to add to this list.

hover previous next buttons

No Internet Explorer

Not super awesome

What about one of these

Simple… cool… $12…

Check the directionBottom/directionTop/directionRight/directionLeft or any others of course

Might not work in IE10 at the moment

$12 very cool!

State Abbreviations

I’m always searching for these, so I’m just posting them here. As I use them in different languages, I’ll post them here preformatted for those languages. If you feel like contributing, post them in the comments and I’ll add the translations to this post. Thanks!!


    {
      'Alabama' => 'AL', 'Alaska' => 'AK', 'Arizona' => 'AZ', 'Arkansas' => 'AR', 'California' => 'CA',
      'Colorado' => 'CO', 'Connecticut' => 'CT', 'Delaware' => 'DE', 'Florida' => 'FL', 'Georgia' => 'GA',
      'Hawaii' => 'HI', 'Idaho' => 'ID', 'Illinois' => 'IL', 'Indiana' => 'IN', 'Iowa' => 'IA',
      'Kansas' => 'KS', 'Kentucky' => 'KY', 'Louisiana' => 'LA', 'Maine' => 'ME', 'Maryland' => 'MD',
      'Massachusetts' => 'MA', 'Michigan' => 'MI', 'Minnesota' => 'MN', 'Mississippi' => 'MS', 'Missouri' => 'MO',
      'Montana' => 'MT', 'Nebraska' => 'NE', 'Nevada' => 'NV', 'New Hampshire' => 'NH', 'New Jersey' => 'NJ',
      'New Mexico' => 'NM', 'New York' => 'NY', 'North Carolina' => 'NC', 'North Dakota' => 'ND', 'Ohio' => 'OH',
      'Oklahoma' => 'OK', 'Oregon' => 'OR', 'Pennsylvania' => 'PA', 'Rhode Island' => 'RI', 'South Carolina' => 'SC',
      'South Dakota' => 'SD', 'Tennessee' => 'TN', 'Texas' => 'TX', 'Utah' => 'UT', 'Vermont' => 'VT',
      'Virginia' => 'VA', 'Washington' => 'WA', 'West Virginia' => 'WV', 'Wisconsin' => 'WI', 'Wyoming' => 'WY',
      'American Samoa' => 'AS', 'District of Columbia' => 'DC', 'Federated States of Micronesia' => 'FM',
      'Guam' => 'GU', 'Marshall Islands' => 'MH', 'Northern Mariana Islands' => 'MP', 'Palau' => 'PW',
      'Puerto Rico' => 'PR', 'Virgin Islands' => 'VI',
      'Armed Forces Africa' => 'AE', 'Armed Forces Americas' => 'AA', 'Armed Forces Canada' => 'AE',
      'Armed Forces Europe' => 'AE', 'Armed Forces Middle East' => 'AE', 'Armed Forces Pacific' => 'AP'
    }
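
In case it helps, here is a rough sketch of how a hash like this might be used in Ruby for lookups in both directions. It uses a three-entry excerpt to stay short, and the names STATE_ABBREVIATIONS, abbreviation_for and state_name_for are made up for illustration, not part of any library.

```ruby
# Excerpt of the full hash; the constant name is an assumption for illustration.
STATE_ABBREVIATIONS = {
  'Alabama' => 'AL',
  'New York' => 'NY',
  'Puerto Rico' => 'PR'
}.freeze

# Forward lookup: full name to abbreviation (nil when the name is unknown).
def abbreviation_for(name)
  STATE_ABBREVIATIONS[name]
end

# Reverse lookup: Hash#key returns the first key whose value matches.
def state_name_for(abbreviation)
  STATE_ABBREVIATIONS.key(abbreviation.upcase)
end

puts abbreviation_for('New York')  # NY
puts state_name_for('pr')          # Puerto Rico
```

One thing to watch with the full list: several of the Armed Forces entries share 'AE', so a reverse lookup on 'AE' only returns the first matching name.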

Magento Ruby API

In case someone else out there is looking for this, I have created a bare-bones version of a Magento API written in Ruby.

I have used this to batch create a load of products, and it should be a good starting place for adding anything that is supported by the Magento API to an existing Ruby project.

Thanks for reading!

Click this to download the ruby file

Tools for better emails

There are a few select things that you need to do in order to make and send professional-looking HTML emails. Here are a few of the tips and tricks I’ve picked up along this 1996 tableful, anything but golden, road to beautiful HTML emails.

One of the first things to do is read the book “Create Stunning HTML Email that just works!” by Mathew Patterson, published by SitePoint. I’ve learned quite a few interesting things from this book, and it comes with a few tables and charts that tell you which tags and CSS are supported by which email clients.

Next you’ll need to create an inlined HTML version of your email, where there is little or no CSS in the head of the document and everything is instead put in the body as inline style attributes. A nifty little tool that I’ve been using to help with this is HTML Email Inliner.
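
To make the inlining step a bit more concrete, here is a toy sketch in Ruby of what an inliner does. It is not how the HTML Email Inliner tool works (I don't know its internals), and inline_simple_css is a made-up helper. It assumes a single style block and only handles bare tag selectors like "p { ... }" on opening tags that have no attributes yet.

```ruby
# Toy CSS inliner: handles only bare tag selectors such as "p { color: red; }".
# Hypothetical helper for illustration; a real inliner handles far more cases.
def inline_simple_css(html)
  style_block = /<style[^>]*>(.*?)<\/style>\s*/mi
  css = html[style_block, 1].to_s

  # Collect [tag, declarations] pairs from rules like "h1 { ... }".
  rules = css.scan(/([a-z][a-z0-9]*)\s*\{([^}]*)\}/i).map do |tag, decls|
    [tag.downcase, decls.strip.gsub(/\s+/, ' ')]
  end

  # Drop the <style> block, then copy each rule onto its matching opening tags.
  result = html.sub(style_block, '')
  rules.each do |tag, decls|
    result = result.gsub(/<#{tag}>/i) { %(<#{tag} style="#{decls}">) }
  end
  result
end

html = '<html><head><style>p { color: #333; } h1 { font-size: 24px; }</style></head>' \
       '<body><h1>Hi</h1><p>Hello</p></body></html>'
puts inline_simple_css(html)
```

Classes, IDs, nested selectors and tags that already carry attributes are all ignored here, which is exactly why a dedicated tool is worth using for real emails.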

I’ll add more information here as I keep learning, and I look forward to receiving some of your beautiful HTML emails!