Aaron K. Murray's Blog

Purpose: Show the world how to build a complete website starting with a single index.html page


Post 29: Understanding Javascript Numbers

View the code changes related to this post on github

Javascript is a great scripting language. It is simple to learn, (obviously) runs in a browser for quick development tests, and is flexible enough to allow for some powerful code to be created. That flexibility does come at a cost however.

One of the confusing aspects of Javascript is type coercion and what that means with respect to dealing with numbers. The problem is not that Javascript is flaky, it is that there are more rules to understand when dealing with a loosely-typed scripting language than with a compiled language like C.

Enough horseplay, show me examples

Let's make a simple function that takes a string and left-pads it with zeroes to a certain number of places.

	function zeropad(text, numPlaces) {
		while (text.length < numPlaces)
			text = "0" + text;
		return text;
	}
Here is the output with various inputs:

	//Output that I expected
	zeropad("foo", 5);           //returns "00foo"
	zeropad("foo", 1);           //returns "foo"
	zeropad("123", 7);           //returns 0000123
	zeropad("$19.99", 7);        //returns "0$19.99"

	//Output that I might not have expected, or that is a case that I did not plan for
	zeropad(123, 7);             //returns 123
	zeropad(234.56, 7);          //returns 234.56
	zeropad(false, 7);           //returns false
	zeropad([], 7);              //returns "0000000"
	zeropad([1], 7);             //returns "0000001"
	zeropad([1,2,3], 7);         //returns "001,2,3"
	zeropad({}, 7);              //returns "Object {}"
	zeropad(new Function(), 7);  //returns "0function anonymous() { }"

	//Exceptions thrown
	zeropad(null, 7);            //throws TypeError: Cannot read property "length" of null
	zeropad(undefined, 7);       //throws TypeError: Cannot read property "length" of undefined

This illustrates type coercion in action. You can see from the results that the text argument can actually be any type, and how that value behaves when its length is checked and it is concatenated with a string depends entirely on its type.

Strings behave as expected, with zeroes simply being prepended. Numbers, booleans, and plain objects have no length property, however, so the while condition compares undefined against numPlaces, which is always false, and the value comes back unchanged. Arrays and functions do have a length property, so the loop runs, and since the coercion process calls .toString() on the value when it is concatenated with "0", an array becomes a comma-delimited string (an empty array becomes an empty string, which then has zeroes prepended to it) and a function becomes its source text.
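
If string behavior is what you actually want for every input, the fix is to coerce the argument up front. Here is a minimal sketch of that idea (the name zeropadSafe and the null/undefined handling are my own illustration, not code from the repo):

	function zeropadSafe(text, numPlaces) {
		text = (text === null || text === undefined) ? "" : String(text); //force a string so .length always exists
		while (text.length < numPlaces)
			text = "0" + text;
		return text;
	}

	zeropadSafe(123, 7);   //returns "0000123"
	zeropadSafe(null, 7);  //returns "0000000"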

Rounding Numbers

There are many times when whole numbers need to be parsed from user input. Tax forms are a good example of this. The government does not care much about pennies on tax forms, so they want everything in whole dollars. There are many ways to do this in Javascript. Given a number x, we could use:

  • Math.round(x) //round to closest whole number
  • Math.ceil(x) //round up
  • Math.floor(x) //round down
  • parseInt(x) //parse an integer from the string form of x
  • x.toFixed(0) //round a known Number type and return it as a string
  • x|0 //bitwise OR with 0, which truncates x to a 32-bit integer
  • x.toString() with substring math...the list goes on and on

Here is a grid showing the results of these methods applied to various types and values as inputs, as well as a few custom toInt() methods I wrote for the akm.math lib:

As you can see from the grid, the results from using these different methods vary based on the data type of the input. The only (built-in) reliable method that will return an actual Number type with a useful (non-NaN) value is "bitwise OR zero" (x|0).
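
Since the grid itself is an image, here are a few spot checks you can paste into a console to see the pattern for yourself:

	Math.round("7.9");    // 8     (the string is coerced to a Number first)
	parseInt("7.9");      // 7     (parses digits up to the first non-digit)
	parseInt("$7.99");    // NaN   (a leading '$' kills the parse)
	"$7.99" | 0;          // 0     (coercion yields NaN, which the bitwise OR turns into 0)
	(7.9).toFixed(0);     // "8"   (a string, and it rounds rather than truncates)
	-7.9 | 0;             // -7    (truncates toward zero)
	Math.floor(-7.9);     // -8    (rounds down, which is different for negatives)
	parseInt(null);       // NaN
	null | 0;             // 0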

The akm.math functions solve the problem of always wanting to return a useful Number, and they also do somewhat intelligent parsing of values that include spaces, commas, and dollar signs. That is unlike the results returned by parseInt, where, for example, parseInt('123,456') === 123.
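
The real implementations are in the akm.math source on github, but the core idea of a more forgiving toInt is easy to sketch (this version is purely illustrative, not the library code):

	function toInt(val) {
		if (typeof val === "number")
			return isFinite(val) ? (val | 0) : 0;          //truncate real numbers, map NaN/Infinity to 0
		var cleaned = String(val).replace(/[$, ]/g, "");   //strip dollar signs, commas, and spaces
		var parsed = parseFloat(cleaned);
		return isFinite(parsed) ? (parsed | 0) : 0;        //always hand back a useful Number
	}

	toInt("$1,234.56");  //returns 1234
	toInt("123,456");    //returns 123456 (parseInt would have stopped at 123)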

Other important notes that may not be obvious:

  • typeof NaN === 'number' //NaN is a numeric type...just not a known number
  • (NaN != NaN) === true //NaN is never equal to anything, not even itself. NaN is basically an unknown number
  • isNaN(parseInt(Infinity)) === true //parseInt(Infinity) returns NaN, not Infinity
  • Not all browsers produce these results the same way, especially parseInt(). The default radix for strings with a leading zero is changing from 8 (octal) to 10, so always pass the radix explicitly

Working with numbers in Javascript can be frustrating at times. Understanding the data types that you will be working with and what options you have for conversion will help you make good decisions up front and reduce the headache from seemingly random input bugs months after you developed that webform (or game engine).


Post 28: CSS Preprocessing using SASS

View the code changes related to this post on github

It feels like it has been an eternity since I posted last. Turns out that 3 months have gone by - wow! Let's get to it.

This post is on a topic that has been on my mind for a while to talk about here: Preprocessing. Specifically for this post, I will cover CSS Preprocessing using SASS, but the concept can and should be applied to much more than just CSS.

What is CSS Preprocessing?

Put simply, CSS preprocessing is a way that we can write css-like code, run it through a converter, and then get actual css output. Doing this adds an extra step between writing css, and seeing the results in a browser, but there are some excellent benefits.

What are the benefits?

In my view, there are a couple of major benefits to using a preprocessor over writing the raw code: convenience and abstraction.

On the convenience front, simple things like the ability to use variables for colors and fonts are worth the cost of admission on big projects. This is available now, without having to wait for the W3C to figure out the final spec and then rely on your target audience to update their browsers to the latest versions.

Example using SASS variables
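
In code form, the variables shown in that screenshot look something like this (a tiny made-up example, not the actual stylesheet for this site):

	// _theme.scss
	$accent-color: #c0392b;
	$body-font: Georgia, serif;

	a { color: $accent-color; }
	blockquote {
		font-family: $body-font;
		border-left: 3px solid $accent-color; // change the hex once and every rule that uses it updates
	}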

More powerful though is the ability to deal with an abstraction layer that better enables you to express your intentions instead of the solely functional bits that make browsers do your bidding.

Ugh, still not convinced. Can you give me an example?

Computers run by processing binary code: only 1s and 0s. Writing anything beyond simple math equations directly in binary is madness. Humans quickly realized they needed a way to write code more efficiently, and thus assembly language was born. Instead of figuring out the exact binary code that a processor needs to run, we could write code that looked slightly less cryptic. Code that captured the intent of the programmer better.

This pattern continues and we get higher level languages as a result. The code gets more abstracted from how it actually runs on the processor, and programmers become more efficient and can express themselves more easily. Another benefit is that compilers are created and most of the time that higher level code can be run on many different processors instead of having to write code specific to each one, like the old days.

Binary wormhole

Writing raw CSS is like writing binary code for styling

15 years ago, the simple act of using a stylesheet in a website was an advanced technique. Windows accounted for nearly all of the browser-based web traffic. Most users had a monitor with a 4:3 aspect ratio, so we simply targeted a "safe" width for our sites. We primarily struggled with two very different browsers and got used to hacking our styles in order to get our sites looking decent in both. Folks were only using our sites from computers that had a keyboard and mouse. People still used modems to tie up their phone lines in order to connect to the internet. Dev life was simpler.

As new browsers gained favor with the masses, our stylesheets were amended to accommodate the newcomers. Sadly, we were so heads-down focused on getting work done that we missed the massive wrong turn we were taking. We just got used to making specific hacks that addressed some issue in some combination of os-browser-resolution-bandwidth-input. We tried so hard to make every single page work for all combinations that we ended up chasing the wrong paradigm: the paradigm that said we should be able to keep up with the ever-growing matrix of ways users could use our sites; that a good developer should be able to learn every hack for every bug that the collective developer "we" ran across.

It is time to level-up

In order to keep up with expanding requirements, we need to get out of the code weeds and outsource some of our efforts to the machines at our fingertips. Ideally we would use some sort of higher level language or tool across our entire web development chain so that we can gain access to new language features while reducing our efforts. This will also help us bring a great experience to all of our users instead of endlessly shovelling the lowest common denominator at them.

Other ways to use preprocessors

If you have a fairly simple static content site, and you don't have or want a build step to run between modifying your CSS and viewing it in a browser, look into setting up file system watchers that can run the preprocessing step every time you modify a file.

You can also check out the SASS preprocessor tool that I wrote for this post. I wanted something that:

  • could be run from the command line
  • didn't require Ruby to be installed

It is simple and gets the job done. I will add to it later as we start preprocessing more than just CSS.


Post 27: Build-time CSS Validation

View the code changes related to this post on github

One of the nice things about working with compiled programming languages like C# or C++, compared to an interpreted language like Javascript, is the benefit of having the compiler automatically notify the programmer when there are glaring syntactical errors in the code. Similarly, most compilers can also notify the programmer when there are potential errors or suspect code; these notifications are called warnings.

This benefit of having a compiler act as a second set of eyes on the code that you write is great. Little things like gross typos can be caught immediately, and more subtle issues flagged as warnings can be corrected, before your code ever gets out into the wild.

That is great...but I work on the web

So what use are these fancy compilers when web pages are made up of markup and script? The nature of web browsers since the very beginning has been to "just figure it out" when it comes to parsing malformed HTML and CSS. Because HTML was largely hand-written in the old days it was often filled with a variety of errors. Browsers were vying to provide the best user experience, so they started trying to parse the "intent" of the HTML in addition to taking it character-for-character. Some examples of subtle typos with potentially major rendering impacts:

  1. <div>Just some <span>simple</div>text here</span>
  2. <ul><li>item 1</li></ul><li>item 2</li>

In example #1, it looks like the DIV and SPAN were accidentally swapped. The effect is that the DIV is closed in the middle of the SPAN. What should the browser do?

In example #2, it appears the programmer closed that UL too early, or added the second LI later and in the wrong place. Should the browser still show that LI in normal bullet formatting?

These are contrived examples, but they are some of the simplest and illustrate the point that parsing can be tricky. Even the mighty Yahoo with so much focus on browser-related technology has trouble getting it right. Simply running the yahoo.com homepage through the w3c markup validation service exposes 200 errors. Many of these errors are considered "acceptable" by web developers today, but it further proves the point that our human instincts and the strict rules that computer parsers follow often disagree.

Another pair of eyes: CSSLint

In previous posts I've shown how to get some build-time validation on HTML and Javascript, so now I will show you how to use CSSLint to add an automated validation step for CSS.

By default CSSLint is a little too overzealous in my opinion. The rules it applies when parsing your CSS blur the line between coding practices that should be avoided, and political opinions for how CSS should be structured. I won't go into detail about each rule that CSSLint evaluates, but you can take a look at my build script parameters to see which rules I think are completely bogus, as well as which rules are worthy of noting, but not worthy of failing a build over (warnings).

To run CSSLint from the command line or in a batch file, you need to have Java installed, and to download Rhino and CSSLint for Rhino. I recommend you download those from my GitHub repository because the latest Rhino build (1.7R4) has a bug that will keep this example from working properly. Here is an example command line that you can run from the build directory (assuming you have Java 7 in the same location that I do):

"C:\Program Files (x86)\Java\jre7\bin\java" -jar rhino.jar csslint-rhino.js --ignore=ids ../../aaronkmurray-blog/css/

What that command does is run the CSSLint script (csslint-rhino.js) under Rhino against the blog's css directory, with the "ids" rule ignored. The output looks like this:

Example CSSLint output shown in a text file

Take a look at my build script for a more detailed example.

The Takeaway

You may have noticed by now that my blog posts cycle between practical web development advice, automated developer tooling, technical implementations, and performance tuning. My goal in this is to provide a well-rounded look at the major components necessary in creating software.

It is important for the developer:

  • to understand the software technology they are using (theory)
  • to understand how that software works at the machine level (practice)
  • to leverage tools to test and prove that their software works (test/reliability)
  • to ensure that software doesn't waste the time of humans (performance)
  • to automate repetitive tasks in order to save time and reduce errors (consistency)

Post 26: Reduce HTTP Requests

View the code changes related to this post on github

The performance lesson in this post gets to the very core of creating sites that feel snappy for the user.


Each time a user hits a web page to view it, the server will typically send back a response in HTML. This initial HTML response almost never contains the complete "page" content as the user thinks of it.

Instead, a subset of content is sent back, along with a lot of other information that tells the browser where to find the rest of the data needed to display the page. This linked data is for things like:

  • Images
  • CSS Files
  • Javascript Files

There can be a lot of these files linked from the HTML. As an example, I hit the Amazon homepage and saw a whopping 220 requests accounting for 1.97MB of data, and 3.31 seconds of load time.

Amazon homepage requests as seen in Chrome Developer Tools

Of those 220 requests, the major types break down as follows:

  • 175 Images
  • 9 CSS Files
  • 11 Javascript Files

Despite a staggering 175 image requests, they are actually using quite a few CSS Sprites. Without these efforts, they'd probably have closer to 500 http requests on the home page.

The Test: Measuring Request Overhead as Experienced by the User

I made 2 simple test pages to compare the effect of http requests. One page has 100 small images included individually, and the other page has 1 CSS Sprite that is used 100 times to show the same 100 images (created using my imgsprite tool).

Here are the request summaries for each test page as seen in Chrome:

100 separate image requests
1 big CSS sprite for 100 images

In the test you can see that using 1 big sprite for 100 images dropped the request count by 98 (1 image for the sprite, 1 image for the clear.gif vs 100 unique images).

The HTML for the sprite page was 26KB compared to 4KB for the page with individual images, due to the extra css required to make the sprites work.

The page with the sprite loaded in 132ms compared to 384ms on the page with individual images. That is a massive difference in time for equivalent content implemented differently.

Reduce HTTP Requests (even at the expense of bandwidth)

Why such a big difference? Well, the simple answer is that each HTTP request has a lot of components that take time. Once the request is agreed upon by the browser and the server, sending the data down to the client can commence at speed.

I want to illustrate this a different way. For example, take this snippet of request timing from my homepage:

Time spent receiving images of various sizes

There are a variety of image sizes being downloaded in this snapshot, but the interesting part is comparing the actual time spent downloading the data for images of various sizes. I drew arrows pointing to the largest and smallest image in the snapshot: 83KB vs 3KB. The time spent receiving the 83KB image was 30ms, whereas the time spent receiving the 3KB image was 20ms. You can partly see from that figure that the other images all took a relatively similar amount of time to download.

This further shows the relative impact of the HTTP request overhead and the data delivered. Unless you are serving very large data files (high resolution images, movies, eBook PDFs, etc), a sizeable percentage of request time will be in simply getting the request prepared and negotiated between the two end computers.

For most sites, the takeaway here is that continuous effort should be given to reducing the number of requests that are made, especially as sites grow and get new features added.

And Finally, an Example

In Post 19 I added a little flair to the site in the form of a rotating cube of various avatar images. That was originally done using 6 different images. Here is how I used my imgsprite tool to make a single sprite and shave 5 HTTP requests.

Command line commands for creating and compressing the site logo sprite

First, from the build directory (for convenience), create the sprite out of the 6 images ending in "-128.png", and set the relative path for the sprite image based on where it will end up in the web project (paths like this will be better in the future):
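
(The exact call is in the screenshot and the build script; this reconstruction, using the imgsprite flags covered in Post 20, only shows the shape of it, and the logo input path and output filenames here are illustrative.)

	rem illustrative only - see the build script in the repo for the real call
	imgsprite.exe -in:../../aaronkmurray-blog/img/blog/logo/*-128.png -img-out:../../aaronkmurray-blog/img/blog/sprites/blog-logo-cube.png -css-out:../../aaronkmurray-blog/css/sprites/blog-logo-cube.css -css-class-name-prefix:img- -image-deploy-url-base:../../img/blog/sprites/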


Then simply run the image compressor. Since it is in the build directory and the only image in there is the new sprite...no command line args are necessary:


In Closing

Some of you already knew that reducing HTTP requests was beneficial, but now you know what kind of an impact these changes can have.


  • Favor eliminating an http request over reducing filesize (do not be afraid of sprites that have filesizes larger than the sum of each image)
  • Pack images, javascript, and css into sprites/bundles where possible
  • Stay tuned for a post that shows how to bundle files, along with the pitfalls to watch out for when doing so

Post 25: Automated Javascript Unit Tests

View the code changes related to this post on github

First order of business - apologies for such a long time between posts. I did the code changes for this post 5 weeks ago and then got hammered by a 2-week flu, and tons of work getting BuildASign prepared to finish out the holiday season.

In modern websites, more and more of the "logic and code" of a site is being moved to the client. There are 2 primary benefits to doing this:

  • Improve responsiveness for users
  • Reduce load on servers

As this code moves from the servers to the client, we run into greater risk with respect to ensuring that our sites work the way they are intended to.

Unit Testing can be scary and hard

Unit Testing

For the server code, unit testing is a common way to execute code automatically to reassure humans that it is behaving properly.

This is much trickier on the client side, for a few reasons.

qUnit, Selenium, Test Project ... not 100% reliable

In the past, I've used various means of unit testing my javascript code to make sure that it works as I say it does. My go-to combo was a Visual Studio.NET test project that fired up the different versions of the browsers that I wanted to test using Selenium, and then had those instances run through a series of test webpages that used qUnit to run the javascript code. Selenium would check whether all of the tests ran successfully, and either pass or fail the test.

While this approach worked fairly well from the test project and qUnit perspective, having Selenium reliably control all of the browsers was slightly painful. There is a helpful browser plugin for Firefox that assists in creating the tests, but that ended up being the least of the challenges. Things like Alert and Confirm prompts, SSL cert warning pages, and unreliable event triggers made false test failures common enough that I could not trust the fail results. And the tests took quite a while to run.

PhantomJS, Jasmine, command line ... fast, but limited

A colleague of mine mentioned liking Jasmine and PhantomJS for running javascript tests from the command line. There are pros and cons to this approach, but running a headless browser (no visual component, just the engine) from the command line meant that much of the "random" behavior that broke my Selenium tests could potentially go away.

In practice, the promise of reliable tests using a headless browser seems fulfilled by PhantomJS. The major downside is that we're only testing the Javascript in 1 browser/engine combination (WebKit/JavaScriptCore). Until we get headless browsers that reflect mainstream browser usage, this approach won't be ideal. We'll still need to use tools (like Telerik's Test Studio and BrowserStack) for robust cross-browser automated testing, but for now, on a small project, we can get near-instant feedback on broken javascript tests during our build process, which is much better than nothing.

Setting it up

  1. Download PhantomJS and place it in the build folder
  2. Download Jasmine and place it in the web site test folder
  3. Edit the build script to call PhantomJS, pointing at tests we want to run
  4. Write some tests, aka specs (a minimal example spec is sketched after this list)
  5. Run the build script and it will halt if there are errors
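
For step 4, a spec is just a plain javascript file that describes the expected behavior using Jasmine's describe/it/expect style. A minimal example, where the zeropad helper under test is just a stand-in for whatever site code you want covered:

	// test/spec/zeropad.spec.js (illustrative spec file)
	describe("zeropad", function () {
		it("pads short strings with leading zeroes", function () {
			expect(zeropad("foo", 5)).toBe("00foo");
		});
		it("leaves strings that are already long enough alone", function () {
			expect(zeropad("foo", 1)).toBe("foo");
		});
	});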

What this means

Now that we have some basic code coverage, editing the javascript code for the site can be done with a little less worry about breaking something else (aka regression errors). This will be much more important as the site starts to break up into many pieces. Most people lack the time and fortitude to go back over every piece of functionality ever written after each change. Luckily we have computers that can do that for us...we simply have to tell them how to do it.


Post 24: SEO Part 4/4: Tracking SEO Results

View the code changes related to this post on github

This post is the last post in a 4-part series on SEO written by guest author Shawn Livengood who runs the blog ppcwithoutpity.com. If you haven't already, check out Part 1: Getting Indexed, Part 2: Optimizing Code Tags For SEO, and Part 3: Linkbuilding.

Tracking SEO Results

Now that you know the basics of how to improve your organic search ranking, you'll need a way to track your site's results in the search engine results pages (SERPs). There are many tools that perform this function. Some are free, and some cost money. First, let's cover the free tools.

Google Analytics

You should already be using Google Analytics to track your site traffic. And you should already have created a Google Webmaster Tools account (covered in SEO Post #1). If you have both of these enabled, you can track organic search queries and ranking to your site.

To view your SEO ranking, just log in to your Google Analytics account and go to Traffic Sources > Search Engine Optimization > Queries:

Google Analytics Menu: Search Engine Optimization, Queries

Note: if this is your first time looking at this report, you may be prompted to link your Google Analytics and Google Webmaster tools accounts. Try to do this as soon as you have both accounts so that you can start collecting historical data on your SEO performance.

In this report, you can view specific queries that users typed in to reach your site, the average position on Google that your result appeared, and total clicks, impressions, and click-through rate of the traffic driven by that query in the date range of your report.

Google Analytics SEO Query report example

This report is pretty handy, but has two major drawbacks. One, it only shows ranking for searches on Google. You could be ranking well on Bing, Ask, or another search engine and you would never know it. And two, it only shows you stats on keywords you're already ranking for. This could be a problem if you have some specific keywords that you're not currently ranking for, and you need to keep tabs on them.

SEO Book's Rank Checker

This next tool is my favorite free rank checking tool. It's put out by SEO Book, who also makes the super-handy SEO Book toolbar I mentioned in the last SEO post. Rank Checker is a Firefox extension that you can download here: http://tools.seobook.com/firefox/rank-checker/.

Once you install this tool, you can click on the icon in your Firefox browser to get started. You'll get a pop-up screen where you can enter your domain and the keywords you want to track:

SEO Book Rank Checker

Then, hit "Start" to run the analysis. You'll get data on your keyword ranking in whatever search engines you have selected.

SEO Book Rank Checker: Keyword Rankings

Once you have a set of target keywords you want to track, you can click on the "Save" icon next to the domain field to save your keyword set for future reference. Just give it a name, and it will be saved in the application. Then, the next time you want to run that same set of keywords, you can click "Open" and choose your saved list of keywords to run again:

SEO Book Rank Checker: Saved Keyword Groups

If you want to keep a running tab on your keyword rankings, just set a reminder to yourself to run your saved set of keywords every week. Export the list as a CSV and enter it into a spreadsheet. This is a pretty reliable way for very small websites to keep track of their SEO ranking with no cost, and just a little bit of effort.

These free tools are pretty good for basic SEO rank tracking, but if you run a larger website with many keywords (and more money at stake), it's probably in your best interest to invest in a more comprehensive paid SEO tool. Here are two great ones that get the job done at a reasonable price.


SEOMoz

SEOMoz costs $99 a month and up, but comes with a lot of resources that make it worth it. For rank tracking, they have some simple recurring reports that will deliver weekly rank changes right to your inbox. They also have a whole suite of tools that provide SEO diagnostics, and have perhaps the most active and experienced communities of SEO pros out there. An SEOMoz membership is not only worth it for the tools, but for the community and learning opportunities as well. If you're serious about SEO, and you need powerful tools to get your website ranking, taking a free 30-day SEOMoz trial would be a great next step after you finish these posts. Reading through their forums and blog posts will give you the advanced education you need, and using their diagnostic tools will help you ensure that your site is on the right track.

Raven Tools

Raven Tools is another great paid option that is about $99 a month and has a free 30-day trial. Raven has an extraordinarily good rank tracking tool that allows you to view the historical rank of any keyword by week (after you've added it in to the tool, of course). I haven't seen any other tool at this price level that does that. It's super convenient if you have a lot of keywords you need to monitor.

Besides the rank tracking, Raven allows you to aggregate your link building contacts, Google Analytics data, and Google AdWords data all in one place. Their reporting is top-notch, and their user interface is fairly intuitive. You don't get the wide variety of diagnostic tools or the community of SEOMoz though, so these tools are best used in tandem if you can afford it.

That about covers it for SEO tools and results tracking. There are many, many, other SEO tools out there, but these just happen to be my favorites. If you use all of these tools (or at least most of them), you should have all the tools at your disposal that 90% of SEO folks use. Now, you're armed with the strategy and the tools for SEO success. So go out there, create some great content and great links and get ranking!


Post 23: CSS Resets and Organization

View the code changes related to this post on github

CSS, which stands for Cascading Style Sheets, is something that nearly everyone has heard of, most people have used, and few people understand.

The first experience many people have with CSS is to do something simple, like set the background color on a page. Many folks stop learning once they are done applying colors and borders to html elements.

The next big step for many folks is learning about affecting layout by "floating DIVs" or pursuing a "tableless" layout.

While these are practical steps in learning about CSS, I think it is important to first call out the intention of CSS. Simply put, that purpose is to help separate the page content from the page visuals.

Inline styles are valid, but bad practice

In HTML, it is perfectly valid to apply inline styles to an element, along with the rest of the attributes for that element, using the style property. Example:

<img style='border:1px solid red;height:100px;' src='...'>

As valid as inline styles are, the problem with them arises as the size and complexity of the site grows. The decision to change colors becomes tedious and error prone when that style is littered across hundreds of elements. The other major problem has to do with selector specificity, because inline styles are very specific and will almost always override the styles from your css files.

CSS Resets

Search for css reset and you will find a ton of examples, tips, and flame wars surrounding their use. Basically, the problem that css resets are trying to solve is this: different browsers (and versions of those browsers) have different default values for various styles.

These differences in default values lead to many of the headaches that you will experience when trying to get a similar look across many browsers. Shooting for an exact duplicate experience is very tough and likely futile, but using a basic CSS reset and then applying your custom styles after the reset will help save a ton of time debugging subtle differences (especially with margin and padding).
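
As a point of reference, the most bare-bones reset you will see floating around looks something like this (real resets, such as Eric Meyer's reset or normalize.css, are far more nuanced; check the css in the repo for what this site actually uses):

    * {
        margin: 0;  /* wipe out every browser's default margins... */
        padding: 0; /* ...and padding, then build your own spacing back up intentionally */
    }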

Have a look at the source code for this post to see what I changed. There are comments in the CSS as well as urls to some more useful reads.

I also split up the styles into multiple CSS files. This is bad from the standpoint of http requests, and flies in the face of previous tuning efforts that I've done, but do not fear. Soon I'll show you one way to bundle your files together for performance, and still retain the development benefits of managing separate files over giant monoliths.


Post 22: Tuning Page Load Time

View the code changes related to this post on github

This post has a lot of little tweaks that radically affect the page load time. Some of these are fairly well-known, and some are a little more obscure.

There are a ton of changes so I won't go into extreme detail about every line of code, but I will highlight the particularly interesting bits. Additionally, I wrote a fair bit of javascript util/library code, so I urge you to check out the source code to see what was done. It is commented so that you can follow along easily.

Before diving into the details, let me first state how important it is to have a quick loading site. Each year our patience for slow web experiences dwindles. Slow sites get left in the dust. I'll bounce from a site if it doesn't load in a couple of seconds. The importance of speed cannot be emphasized enough. Amazon equates each tenth of a second (100ms) of speed improvement to a 1% increase in revenue.

Don't be lazy and slow. Site speed affects consumers negatively.


Here is a list of what I did in this post, and below I'll explain the details as well as show before and after stats.

  • Removed 5 external Gist js includes and 1 css include by copying them locally and putting content into akm-gist.js
  • Moved Quantcast and ShareThis scripts, css, and images to after page load using the new utility loader in akm-util.js
  • Split my "akm.blog.init()" handler into two parts to move all non-essential javascript after load event
  • Added defer attribute to all script includes


Gists. The Gists are the snippets of code that are included in some of the posts. The default way of including them is to put a javascript file reference somewhere on the page and then wait for their server to send the file back down. That file simply contains two lines of code. The first line includes their standard css file for embedded scripts. The second line is a document.write() that just spews out the html. Because the gist js files call document.write, they cannot be loaded after page load or the entire markup for the page will get replaced. I simply downloaded each of those js files, along with the css file, and put the contents in my own javascript file (akm-gist.js) so that I could control how the html was rendered. The lesson: just because a site provides you with a javascript include file doesn't mean that you have to use it that way to gain the functionality that it provides.

Script Loader. The Quantcast script was being loaded asynchronously, but because it was inline in the HTML body, it was getting triggered before the page load event. It wouldn't stop the browser from continuing to process the rest of the page, but it still meant that page load happened after it finished loading. ShareThis was a little different: it was being loaded synchronously because there is a call that must be made after the script loads. The script loader function that I wrote takes a callback so that I could specify which code was to be run after the script was loaded.
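
The real loader lives in akm-util.js, but the core of any callback-based script loader is only a few lines. A simplified sketch (error handling and the older onreadystatechange path for legacy IE are omitted):

	function loadScript(url, callback) {
		var script = document.createElement("script");
		script.src = url;
		script.onload = function () {  //fires once the file has downloaded and executed
			if (callback) callback();
		};
		document.getElementsByTagName("head")[0].appendChild(script);
	}

With something like this in place, ShareThis gets loaded with a callback that makes its required post-load call, and Quantcast simply gets loaded with no callback at all, both after the load event.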

akm.blog.init() split. When I added the logo cube in Post 19, I added this function to be called during the window onload event. That event gets triggered, and runs any attached functions before it finishes. Since the cube code isn't critical, there is no reason to potentially wait on that code to run. That code was moved into a new function called akm.blog._initDeferred() that will be triggered a quarter second after the init function.
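
In code, the split amounts to something along these lines (simplified; the real wiring is in the repo):

	window.onload = function () {
		akm.blog.init();                          //essential setup runs right away
		setTimeout(akm.blog._initDeferred, 250);  //non-essential bits (like the logo cube) wait a quarter second
	};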

Defer attribute. In the HTML5 spec, there is a new attribute for script tags called async, and it tells the browser not to hold up processing of the html while that javascript file gets downloaded and executed. So the browser will collect these script references and make requests to the various host servers while it continues processing the page. As those scripts come in, they get executed. Sadly, not all browsers support this because it is new. But do not fret! I argue that a better alternative has existed since Internet Explorer 4. The defer attribute has existed since IE4, FF 0.7, and Opera 7. Plus it has a cool distinguishing feature compared to async: the browser will execute the deferred scripts in the order they are included. This is perfect for pages where you serve your javascript files from your own domain and are not concerned about one particular domain holding up the rest of the order.
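
Using it is just a matter of adding the attribute to each script include, for example (the path here is illustrative):

<script type='text/javascript' src='js/akm-util.js' defer='defer'></script>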


Before these changes: A test load in Chrome 22 with an empty cache: DOMContent event fired at 700ms, Load event at 1600ms.

After these changes: DOMContent event fired at 185ms, Load event at 208ms!

1.4 seconds quicker to initial load - that is pretty darn snappy. It is worth noting that this page now loads faster than it did back when it held only the first post and a single, scaled-down 1MB screenshot image.


Post 21: SEO Part 3/4: Linkbuilding

View the code changes related to this post on github

As I mentioned in Post 10: The SEO Plan, this post is the third in a 4-part series on SEO written by guest author Shawn Livengood who runs the blog ppcwithoutpity.com. If you haven't already, check out Part 1: Getting Indexed and Part 2: Optimizing Code Tags For SEO.

SEO Part 3: Linkbuilding

Now that we've covered most of the basics of SEO, it's time to talk about the most time consuming, yet most potentially rewarding aspect of SEO: linkbuilding. Linkbuilding is the process of obtaining links from other sites to your site. These links act as a "vote" for your site when search engines are evaluating your credibility. If an external site links to your site using specific anchor text in the link, that sends a signal to search engine crawlers that your site is relevant to the keywords that appear in that anchor text. Repeat the link process dozens, hundreds, or even thousands of times, and your site starts to rank higher in the search engine results for the keyword.

Of course, this process is highly abused precisely because it is so effective. Unethical SEO practitioners use "black hat" tactics like buying links from webmasters, using bots to make thousands of spam comments on blogs with keyword-rich anchor text, or even hacking sites to sneak in a link or two. These tactics work well in the short term, but they're almost always discovered by search engines in a matter of months, leading to a complete de-indexation of the offending site.

But, you don't have to resort to black hat tactics to have success. There are many legitimate ways to obtain links. Here are a few:

  1. Requesting a link from bloggers/webmasters you know personally or professionally.
  2. Adding links back to your site on your social network profiles.
  3. Creating viral videos or infographics that link back to your site.
  4. Offering products or services to bloggers for review purposes.
  5. Creating a useful widget or template that contains a link back to your site, and sharing it with a community.
  6. Posting in forums relevant to the topic of your site.
  7. Submitting your link to Reddit, Digg, Stumbleupon, or similar link-sharing sites.
  8. Writing a guest post for another blog, and including a link back to your site.

Of course, these are just the most current linkbuilding strategies. Ask me again next year, and I'll probably give you another list. The key is to avoid tactics that seem spammy to search engines. If your gut tells you that something is spammy, it probably is. Also, if the link you obtain requires no editorial oversight from a human being, or if you pay an absurdly small amount of money for a high volume of links (I'm looking at you, Fiverr), then the link is probably spammy and will hurt you in the long run. Focus on creating relevant, useful content with a relevant link back to your site and you should be good.

There are a few other factors you should consider when link building:

Dofollow vs. Nofollow

Before you start linkbuilding, it's important to note the difference between a dofollow and a nofollow link. These are attributes in the HTML code of a link that tell search engine spiders whether or not to follow that link and cast a "vote" for the site that's being linked to. Here's an example:

<a href='http://ppcwithoutpity.com' rel='nofollow'>PPC Without Pity</a>

Dofollow means that a search engine spider will go through the link and give credit to the site being linked to. All HTML links are dofollow by default. Nofollow links mean that you'll get an active link, but no real SEO benefit from the link. These days, most blog comments and social media profile links are nofollow to prevent abuse from SEO spammers. It certainly doesn't hurt to have a few nofollow links, though (see the Natural Link Profiles section for more on this). Even if you don't get any SEO benefit, you may obtain some valuable referral traffic.

Keywords In Anchor Text

As I mentioned earlier, having keywords within a link's anchor text affects how the linked-to site ranks for those particular keywords. Basically, if you want to rank for a particular keyword, get some links that contain that keyword in the anchor text. Partial matches of the keyword have an effect as well. For example, if you wanted to rank for the keyword "plaid armadillo," you would probably get some ranking benefit from links that contained the anchor text "armadillo that is plaid" or "plaid baby armadillo".

But, you can overdo this. If you have too many links with the same anchor text, it looks really spammy to the search engines and you may be penalized. As a general rule, you should have more links back to your site that contain your site's name or URL than the quantity of links that contain only keyword anchor text.

Natural Link Profiles

In a perfect world, the best sites for particular topics would automatically get ranked well for the keywords relevant to the site. These sites would naturally obtain links from a variety of sites - blogs, directories, school and library resource pages, citations from other websites, etc. Unfortunately, we live in a world where the people who are most adept at SEO get the best rankings. But still, search engines favor a link profile that looks like our perfect world scenario. If you have too many links from one type of site, too many instances of a particular anchor text, or if you build a lot of links unnaturally fast, you may end up being penalized by the search engines for being a spammer. Instead, pay attention to the kinds of links you build to your site. Build a lot of links with variety. Include both dofollow and nofollow links, make sure you have more brand name/URL anchor text than any one keyword, and don't build all your links from one particular type of site.

Now, let's talk about how to select which sites to approach for links. Not every site is worth your time to obtain a link from. The better a site's quality, the more SEO benefit it will confer on any site it links to. Here's how to tell if a site will provide SEO benefit:

Domain Authority/Page Authority

Domain Authority and Page Authority are two proprietary SEO metrics created by SEOMoz, one of the leading SEO tool providers in the market. These metrics tell you (on a scale of 1 to 100) the "authority" of a domain or individual site page based on the number of links and quality of links pointing to that URL. To see a site's authority metrics, you'll need the SEOMoz toolbar. You can get it here: seomoz.org/seo-toolbar.

Before you get a link from a site, check the page that your link will appear on. If the site has a Domain Authority below 20 or 30 and the Page Authority of the page containing your potential link also falls below that threshold, the link probably won't do much for you. Generally speaking, the higher the Domain Authority and Page Authority of a link source, the more benefit it will confer upon the linked-to site.


PageRank

PageRank is a similar metric to Domain Authority and Page Authority, but it takes into account on-page SEO factors as well. However, it has two drawbacks. One, it is only valid for Google ranking (PageRank is a Google proprietary metric). And two, PageRank is only updated for sites every couple of months. So if a site has been penalized for bad SEO practices recently, their PageRank might not reflect that.

PageRank is kind of an outdated metric at this point, but it's still good as a general sniff test to see if a site or page will provide SEO benefit. Sites with a PageRank of 2 or higher are generally pretty good candidates for linkbuilding. As with Domain and Page Authority, the higher the PageRank, the better. There are many tools to find a site or page's PageRank. My personal favorite is the SEOBook toolbar. It has a lot of other neat bells and whistles for SEO pros, too.

Cache Dates

If you get a link from a site, but Google never recrawls the page containing that link, it won't do you any good. That's why it's important to look up a site's latest Google cache date. This metric can be obtained from the SEOBook toolbar mentioned above. If the cache date is more than 30 days in the past, that should be a red flag. Try to avoid getting links from pages like that, or else you're going to be waiting a long time for any SEO benefit.


Post 20: CSS Sprites

View the code changes related to this post on github

Alrighty, another round of performance enhancements. This time we'll go over CSS Sprites.

The concept of CSS Sprites is simple. Instead of downloading 1 real image for each individual image that you see on the screen, we'll actually pack multiple images into 1 file and use CSS to only show the portion of the image that we want to show. The primary benefit is the reduction of http requests, which results in faster load times. A secondary benefit is that the browser will use less RAM.

Let's take the little icon images on this blog for example. There are icons in the header and footer for RSS, Twitter, and GitHub. There are 16x16 and 32x32 pixel versions as well. That is 6 icons for a total of 5.73KB after compression. Now let's put them in a simple sprite:

6 site icons in a sprite

The new sprite has an 11% smaller filesize (5.11KB). Even better though, we'll save 5 http requests. I also did this for the little post thumbnails. There were 19 of them before this post, weighing in at a whopping 192KB. After putting them into a single sprite, the filesize actually got a bit bigger (209KB), but 19 http requests were saved.

I was also curious to see if I could reduce the image quality on the screenshot thumbnails without having them suffer too much visually. The end result is that I determined that an 8bit color depth was nearly as good as 24bit color depth in the thumbnails, and the filesize dropped from 209KB to a tiny 32KB! That is 160KB less than the originals, plus 19 fewer http requests. Win-win. But what does the HTML look like?

Original: <img src='img/blog/icons/icon-github-32.png'>

Sprite: <img src='img/blog/clear.gif' class='img-icon-github-32'>

As you can see, the required change is very minor. First I set the image source to a clear gif, and then I set the class to one that is similar to the image filename.

The CSS is fairly straightforward as well:

    .img-icon-github-32 {
        width: 32px;
        height: 32px;
        background-image: url(../../img/blog/sprites/blog-icons-all.png);
        background-position: -32px 0px;
    }

Naturally, I made a command-line tool to make sprites for me because doing them by hand is tedious and error prone. Plus it will generate the css so I don't have to write that either. I also put in the option for limiting the color bit depth to 8 bits. After testing that functionality with ImageMagick, I decided to use a custom quantization algorithm, which resulted in a smaller filesize and a much better looking image.

The next step was to update the build script for this site. Example:


I know that is a gnarly block of command line, so I'll break it down:

  • imgsprite.exe: name of the tool
  • -in:../../aaronkmurray-blog/img/blog/icons/*.png: go grab all of the png icons
  • -img-out:../../aaronkmurray-blog/img/blog/sprites/blog-icons-all.png: put them in a new file named blog-icons-all.png
  • -css-out:../../aaronkmurray-blog/css/sprites/blog-icons-all.css: make a new css file called blog-icons-all.css
  • -css-class-name-prefix:img-: prefix the css class name with "img-"
  • -image-deploy-url-base:../../img/blog/sprites/: base url for the sprite
  • -gen-test-html:true: make a test html page to view all of the sprite images at once
  • -test-html-path:../../aaronkmurray-blog/test/sprites/: this is where the test html page goes
  • -test-html-deploy-url-base:../../img/blog/sprites/: use a special base url for the sprite in the test page because the paths are relative (for now until CDN)

There is one of these calls to imgsprite.exe in the build script for each sprite that I want to create. You can view the test page for the icons and screenshots if you're interested. These allow me to visually do a sanity check on the results, as well as provide me with a nice way of finding the css class name for each sprite.

Before the sprites, a hit to this page had 52 http requests and a payload of 667KB. After the sprites it was 33 http requests and 506KB.

And just for grins, I decided to quantize 5 of the former post images that didn't need to be lossless just to see how much they'd shrink. The result: 248KB originally down to 88KB for a savings of another 160KB. Sweet!

So there you have it. Another tool to add to your collection. CSS Sprites and further image reduction options without breaking a sweat.


Post 19: Fun With A CSS3 Cube

View the code changes related to this post on github

In this post I will show you how I did the "3D" CSS cube for my logo. The code for the changes is on GitHub (as usual).

Site Logo represented as a 3D Cube using CSS3 Transforms

I won't do a full write-up because there is already a nice series of CSS 3D Transforms articles written by desandro. If you are interested, you should read the article for full details on how to do it yourself.

The summary of interesting bits for how I implemented the logo cube is:

  • HTML: Create a wrapper DIV that will represent the "cube"
  • HTML: Create 6 DOM Elements inside the wrapper that will represent the 6 sides/faces of the cube
  • CSS: Style the cube for size and 3D transform
  • JS: After the page loads, set up a timer that will rotate the cube every few seconds (a rough sketch of this follows below)
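
The timer portion boils down to something like this sketch (the element id, class names, and interval are illustrative, not the exact code or markup this site uses):

	var cube = document.getElementById("logo-cube");
	var faces = ["show-front", "show-back", "show-right", "show-left", "show-top", "show-bottom"];
	var index = 0;

	setInterval(function () {
		cube.className = faces[index];       //each class applies a different rotateX/rotateY transform in the CSS
		index = (index + 1) % faces.length;  //cycle through all 6 faces
	}, 3000);                                //rotate every few seconds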

Fallback when browser doesn't support CSS3 Transforms

Because having a rotating 3D cube for my logo isn't "critical" for this site, I didn't concern myself with having the same experience for all browsers. In this case, browsers that don't support CSS3 Transforms will not show the 3D rotation. They will simply show the "last" image in the stack of "sides", which will result in a single static logo being displayed.

There are two common ways of viewing this concept: Graceful Degradation and Progressive Enhancement.

  1. Graceful Degradation means that if a feature isn't supported, an acceptable fallback will occur.
  2. Progressive Enhancement is the opposite way of viewing the same situation. A basic experience is defined, and when possible, enhancements to that experience will be provided based on browser support.

The differences are subtle, but meaningful.

  • Progressive Enhancement implies that all basic functionality will exist, and then extras will be added on to make the experience better. Example: The logo for this site will be a 2D image, unless the browser supports a 3D rendering via CSS.
  • Graceful Degradation typically means that when failure occurs, it doesn't completely kill the experience, but that experience may not be the same. The primary example of this is the use of a text message to the user when Adobe's Flash plugin is not installed/available, instead of providing an alternative way to see that content.

It's up to you to choose when features are "critical" and when they are simply icing on the cake.


Post 18: SEO Part 2/4: Optimizing Code Tags For SEO

View the code changes related to this post on github

As I mentioned in Post 10: The SEO Plan, this post is the second in a 4-part series on SEO written by guest author Shawn Livengood who runs the blog ppcwithoutpity.com. If you haven't already, check out Part 1: Getting Indexed. And after you finish reading this post, have a look at the code changes for this post in GitHub to see Shawn's suggestions at work on this site, as well as some html tag changes to move towards cleaner, semantic markup.

SEO Part 2: Optimizing Code Tags For SEO

It's a common misconception that there are some magical code tweaks that you can make to your site to "do" SEO. Unfortunately, this is not the case. About ten or fifteen years ago, when search engines were just starting to achieve mainstream popularity, there were a lot of secret tweaks you could do to fool search engines into thinking your spam site was the most relevant page for a specific search query. Fortunately for search engine users, search engines have closed many of these loopholes, and the effect of on-site signals in determining keyword relevance has been somewhat diminished.

However, they have not been eliminated entirely. There are still a few code tags that you can enter keywords into in order to influence search engine ranking. Sometimes, if you're targeting a key term with little or no competition, you can even reach a #1 rank through code optimization alone. Let's take a look at some of the code tags that still influence SEO rank.

"META" Tags

<meta> tags appear in the <head> section of each page on your website, and provide metadata to browsers and search engines about what your page is about. There are lots of different <meta> tags, but there are only two that have much bearing on SEO.


<META> Title

The <title> tag defines your page's title, and most browsers render the title as the text in your browser tab that labels the site. It's also the text that appears in the hyperlink on a search engine results page (SERP). In the code, it looks a little something like this:

<title>PPC Without Pity - A PPC Marketing Blog From A Merciless Perspective</title>

Different search engines and browsers have different character limits for the title tag (or at least, limits on what text they display). This limit varies between 65 and 72 characters. So it's best to keep your <title> tag length around 60 characters or less.

You can insert keywords into your title tag. These keywords seem to have a powerful influence on ranking. In fact, optimizing your <title> tag for targeted keywords is probably the easiest thing you can do to your site that will have a distinct SEO impact. But, that doesn't mean you should just cram a mess of keywords into your title tag. Pick one or two phrases you want to target, then add your site's title after that (or if it's your home page, put your site's title first).

<META> Description

The <meta> description tag also appears in your site's <head> section. This tag provides the words that appear below your site's hyperlinked title in the SERPs. Or at least, it does most of the time - if your <meta> description is spammy or nonexistent, the search engines may replace that block of text with a different block of text that appears within your page. A <meta> description tag looks like this in the code:

<meta name="description" content="A PPC blog dedicated to pay per click advertising, Google Adwords, Yahoo Search Marketing, MSN AdCenter, and other pay per click advertising formats and accounts." />

<meta> description doesn't have any direct bearing on your search engine ranking. That particular loophole was closed years ago. But, since the <meta> description appears on the SERPs, it will influence whether or not a user clicks on your site. If you have a really appealing description tag, you may end up getting more clicks than other sites on the results page with an irrelevant description. Treat your <meta> description as ad copy to entice a user to click on your result.

Like the <meta> title, this tag has varying character limits depending on the search engine. But if you keep your <meta> description below 150 characters, you should be good.

<META> Keywords

The <meta> keywords tag also appears in the <head> section, but this one has absolutely no effect on anything, and it's worth mentioning for exactly that reason. This meta tag was highly abused back in the day, so search engines have disregarded it entirely. If someone is trying to give you SEO advice by telling you to optimize for <meta> keywords, it's probably a good indicator that they don't know what they're talking about.

Heading Tags (H1, H2, H3, etc.)

Heading tags were used more frequently in pre-CSS web development to segment content sections. But now that CSS does that trick, heading tags are somewhat obsolete. Still, there is some evidence that search engines use keywords within H1, H2, and occasionally H3 tags to determine ranking. H1 tags pass the most influence, and H2s pass a little. So if you're using H1 tags, it's a good idea to give some thought to which keywords you use in them.

Image Alt Tags

When you include an image, you have the option to add an "alt" attribute that appends a text description to the image. It looks like this:

<img src="http://ppcwithoutpity.com/wp-content/uploads/2012/09/adwords-seller-ratings-example.jpg" alt="adwords seller ratings example" width="224" height="103" class="aligncenter size-full wp-image-943" />

Alt attributes were created to aid in screen-reading programs for blind web users. Having a picture doesn't really help describe anything if you're unable to see it. But guess who else can't see images to tell what they're about? Search engines! Search engine spiders use the text in the alt attribute to determine the topic of your image. If you put some keywords within your alt attribute, then it could help those keywords rank for the page they appear on.

A Final Note On Keyword Content

Including keywords within these code tags is important, but let's not forget the most important place to put your keywords: your content. If you want to rank well for a keyword, it needs to appear within your content, preferably somewhere within the first paragraph or so. But, that doesn't give you a license to just pump your page content full of the same keyword. A good sniff test is to have someone else read your content and ask them if any keyword appears to be repeated too often. You could potentially be penalized for having a spammy site if your content seems stuffed full of keywords, so you really need to walk that fine line between using your targeted keywords and not using them unnaturally.

-Shawn Livengood


Post 17: First Month Retrospective

View the code changes related to this post on github

I'm a big fan of postmortems, and find myself reading lots of them from sites like Gamasutra.com.

The great thing about doing a postmortem is that it helps reinforce and solidify the learning experiences from a project while they are still fresh.

It has been a month since I started this blog, so I figured that a look back at the project was in order.

What Went Right

  • Lots of posts. I was worried that I wouldn't find enough minutes in each day to construct a decent post.
  • Variety of posts. I've covered various topics, from entry-level HTML and CSS, to build scripts and tools.
  • RSS Feed. Not getting one up quickly was my biggest fear when doing a blog project from scratch.

What Went Wrong

  • RSS Feed. The initial feed had bugs that spammed all posts out as new posts each time I did an update.
  • Not enough pictures. Looking back at the posts, many of them are huge walls of text.
  • Build/Commit/Build process. I still have to do 2 GitHub commits for each post because each post has a link to its own commit. Still trying to figure that one out.


As much fun as I am having with this project, there are still many frustrations and things that "feel wrong" every time I do them.

  • Not having a traditional database feels yucky/scary
  • Copying and Pasting my post template with each post feels wrong and is prone to error
  • Not using code that I've already written to achieve things that I want to do feels wasteful
  • Writing everything from scratch feels tedious (yet liberating) at times
  • My Build/Release process still has a couple manual steps


What I Learned

  • Different is okay. I was so used to doing sites a certain way with a wealth of frameworks that I've built up and leveraged over the years that it was scary and foreign to go back to a single HTML page as a starting point. Now I am embracing the process. With each post I challenge myself to achieve the intended end result in a way that I have not done before.
  • Database != DBMS. Thinking about the term "database" without meaning mySQL, SQL Server, or Raven is really odd. Currently, the database for this site is index.html. That is a new paradigm of thinking for me, and it has led to some radical thoughts that I plan on exploring in the future.
  • New process is hard. It actually isn't the process that is difficult as much as challenging my brain to be willing to do things that I've done a dozen times in a new and different way.

What Is Next?

This is the constant question. There are many things that I have listed out in my project notes. Here is a quick sample of things on my short-list:

  • Tech: use a CDN
  • Tech: js and css minification, bundling, versioning, and debugging
  • Tech: css sprites/images
  • Tech: html minification
  • Tech: url-rewriting
  • Tech: automated testing
  • Tech: reporting
  • Tech: figure out what "database" means
  • Tech: automatic seo analytics capture
  • Tech: server-side rendering and client-side MVC
  • Feature: tag cloud
  • Feature: permalinks
  • Feature: post paging
  • Feature: post comments
  • Feature: social integrations
  • Feature: decent UI design

If there are other things that you'd like to see, hit me up on Twitter.


Post 16: HTML Markup Cleanup

View the code changes related to this post on github

This post is a little bit of housekeeping and HTML fundamentals. It will touch on a few of the "smaller" questions that come up when writing HTML, and then show how to use an automated tool at build time to get a report on the basic structure of our HTML (looking for errors, etc.).

A micro-history of HTML

Back in the old days, browsers were fighting to provide the best experience possible as we were discovering the possibilities of the Internet.

At their core, browsers are essentially fancy text parsers. They get HTML from a server, and then try to interpret that text and turn it into a nice picture for a human.

It sounds simple enough, but instantly it was troublesome. As it turns out, humans aren't perfect when it comes to creating nested hierarchies of nodes and text. And back then, many sites were hand-made similarly to how this blog is currently being developed. Despite our best efforts, we humans still get it wrong.

Browsers needed to be not only great at parsing HTML, but even better at deriving the author's intention in the midst of the author's own HTML mistakes. This was part of the reason for all of the old nasty doctype declarations (see next section). The goal was to give the browser a hint about what the author intended.


Let's start from the top: doctype.

Doctype is a special tag that basically tells the browser what type of document it should expect to parse. As such, it has to be the very first thing in the document.

Here is the modern example: <!doctype html>

There are all sorts of different doctypes out there, but the skinny is: if you are making a new site, or you are fortunate enough to not be concerned with supporting ancient browsers, then the doctype example from above is all you need to use. Done. Simple.

If you live in the e-Commerce world, work on intranet apps, or have an affinity for providing support to folks who still use Netscape 4.7, then you have a decision to make.

I won't go into the details here, but it's likely that you'll be using HTML 4.01 Transitional or HTML 4.01 Frameset.
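
For reference, those two legacy declarations look like this:

  • <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
  • <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">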

Well, just as authors can make mistakes in the markup, they can also make mistakes when choosing a doctype. Furthermore, most pages on large sites are actually generated on the fly by combining smaller chunks of html into one large page. This makes it especially hard to determine which doctype should be used. The code that chooses the doctype may be aggregating HTML from a source without knowing if that HTML is going to use FRAMESETS. Ugh.

The old intention was noble, but ultimately flawed. So now we just go back to basics, and tell the browser that it should expect HTML.


The first thing you learn when you start researching (old) doctypes is that in addition to HTML, there is an option for XHTML.

To put it simply, XHTML is strictly-written HTML. It doesn't allow for mistakes. It removes the interpretation part from the lenient HTML structure parsing that browsers do.

Again, this was originally intended to combat the wild-west, poorly written HTML that originally dominated the Internet. The moral of this story is: chances are likely that you will never have to know or care about XHTML. Be thankful for that.

Tag and Attribute Names: uppercase vs lowercase

Easy: it doesn't matter. UPPER CASE has a way of crying out for attention, and typically when I am reading HTML I am more interested in the attribute values than in the tag and attribute names. Because of this, I prefer lowercase. These are functionally equivalent:

  • <!doctype html>
  • <!DOCTYPE HTML>

DOM Element Identification: "id" vs "name"

Both id and name are attributes for DOM elements that allow you to identify certain nodes. There are differences, but this guideline will take you a long way: use id instead of name to uniquely identify elements.

When should you use name then?

  • Only on form elements that you want to submit to a server
  • Only on the tags: a, form, iframe, img, map, input, select, textarea
  • Example: <input type='text' id='txtAddress' name='txtAddress' ... />

It is okay (even advisable) to also use the id attribute whenever you use the name attribute. My standard mode of operation is to always use the id attribute, and then additionally use the name attribute on form input fields.

Attribute values: with or without quotes?

Simply put: use quotes if the attribute value has a space in it. If the value comes from a database or other external source, then use quotes and make sure to escape any quotes that may appear in that value by using &#39; for single quotes and &quot; for double quotes.


  • Right: <input type='text' value='Aaron&#39;s House' name='txtPartyLocation' ... />
  • Wrong: <input type='text' value='Aaron's House' name='txtPartyLocation' ... />
  • Right: <input type="text" value="The &quot;Good&quot; Son" name="txtNickname" ... />
  • Wrong: <input type="text" value="The "Good" Son" name="txtNickname" ... />

Attribute values: Single quotes vs Double quotes

Short answer: either. Individual style preference. Functionally they are the same. If you are working in a big project that uses double quotes, use double quotes. And vice versa.

Personally, I am conflicted. I prefer single quotes because it "cuts down on the visual noise" when I'm looking at HTML and javascript, but the flipside is that I write a lot of C# code and strings in C# are wrapped in double quotes. These days I find myself using single quotes most of the time.

Tools to help

Above I mentioned that web browsers work by parsing text/HTML and turning that into a visual that humans can understand more easily. There are tools that we can use to pre-parse the HTML and then warn us of the glaring structural errors. For this example, I'll show a tool called html-tidy5 that can be run as part of our build process.

There are also online tools that you can play with to see example results.

I added tidy as the first step in the build script for this site. If it finds errors or warnings when it runs, it will cancel the build and open notepad to show a list of problems. Those can then be fixed, and the build can be run again.
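
To give a concrete picture, a build step along these lines would do it (this is my own sketch, not the actual script from the commit; tidy exits with 0 when clean, 1 on warnings, and 2 on errors):

    REM Hypothetical build step: validate index.html with html-tidy before continuing
    tidy -quiet -errors -file tidy-report.txt index.html
    IF ERRORLEVEL 1 (
        notepad tidy-report.txt
        EXIT /B 1
    )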

Here is a sample of what html tidy found on this page:

  • Illegal closing tag </span> for <li>
  • A quotation error in my meta name=description tag after refactoring double quotes to single quotes
  • target='_target' instead of target='_blank'
  • missing alt tags on images

I was able to quickly go back and make edits to correct the bugs that I had created. I did have to make a change to the automatic timestamp HTML because tidy flags empty span nodes as warnings. The choice was either to ignore warnings altogether, which would leave me vulnerable to dozens of other warnings that it found, or change my process to stub out a value that was easy to detect. I chose the latter, so now my empty timestamp stubs have a question mark in them, and look like:

  • <span class='post-timestamp'>?</span>

This new process is a simple way to ensure that I maintain a decent quality of my code as the project gets bigger.


Post 15: CSS Includes

View the code changes related to this post on github

Back to basics. I've blogged a couple of times already about the importance of reducing the amount of data that has to be downloaded. Some of you have noticed that up until now, the CSS styles for this blog were still embedded in the HTML markup.

First, let me explain the seemingly odd order. There was less than 2KB of CSS in the page, as compared to hundreds of KB in images. I also wanted to keep the style history right in index.html for a while, so that learners looking through the GitHub commit history could easily see which CSS changes caused the visual differences in the first few posts.

In the big picture of site performance, including CSS styles in separate files provides the following benefits:

  • The styles can be used by different pages on the site (code reuse)
  • The CSS files can be cached by the browser so that they do not have to be re-downloaded with each page view
  • The CSS files can be served from a different server in your network, or even a different network entirely (like a global CDN)
  • The CSS files can be compressed, even if your HTML content is not

The downsides of including CSS styles in separate files are:

  • Initial (empty-cache) page load takes (slightly) longer because of the extra requests
  • If external files are served from a different domain/subdomain, then there is also an extra (relatively slow-ish) DNS lookup
  • Browsers need to know when the file was last changed in order to not use outdated/changed files

In practice, the upsides outweigh the downsides considerably. So let's get started!

First, start off by making a new text file. I'll call this blog.css for the sake of simplicity and place it in a folder called css. Then simply add some html to the <head> section to let the browser know that it needs to download those styles and use them in the page:
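
A single <link> tag does the trick. Something along these lines is all that's needed (a minimal sketch using the css/blog.css path from above):

  • <link rel='stylesheet' href='css/blog.css'>

The styles that used to live in the inline <style> block can then move into blog.css unchanged.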

Final note: the one truly notable downside with external file includes has to do with browsers caching files and dealing with the scenario where a visitor has been to your site before. In that scenario, the browser will nearly always try to use a cached version of the file, but may fail to recognize, for various reasons, that the file has been updated and that it should use the latest version from the server instead of the one it has saved locally. This can cause users to view your site with the old files, and is usually the reason you hear "clear your cache" or "restart your browser/computer" as a first web-debugging step.

The best and most reliable way around this is to put simple version numbers in your actual filenames so that the browser always tries to fetch a file that has changed, but that can be cumbersome to maintain. Example:

  • <link rel='stylesheet' href='css/blog-version-123.css'>

Other, less ideal, methods include:

  • Include a querystring parameter after the filename that changes with each version: <link rel='stylesheet' href='css/blog.css?version=123'>
    1. Problem: The file won't be automatically cached for you by external/regionally distributed networks/routers/switches.
    2. Problem: The browser isn't guaranteed to actually fetch the new version
  • Specify a Cache-Control Response header: Cache-Control: max-age=3600, must-revalidate
    1. Problem: You need to have a good estimate of how frequently files change to set a "good" max-age value in seconds (or any of the other directive values)
    2. Problem: The browser doesn't always obey these headers for various technical reasons
  • Specify an ETag Response header: ETag:"1edec-3e3073913b100"
    1. Problem: This value needs to change when the contents of the file change
    2. Problem: The browser doesn't always obey these headers for various technical reasons
  • Specify an Expires Response header: Expires: Thu, 20 Sep 2012 16:00:00 GMT
    1. Problem: You need to have a good estimate of how frequently files change to set a "good" Expires value in GMT date format
    2. Problem: The browser doesn't always obey these headers for various technical reasons

In the worst case scenario of my versioned filename approach, the user will refetch the newest version of the file with each page load. In the worst case scenario of the other methods, the user ends up with the wrong version of the file. In a future post, we'll go over some automated ways of handling this situation.


Post 14: SEO Part 1/4: Getting Indexed

View the code changes related to this post on github

As I mentioned in Post 10: The SEO Plan, this post is the first in a 4-part series on SEO written by guest author Shawn Livengood who runs the blog ppcwithoutpity.com.

SEO Part 1: Getting Indexed

Before you can start seeing spectacular SEO results on your site, first you have to let the search engines know that your site is there. There are a few ways to go about this:

  1. Google Webmaster Tools submission
  2. Bing Webmaster Tools submission
  3. Creating an XML sitemap
  4. Getting a link from an influential, recently-cached site

Let's go through the how-to of each one.

Google Webmaster Tools

  1. Go to google.com/webmasters to create an account.
  2. Once you create an account, click on the "Add A Site" button to add your URL.
  3. After you enter the URL you want to add, you'll be asked to verify that you own the site. You can do this via several different methods: through your domain name provider, uploading an HTML file to your web server, adding a special META tag to your homepage header, or by linking your Google Analytics account. Different sites and hosting configurations have different interactions with each of these verification methods. But, the most reliable (and easiest) method in my experience is the META tag addition.
Google Webmaster Tools: Verify Site
Google Webmaster Tools: Verify Site Ownership Options
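
If you go the META tag route, Google gives you a small snippet to paste into your <head> section. It looks roughly like this (the content value below is a made-up placeholder; use the exact tag Google generates for you):

  • <meta name='google-site-verification' content='your-unique-token-from-google' />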

Once your site is verified, you will have access to a set of tools that will help you diagnose SEO issues with your site, track inbound links and search queries, and create ways to help Google index your site. I won't get into all the ins and outs of Google Webmaster Tools here (that would take a whole series of posts), but I do want to cover a few settings that will help get your site indexed initially.

Click on the Configuration section in the navigation, and select "Settings." You have a few options on this page. You can select your geographic target here. This helps Google understand what your local language is, and which international Google search engines should give your site priority. Also on this page, you can choose a preferred domain: you can state that you prefer your domain with or without "www." This will help prevent duplicate content issues by defining one canonical version of your domain name in Google's system. The third option on this page is the crawl rate. If you just added your site, you probably won't have the option to change this just yet. But once you get some traffic, you can return to this page to define a suggested crawl frequency for Google's spiders to re-index your site. Of course, this is just a suggestion to Google - there's no guarantee they'll actually follow your instructions.

In Google Webmaster tools, you can also upload an XML sitemap. You can perform this task in the Optimization > Sitemaps section of your account. We'll go into this a little bit more in the sitemaps part of this post.

Bing Webmaster Tools

Bing may not be as popular as Google, but it still gets enough user traffic that it makes sense to have your site indexed there. Fortunately, Bing also has a webmaster tools account where you can show it how to index your content.

  1. Go to bing.com/toolbox/webmaster
  2. Enter the URL of the site you want to add at the top of the page.
  3. Fill out the form on the next page with your personal information. You can also add a sitemap URL on this form, if you have one.
  4. Once you save your info, your site will appear on your Bing Webmaster Tools dashboard. But, you still need to verify it. Click on the "Verify Now" link next to the site URL.
  5. Bing offers you three verification methods: you can upload a special Bing XML file to your root folder in your hosting account, you can verify via a special META tag on your homepage <head> section, or you can add a unique CNAME record to your DNS.
Bing Webmaster Tools: Verify Site Ownership
Bing Webmaster Tools: Verify Site Ownership Options

Bing Webmaster Tools also has a lot of options to help your site get crawled. You can submit sitemaps, submit individual URLs, and define the crawl rate of your site. All of these options help your new site become more findable by search engine spiders.

Creating an XML Sitemap

As I mentioned above, creating a sitemap is an important part of getting a website crawled by search engine spiders. First, some clarification: just adding a sitemap will not make your site more findable. What a sitemap does is provide a roadmap for crawlers that arrive on your site, helping them find all of the pages within your domain. A crawler has to reach your domain in the first place for a sitemap to help, and uploading that sitemap will not help anything find your domain. But, the sitemap does play an important role in assisting web crawlers with finding all of the obscure, deeply-buried pages within your site. And the more pages on your site that get found, the more pages that have the potential to show up on a web search.

If you have a small (< 500 page) site, you can create a sitemap for free using the tool at xml-sitemaps.com. Just follow the instructions on the page and you should be good to go. If you have a larger site, you may need to run a program on your web server to index and create all the entries on the sitemap. Google has a tool for this (in beta, of course) at this URL: code.google.com/p/googlesitemapgenerator/. There are also dozens of other tools out there to create sitemaps, so finding an easy way to make one is only a Google search away.

Once you have a sitemap, you'll need to upload it to your web server. It must reside at this address: www.your-site-name-here.com/sitemap.xml. If you have to gzip your sitemap due to size, the URL of www.your-site-name-here.com/sitemap.xml.gz is also acceptable. Whichever URL you go with, make sure to enter this URL in your robots.txt file to ensure that the search engines know where it is. And just to be extra sure, submit that sitemap to both Google Webmaster Tools and Bing Webmaster Tools.
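
For reference, a bare-bones sitemap.xml looks like this (the URL and date are placeholders):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <url>
            <loc>http://www.your-site-name-here.com/</loc>
            <lastmod>2012-09-20</lastmod>
        </url>
    </urlset>

And the robots.txt pointer is a single line: Sitemap: http://www.your-site-name-here.com/sitemap.xml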

Getting An Influential Link

Even after all this work, it may take a while for the search engines to find your site. To speed up this process, it helps to get a strong initial link to get the ball rolling. You'll want to get a link from a site that gets cached frequently. If crawlers return to a site frequently to check for new links, the link to your site should be found quickly, meaning that the crawler will reach your site via the link and add it to the search engine's index as soon as the page with your link is cached. Also, make sure that the link you get is dofollow - a crawler will not pass through a nofollow link, negating the benefit of indexation.

Getting links isn't exactly easy. But, maybe you have an established site that gets decent search traffic. Or maybe you know a friend who has one. You can even reach out to an influential blogger that you admire and ask them nicely to give you a link to this new project you're working on. Be creative in your linkbuilding, and you will be rewarded.

To check on when a page was last cached by Google, you can use this tool: searchenginegenie.com/tools/google-bot-last-accessed-date.php. Or, the SEOBook toolbar has this functionality within their browser extension. You can download it here: tools.seobook.com/seo-toolbar/. Remember to check the cache date of the page where your link will appear. Homepages tend to get cached pretty frequently, while individual post and category pages do not.

-Shawn Livengood


Post 13: Image Thumbnails

View the code changes related to this post on github

Way back in Post 8 I mentioned 2 ways to reduce the impact of images on bandwidth. The method I tackled then was to automatically compress the png images in the build script. In this post, I will show you how to further reduce the impact by creating and displaying thumbnails instead of scaling down the original image using HTML (<img ... width=100 />).

Request payload before thumbnails
Request payload before using thumbnails (inspected using Firefox, FireBug, and YSlow plugin)

As you can see from the chart, fully 98% of the data that users have to download from the site is for images. Of those 24 images, 12 were fullsize blog post screenshots, which weighed in at a portly 1.17MB. Those 12 little screenshot previews accounted for 77% of the size of the entire page - and that is after compressing the images cut their filesize by about one-third.

Resizing a sample image down to 100 pixels wide produced a new preview image that was 90% smaller than the original. The prospect of shaving 90% off of the 77% of the request payload that these previews represent got me excited.

Given that I still despise the copy/paste portion of creating new blog posts, and knowing that I don't want to make it harder on myself to release a blog post, I wanted a solution that was 100% automated. There already exists a build script for this site so I knew that I wanted to tie into that step.

loading gist...

Notes on the batch file (a rough sketch follows these notes):

  • The FOR loop gets a list of all of the screenshot files that don't have "thumb" in the name
  • SETLOCAL ENABLEDELAYEDEXPANSION is special inside of loops so that variables can be set with each iteration
  • A new thumbnail file is only created if one does not exist already
  • convert.exe file comes from the free image utility library ImageMagick
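
The gist above has the real script; as a rough sketch of the approach (the folder paths, the -thumb suffix, and the 100-pixel width here are my assumptions for illustration), it could look something like this:

    @echo off
    SETLOCAL ENABLEDELAYEDEXPANSION
    REM Loop over the post screenshots, skipping any file with "thumb" in its name
    FOR %%F IN (img\blog\screenshots\*.png) DO (
        ECHO %%~nF | FINDSTR /I "thumb" >NUL
        IF ERRORLEVEL 1 (
            SET thumb=img\blog\screenshots\%%~nF-thumb.png
            REM Only create the thumbnail if one does not exist already
            IF NOT EXIST "!thumb!" convert.exe "%%F" -resize 100 "!thumb!"
        )
    )
    ENDLOCAL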

The result is that the sum total filesize of the first 12 post thumbnail images is 111KB (a savings of over 1 megabyte). I also removed the width=100 attributes from the previews as they are no longer necessary. Not too shabby for a few lines of code added to the build script.


Post 12: Favicon

View the code changes related to this post on github

Alrighty. Today's post is simple - but something that is very visible to users. The Favicon.

A Favicon is the little icon that appears in the browser tab/address bar.

Favicon browser comparison
Favicons as they are shown in Firefox 14, Chrome 21, and Internet Explorer 9

In 1999, Microsoft introduced Favicons for the purpose of having an icon to display in the Favorites (bookmarks) menu in Internet Explorer. Two things were done that (nowadays) go against some web principles:

  1. Use of the Windows .ico file format
    • example: favicon.ico instead of favicon.jpg
  2. Default convention for the file location off the root of the site's domain URL, which meant the location didn't have to be specified in HTML
    • www.aaronkmurray.com/favicon.ico
    • This limits the webmaster's ability to place the file anywhere, or even on a different server, without mapping OS folders or making URL-rewrite rules (we'll cover those later)

As a result, even though you can now specify any location and filetype that you want for your favicon, I still recommend serving an actual favicon.ico from your root for 2 reasons:

  1. Many browsers and RSS readers will still make requests to this location looking for an icon
  2. You'll cut down on the 404 (File Not Found) error noise that will show up in your hit logging

Adding a modern Favicon is simple: <link rel='icon' href='/favicon.png' type='image/png' />

But you should still slap a favicon.ico in your root.

If you don't know how to make a .ico file, you can use a free site like convertico.com to upload an image and get an .ico file back.


Post 11: RSS Fix to Stop Spamming Readers

View the code changes related to this post on github

Bugs! Already there are bugs :)

A colleague of mine mentioned that whenever I release a new post, his feed reader shows all of the posts as new posts.

Google Reader showing multiple new posts with each post

This is an interesting problem caused by the fact that I wasn't defining a <guid> element in the RSS feed, nor a corresponding <id> element in the atom feed. These elements are what feed readers (like Google Reader) use to determine if a post/entry is new or not. If the entry doesn't have an ID, it'll always be treated as new. Obviously I need to add these Ids.

So, how should I choose a unique Id for each post? Some say that a TAG Uri should be used. While that seems like a nice way to ensure the creation of a unique id, I don't really want to put a lot of effort into building these Ids by hand (since we're still not using a DB yet). Additionally, I don't care what the Ids are because they are only going to be used by machines, so they don't need to be fancy.

I think this solution calls for a Guid. In fact, RSS feeds explicitly have an element for it. Bingo.
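
In the generated RSS, each item then gets an element like this (the value below is just an example guid; isPermaLink="false" tells readers it is an opaque id rather than a URL):

    <guid isPermaLink="false">0f8fad5b-d9cb-469f-a165-70867728950e</guid>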

The next step was realizing that I didn't want to have another manual step in the process of releasing a new post. I don't want to generate a guid by hand each time. I also didn't want to store a list of guids mapped to blog posts in some external file.

My solution for now while we're still in hand-coding mode is to add a step to rssgen that will search for a guid in each post, and if it doesn't find one, it adds it.

That means that as I write a post, the post html contains this empty stub:

  • <div class='blog-post-guid'></div>

I updated rssgen to add the guids inside those stubs, and then re-save the index.html file during the build process. And if I ever want to resend an update out for an old post, I can simply update the guid.

This project is pretty interesting for me so far. These solutions are not the way I normally operate because I typically stand on the shoulders of giants and leverage frameworks that handle many of the details like this. My hope is that the readers learn a few things along the way, though I have a feeling that this project just may radically challenge many of my standard processes and assumptions about web development.

UPDATE September 14, 2012:

Ironically, I had to add more changes to keep from spamming the feed readers. The changes included keeping the date timestamps from changing, which would otherwise trigger a refetch/redisplay as well.


Post 10: The SEO Plan

View the code changes related to this post on github

As outlined in The Plan, a major goal for this site is to provide an inside view on creating a website from the ground up. A large part of that inside-out view is a visual history as well as full source code with revision history. While that captures the technical aspects of the site, there are other components that go into making a site.

I hinted at this in Post 4: For the Machines. Another major component of creating a site is the plan to go from simply being "out there on the Internet" to being easy to find from any major search engine. Many of the steps that need to be taken are lumped under the phrase Search Engine Optimization, or SEO for short.

This is probably the most overlooked part of developing a site. Next week I'm going to start a 4-part series on SEO, in collaboration with guest author Shawn Livengood who runs the blog ppcwithoutpity.com.

To prep for that series, it is prudent to start tracking some key SEO metrics as soon as possible. That way we'll be able to see how the changes affect those metrics over time.

I won't go into too much detail now about each of these metrics, but here they are, captured for historical purposes:


Post 9: IIS Static File Compression in web.config

View the code changes related to this post on github

Quick post here, while we're on the topic of saving a few bytes. I'm making 2 changes that will save some bandwidth:

  1. Removing the useless Response Header "X-Powered-By" that gets added by IIS
  2. Configuring text (html, css) and javascript (js, json) files to be gzip'd automatically when the browser can handle it (ex: Accept-Encoding: gzip)
Request details before web.config changes
Request details before web.config changes (inspected using Chrome Developer Tools)

First of all - the Response Header "X-Powered-By" is a waste of 20 incompressible bytes (they live in the header, and the older HTTP protocol does not support header compression the way the SPDY protocol does). From now on, every response sent from the webserver will be 20 bytes lighter. Sweet!

Secondly, we want to make sure that text files get compressed before they are sent to the client. Typically you can expect gzip'd files to be fully two-thirds smaller than their uncompressed brethren. In the case of this home page (index.html), the payload went from a hearty 39.67KB to a relatively svelte 12.83KB - a 67.7% savings!

Request details after web.config changes
Request details after web.config changes: gzip'd index.html payload is 67.7% smaller

Put this code in your web.config file and enjoy.

loading gist...
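
In case the gist doesn't load, the two changes boil down to a <customHeaders> removal and turning on static compression, roughly like this (a minimal sketch of the IIS 7+ settings involved, not the exact committed config):

    <configuration>
      <system.webServer>
        <!-- 1. Stop IIS from adding the X-Powered-By response header -->
        <httpProtocol>
          <customHeaders>
            <remove name="X-Powered-By" />
          </customHeaders>
        </httpProtocol>
        <!-- 2. Serve gzip'd static files when the client sends Accept-Encoding: gzip -->
        <urlCompression doStaticCompression="true" />
      </system.webServer>
    </configuration>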

Post 8: Automatic Image Compression

View the code changes related to this post on github

Alrighty, let's talk about bandwidth for a moment. Two and a half weeks ago, this site started as a single HTML page with no external file includes aside from a screenshot. The purpose of the screenshot was to capture a visual change history of the blog so that readers could easily see how the site changed with each post without having to get the code from github at a certain point in time and view the site locally.

These images are saved as PNG files, which is a lossless image format meaning that all of the original image data is still intact. Unlike JPEG files, PNG files won't mess with the fine details of your image in order to make the file size smaller. This is both good and bad.

PNG vs JPEG visual comparison. Source: jonmiles.co.uk
Image comparing PNG (left) vs JPEG (right) detail

The upside is that the screenshots look just like my screen did when I took them. The downside is that the files are bigger than a comparative jpeg file.

So there are two main actions that should be taken here:

  1. Reduce the filesize via compression utilities
  2. Create separate thumbnail images and reference those for the previews

For this post, I decided to tackle the compression issue first, even though creating thumbnails would have a bigger effect, simply because the utility that I wrote is more useful universally.

The utility I just created, called imgsqz (image squeeze for lack of creativity), has the following purpose:

  • Be executed as part of the "Build" process for this site
  • Batch process entire folders full of images to compress them
  • Not waste time trying to recompress images that have already been compressed (because compression can take a long time)

I've checked in the tool so that you can look through the source code, but effectively it just digs through a folder (and subfolders) looking for PNG files and trying to compress them. It keeps track of the files that it has compressed so that subsequent runs only work on new/changed files. Using it is fairly simple:

imgsqz.exe -s=c:\FolderWithImages

It's now part of the build script for this site, so all PNG images from now on will be optimized before they hit the Internet for consumption.

Here are the results for some of the files on this site:

File                               Original Size   New Size      % Saved
img/blog/screenshots/post-1.png    103 KB          86 KB         16.4%
img/blog/screenshots/post-2.png    90 KB           74 KB         17.7%
img/blog/screenshots/post-3.png    97 KB           68 KB         30.2%
img/blog/screenshots/post-4.png    88 KB           49 KB         44.8%
img/blog/screenshots/post-5.png    122 KB          104 KB        13.3%
img/blog/icons/icon-rss-32.png     1,659 bytes     1,571 bytes   5.3%
img/blog/logo/logo-1.png           5,025 bytes     3,181 bytes   36.7%

Each KB of savings is worth a tiny bit of load time and a tiny bit of bandwidth. Over time these savings will add up nicely.


Post 7: Links to GitHub

View the code changes related to this post on github

Quick post here - I just added links to each Post's main commit on github in the post header. Just click on the View the code changes related to this post on github icon to see what was changed.

The purpose is to make it easy to see what changed with each post, but it creates an interesting flow change for "releasing" a post: I need to write the post, commit the change, and then edit the post again to add the link to that new commit on GitHub.


Post 6: Traffic Analytics

View the code changes related to this post on github

Now that we've got a way for visitors to subscribe to the site to get notified when new posts happen, let's start capturing traffic stats using Quantcast.

After you sign up for a free/basic Quantcast account, you can "start quantifying" your traffic by entering your domain name and generating a snippet of html/js that will ping their servers each time someone loads up your page. Simply slap that down at the bottom of your page for now, and we can start getting a rough idea of traffic stats.

That snippet is polite because it does 2 things:

Simple, effective, and good enough for now.

Note: it will take a few days before the stats for this site show up on Quantcast.

loading gist...

Post 5: RSS, Atom, and Build Tools

View the code changes related to this post on github

Alrighty - creating a new blog these days practically assumes that readers will be provided with an RSS or Atom feed so that they can "subscribe" and get notified when new posts are made. I don't even follow blogs that lack such a basic service feature.

Ironically, in the interest of starting from absolute square 1, this blog lacked an RSS feed. That was one of the more challenging pieces that I had to get over mentally when considering doing a blog this way. Fortunately, the thought of losing potential followers was ultimately outweighed by the principle of the project.

That said, getting an RSS feed up ASAP was very important to me...even more important than other basics like choosing a database. This leads to an interesting chicken/egg problem however. How will I provide an RSS feed without a database? Am I going to copy/paste even more HTML into an XML file for a RSS feed? What about the Atom feed? Should that be another set of copying/pasting/formatting? Should I just buck up, pick a db, but not mention it until later?

Well, given that this site is a fairly unique project, I'm open to fairly unique solutions as we tread down this path. So what is the solution to the feed problem? Parsing.

Considering my disdain for copying and pasting, and the lack of a database to draw from, I wrote a little parser that would read the HTML from this site and produce RSS and Atom feeds automatically.

Running that tool now becomes the first step in the "build" process for this site. That means that the current process for making and releasing a change to this site is now:

While still far from ideal, this is better than the original process, and it highlights the important phases in the make/build/release process. In upcoming posts, we'll rely more heavily on these automation points and add many steps to them that handle many of the problems that still need to be solved.


Post 4: For the Machines

View the code changes related to this post on github

The purpose of this post is to assist the machines that will be "reading" the site.

There are a couple of simple things that we need to do:

  • Create a robots.txt file for communicating instructions with web crawlers
  • Add <meta> tags to help browsers and search engines

You'll notice that this site's robots.txt file is fairly empty. One interesting note is the line Disallow: /BadBotHoneyPot/

I'll go over that in the future, but for now, I'll just say that we'll use that as a trap to identify "bad" crawlers so that we can automatically block them should we choose.
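
For the curious, the whole file currently amounts to little more than this (paraphrased, not a byte-for-byte copy):

    User-agent: *
    Disallow: /BadBotHoneyPot/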

As for the <meta> tags:

  • <meta charset='utf-8'>
    • This meta tag needs to be near the top of the HTML file (before any text).
    • If it is not, or it is missing, then the browser will just try to figure out the character encoding on its own.
    • You will typically only run across the need to include this once you start dealing with localization of your site.
    • More info here
  • <meta http-equiv='X-UA-Compatible' content='IE=edge,chrome=1'>
    • This tag basically does two things.
    • #1: Tells Internet Explorer to use its most modern mode available (as opposed to IE trying to determine which mode it should run in for best compatibility)
    • #2: Tells browsers with Google Chrome Frame installed to render using GCF instead of their native renderer (Uncommon, mostly to help old slow browsers)
  • <meta name='author' content='Aaron Murray, [email protected]'>
    • Tell the crawler who authored this page
    • Tell any users who view your source how to contact you if needed
    • This is more uncommon for sites that have multiple contributors
  • <meta name='description' content='...'>
    • This is a spot where you can describe your site
    • Some search engines will use this text as preview content
    • Adds some SEO value to your site
  • <meta name='viewport' content='width=device-width'>
    • This tag is largely useful once you start wanting to have your site look nice on small handheld or giant surface devices
    • We'll dive deeper into this in the future, but we've got too many bigger fish to fry at the moment

Post 3: Basic Visual Cleanup

View the code changes related to this post on github

Alrighty, so we have a plan, we've got the code up on github for the world to see, and we have made a couple small changes to make publishing *slightly* less painful via some scripts to automate a couple of steps.

Despite my guts screaming out for functionality, something has to be done about the visuals around here. The look is way too 1994, and not in a cool 1994 sort of way.

I'm going to start attaching a screenshot of the site with each post so that in the future we can easily view the visual progress being made. This will be easier than checking out a snapshot from the github commit history and running the site locally. In the future this will be harder for folks to do once we get distributed and have various databases powering the content. Fun stuff!

Back to reality - let's spruce this place up. For all of the newbs out there, the best way to fancy up the visuals on a web page is to sprinkle a little CSS love around. Quickly you'll learn about one of the most loved (hated) aspects of web development: browser differences. Luckily there are resources out there to help. For now we won't get into the weeds, but let's just say that for nearly 2 decades we've been struggling with browser differences and there is no end in sight, but at least there is a lot of hope that things will get better over the next decade as older browsers start to die off.

Step 1: very basic UI/UX stuff:

  • Visually split up the site header from the posts
  • Split up the post from each other
  • Format certain types of text (like code)
  • Enable a way to link directly to a specific post
  • Add some content to the footer area of the page

Let's assume that we'll use <div> tags for HTML structure/grouping, and that text should be in either a <span> or a <p> tag.

Using css, we can add a style for blog posts that have a <span> tag with a css class of code. Matching tags will be rendered in a monospace font to make them look more computery. Easy.

  • <style> .blog-post-body span.code {font-family: monospace, Courier, Lucidatypewriter; } </style>

To enable linking directly to a post, we're going to use the old-school method of name attributes and # links. I know this is absurd in the days of permalinks and SEO friendliness, but we haven't written the permalink code yet, so we suffer for now. I'm doing this for your own benefit here folks - it's my PageRank that will suffer. When we fix this later, it'll also be a great time to talk about the joys of refactoring and brown-field upgrades.
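
In practice that just means dropping a named anchor into each post header and linking to it with a # fragment, along these lines (the post id is made up for illustration):

    <a name='post-3'></a>
    ...
    <a href='#post-3'>Post 3: Basic Visual Cleanup</a>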


Post 2: Deploying New Posts

View the code changes related to this post on github

Already there is pain. Currently, my brand new process when I want to write a new post is:

  • Open up my text editor of choice
  • Copy and paste the HTML for a post from my previous post
  • Edit the old HTML
  • Save and preview the file (testing testing testing)
  • Fix my bugs (remember to edit the timestamp, post tags, etc)
  • Commit the changes to github using TortoiseGit
  • Push the changes to github using TortoiseGit (Manually enter username and password each time in the prompt boxes)
  • Remote into the webserver hosting this site
  • Pull the changes from github to a local folder
  • Copy the files to the IIS folder hosting the site

Ugh. No fun already. I know we can do better than that. First up, let's automate a couple of those steps.

  • Install Git for Windows (how-to)
  • Select the option for "Run Git from the Windows Command Prompt" so that we can write scripts to do our work
  • Make sure that worked (open a Command Prompt, type in "git", and hit Enter)
  • Clone the github repo to a local directory. c:\code\git>git clone https://github.com/akmurray/aaronkmurray-blog c:\code\git\aaronkmurray-blog
  • Make a new Windows .bat file that will download latest code and copy to local website folder:
    • cd c:\code\git\aaronkmurray-blog
    • git pull https://github.com/akmurray/aaronkmurray-blog
    • xcopy /Y /E /R /V /I "c:\code\git\aaronkmurray-blog" "C:\inetpub\wwwroot\aaronkmurray"
    • REM pause

Now we can just run that file on the webserver and it'll fetch the latest source from github and push it to the website's folder.

Next up, let's take out the step of entering a username/password each time when doing the git push. Here is an article that describes the simple fix: create a batch file, run it, enter your username and password once, and it creates the file that TortoiseGit needs so that you don't have to enter those credentials manually anymore.

Far from perfect, but we're taking baby steps here ;)

Article on git fetch vs pull


Post 1: The Plan

View the code changes related to this post on github

This has been a long time coming.

I've been wanting to start blogging for about a year now, but have been struggling with figuring out where to start.

There have been tons of questions swirling around in my head, like:

As I pored over ideas and wrote down reams of notes, one thing was apparent: I was spending a lot of time and effort on the blog but not actually blogging.

Last night I couldn't sleep. The kid's toy keyboard was "cleaned up" in a way that a key was being pressed repeatedly whenever the air conditioning changed the pressure in the play room. I finally got up to address the situation, and then found myself unable to fall back to sleep.

After lying in bed for a couple of hours thinking about the blog, I finally got up to write down my ideas on paper so that I could actually fall asleep. This is common for me: the fear of forgetting an idea keeps me from drifting off into sleepytown.

This morning I got into work and checked on my RSS feed folder for .NET development. It's been a few days since I checked it, but this call-to-action from Scott Hanselman was the final straw. The answer is simply to start blogging now and sort out the details later.

So what's the plan? Well, the plan is:

  • Roll my own site/blog from the ground up
  • Hosted at my house in my server rack
  • Post about all of the changes and more importantly, show the intention behind them
  • Make this site publicly available on github so that everyone can see how it works and see it progress
  • Start with a single index.html file and advance it all the way through to a scalable, distributed system, from the ground up

Ultimately, I'd like to think that someone new to the web development craft will be able to use this site, the posts, and the source code history as a guide to seeing what goes on behind the scenes for how to create modern web software. I'm going to start at absolute step 1, and advance steadily through the growth of a real application.

I'd love it if you followed me on this journey. It'd be great if you could sign up for an RSS feed to get auto-notifications when I make new posts, but since this is just an index.html file at the moment, you can't. Looks like we've got a lot of work to do. Get ready for a fun ride.

For now, you can reach me on Twitter @aaronkmurray. Let me know what you think about this project and what features you'd like to see implemented. Here's a sample of what I have planned already:

  • Basics: CSS reset, JS framework, Cloud/CDN hosting
  • Javascript: organizing your JS files (custom and 3rd-party), creating your own framework patterns, debugging
  • CSS: responsive design, image bundling/sprites
  • Build/Deploy: tools for building, bundling, and deploying the site and assets
  • Performance: optimizing response time, payload, client-side caching, server-side caching, web-server and load balancer config
  • Debugging: client-side debugging, server-side debugging, error logging, pro-active notifications, self-healing services, system health dashboards and reports
  • Testing: js unit testing, server-side testing, integration testing
  • Database: choosing a DB, using an ORM (like Entity Framework), writing your own ORM
  • Marketing: strategy, considerations, tracking and mining your own site data
  • Social: integration with the Majors (Facebook, Twitter, Instagram, Pinterest, etc.), API usage, tracking
  • Advanced: client-side MVC, new JS paradigms and protocols for server communication, custom DBMS!

As you can see, that is a reasonably ambitious sampling of goals since we're starting with a single HTML page and no code. Let's get cracking!