Google Can Now Index . . . Flash!
An Interview with Michael Marshall
by Robin Nobles
When I learned that Google can index the contents of Macromedia Flash sites, I was astonished. It seems that this remarkable discovery had gone virtually unnoticed in the SEO community. But as you probably know, Google has always been the first to index different types of content: PDF files, .doc files, etc. Google has also made amazing inroads in being able to index dynamic content. And now they're the first major search engine to index Flash - another significant step forward in the SEO industry. So why has Flash presented such problems in the past?
Background of Macromedia Flash and SEO:
With Flash as the main page of a site, the Web site owner is giving up the crucial text necessary to prove to the search engines that the main page is about a particular topic. Without that text, the search engines have nothing to index. Therefore, the main page rarely does well in the rankings, unless off-page factors such as link popularity or link reputation are sufficient to carry the page on their own. In the past, legitimate workarounds have been few and far between. This made things extremely difficult for businesses that wanted to create a rich user experience with a Flash home page, such as Web design firms, photography studios, and graphic design firms. These businesses often sacrificed rankings for the user experience, since they could rarely have both while still following all of the guidelines set forth by the search engines.
Introducing . . . Michael Marshall
When I learned from Michael Marshall, creator of ThemeMaster and chat/forum moderator for our online search engine marketing courses, that Google is indexing Flash, and when I heard about the fascinating discoveries he'd made, I immediately wanted to interview him for an article. So let's take a look at what Michael has discovered about Google and Flash.
Michael, how do we know that Google is now indexing the contents of Flash files? Is there a way that we can search the index just for Flash?
Yes. You can enter your search term in Google, and along with that search term, use the file type operator and restrict your search to the file extension “.swf”. This will search for your search term only in Macromedia Flash files. You should see [FLASH] just before each listing in the results page that is a Flash document. For example, put the following in the search box at Google: "Best Free Banner Exchange Market" filetype:swf
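The query above can also be built programmatically, for instance when checking many phrases. This is only a minimal sketch: the function name `flash_search_url` is hypothetical, and the `https://www.google.com/search?q=` URL form is an assumption about how the search box submits its query, not something stated in the interview.

```python
from urllib.parse import urlencode


def flash_search_url(phrase: str) -> str:
    """Build a search URL for an exact phrase restricted to .swf files.

    The base URL is an assumption; the query string itself mirrors the
    example given in the interview: "<phrase>" filetype:swf
    """
    query = f'"{phrase}" filetype:swf'
    # urlencode percent-encodes the quotes and colon and turns spaces into "+".
    return "https://www.google.com/search?" + urlencode({"q": query})


print(flash_search_url("Best Free Banner Exchange Market"))
```

Pasting the printed URL into a browser should be equivalent to typing the quoted phrase and the file type operator into the search box by hand.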
How can we extract the text found in a Flash file to see what Google sees?
Macromedia has a Flash Search Engine SDK (http://www.macromedia.com/software/flash/download/search_engine/) that will give us just what we need. The SDK (Software Development Kit) includes an application named "swf2html". Swf2html extracts text and links from a Macromedia Flash .SWF file, and returns the data to stdout or as an HTML document. Swf2html is provided as a compiled application and as a static library for linked library implementation. For complete functionality, see the file Readme.htm included in the SDK.
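Once swf2html has produced an HTML document, the text and links can be pulled out of it for review. The sketch below assumes the output has already been saved and read into a string; it does not invoke swf2html itself (see Readme.htm for the tool's actual usage). The class name `TextAndLinks`, the function `summarize`, and the sample markup are all hypothetical, and the real output format may differ.

```python
from html.parser import HTMLParser


class TextAndLinks(HTMLParser):
    """Collect visible text fragments and link targets from an HTML string."""

    def __init__(self):
        super().__init__()
        self.text = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Record the href of every anchor tag that has one.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Keep only non-empty text fragments.
        stripped = data.strip()
        if stripped:
            self.text.append(stripped)


def summarize(html: str):
    parser = TextAndLinks()
    parser.feed(html)
    parser.close()
    return parser.text, parser.links


# Hypothetical stand-in for a saved swf2html HTML document.
sample = (
    '<html><body><font color="#ffffff">Free Banner Exchange</font><br>'
    '<a href="http://www.example.com/">Join now</a></body></html>'
)
text, links = summarize(sample)
print(text)
print(links)
```

Because the parser ignores font colors, text that shows up white in the browser is still listed here, which makes the full extracted content easier to read.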
Do you have an example of a Flash file that we can see, as well as an example of the text that the Macromedia tool extracted from the Flash file?
Yes. I have an example of each. If you look at the extracted output in Web page form, you will see that it is not very pretty. Nevertheless, you've got lots of SEO-worthy content there, and that's what we are most concerned with. You should visit the Flash presentation itself, mouse over the text, and click the links in the presentation so you can be familiar with the Flash presentation. You can compare where certain text appears in the Flash presentation and where it is found in the extracted text.
Example of a Google indexed Flash file:
Example of Google extracted text:
(Note: This Flash example is based on one of Michael's own products. However, I chose to use it for two reasons: 1) because of the many different types of Flash involved; and, 2) because it is a text-heavy Flash example, as opposed to many other examples of Flash that I could have chosen to use.
Added Note: Be sure to highlight the entire page of extracted text by pressing Ctrl+A.)
In the output file, you'll notice that some text seems to be repeated on multiple lines and one portion of it even appears invisible since the font color comes out white. This is just a side effect of the conversion/extraction tool, not true invisible text or spamming, so you're doing nothing wrong when this happens.
But how do we know that's how Google sees it?
A simple test will show us how much of the text in a Flash presentation can be seen (or extracted) by Google. Perform an exact search (and use the file type operator) on some text which appears at the top of the html output from Macromedia's tool, and then perform a similar search for text that appears at the bottom. Try similar searches on text that appears in the middle as well if you really want to be sure. This is a good spot check to see what Google is grabbing from the Flash file. Since we can't know exactly what Google uses to read the Flash file, the Macromedia tool is only an approximation, and this spot check is always the best measure.
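The spot check described above can be sketched as a small helper that picks a phrase from the top, middle, and bottom of the extracted text and turns each into an exact-phrase query. The function name `spot_check_queries` and the six-word phrase length are assumptions made for illustration; the searches themselves would still be run by hand in Google.

```python
from typing import List


def spot_check_queries(extracted_text: str, words_per_phrase: int = 6) -> List[str]:
    """Return exact-phrase .swf queries for the top, middle, and bottom lines."""
    lines = [line.strip() for line in extracted_text.splitlines() if line.strip()]
    if not lines:
        return []
    picks = [lines[0], lines[len(lines) // 2], lines[-1]]
    queries = []
    for line in picks:
        # Use only the first few words so the exact match stays short.
        phrase = " ".join(line.split()[:words_per_phrase])
        query = f'"{phrase}" filetype:swf'
        if query not in queries:  # skip repeats when the text is very short
            queries.append(query)
    return queries


sample_text = "Top line here\nMiddle line\nBottom line"
for q in spot_check_queries(sample_text):
    print(q)
```

If all three queries return the Flash file, that suggests Google has read the file from top to bottom; a missing result points to where extraction may have stopped.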
How much of your Flash movie does Google see? In other words, how deep into the Flash file does the spider go?
In my experience testing the Macromedia tool, I have found that Google sees all the text that the tool can extract including all links . . . everything from top to bottom.
You mentioned that when certain types of motion in a Flash movie are associated with text, the resulting extracted output will contain duplicated occurrences of that text. Those techies among us will know what that means, but for those non-techies (like me), does this mean that we need to be careful about using certain types of animation, because it could result in duplicate content, therefore creating the possibility of spam or problems with our SEO efforts?
Yes. The type of animation you apply to text in your Flash presentation has an impact on how that text gets extracted. You wouldn't want your keyword density or theme focus to get thrown off by mistake due to applying the wrong type of animation to certain text.
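To see whether an animation has duplicated text in the extracted output, the repeated lines can be counted and a rough keyword density computed. This is a sketch under stated assumptions: `duplicated_lines` and `keyword_density` are hypothetical helpers, density is measured here simply as occurrences of a single word divided by total words, and nothing in it reflects how Google itself weighs repeated text.

```python
import re
from collections import Counter


def duplicated_lines(extracted_text):
    """Map each line that occurs more than once to its number of occurrences."""
    counts = Counter(
        line.strip() for line in extracted_text.splitlines() if line.strip()
    )
    return {line: n for line, n in counts.items() if n > 1}


def keyword_density(extracted_text, keyword):
    """Share of all words in the text that equal the given single-word keyword."""
    words = re.findall(r"[a-z0-9']+", extracted_text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


sample_text = "Free banner exchange\nFree banner exchange\nJoin today"
print(duplicated_lines(sample_text))
print(keyword_density(sample_text, "banner"))
```

Running the check before and after changing an animation shows whether the change added or removed repeated lines.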
The source code of the HTML output extracted from your Flash file (see the bottom of this page: http://www.internet-marketing-analysts.com/Google-Flash_tutorial/) has no title tag. What text does Google pull as the title in the search results?
In my experience, I have found that the first line of text in the extracted output gets used by Google as the title tag in the search results. You may want to use swf2html and spot check and modify your Flash presentation until you get the desired result. In addition, the description in the search results is created dynamically (according to the user's query) from snippets of text inside the Flash presentation as extracted by Google.
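Going by Michael's observation that the first line of extracted text becomes the title, a quick check of what that line is can be sketched as follows. The helper name `predicted_title` is hypothetical, and the result is only a guess based on that observation, not documented Google behavior.

```python
def predicted_title(extracted_text: str) -> str:
    """Return the first non-empty line of the extracted text, or "" if none."""
    for line in extracted_text.splitlines():
        if line.strip():
            return line.strip()
    return ""


sample_text = "\n  Best Free Banner Exchange Market\nJoin today"
print(predicted_title(sample_text))
```

If the line printed is not the title you want, the Flash presentation would need to be rearranged so that the desired text is extracted first.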
Do you have any other tips for optimizing Flash files?
Yes. I would recommend that people read my more technical tutorial for more details on optimizing Flash files (see below). One thing I would add is the problem that might be encountered by Flash presentations which use dynamic content pulled from a database, xml file, etc. based on user input. Such content is not part of the Flash file itself and, therefore, will not be indexable by Google.
What about Flash banners? Will Google also index the contents of Flash banners?
Yes. Any Flash presentation, whether full-page or banner size, can be indexed by Google. I have found many instances of both.
For the More Technical SEOs . . .
Michael created a page with a more technical explanation of many of these concepts at the following URL. The page also lists the source code of the HTML output extracted from his Flash presentation.
A Word of Caution . . .
Whatever you do, don't try to hide text in any manner through a Flash presentation. Since it takes so much more effort to hide text in a Flash file, doing so would be construed as a more deliberate attempt to deceive a search engine, so it would be a much more serious offense. Remember, hiding anything, whether text, links, etc., is considered spam by Google. As I tell all of my students, when you go to sleep at night, it's a wonderful feeling to be able to wake up in the morning knowing your pages are right where you left them because you know you've done nothing wrong. As a very good friend of mine, Ginette Degner, once said, it's much better to be in the rankings for the long haul. Spamming isn't worth it.
Once again, Google comes out ahead by being able to index the contents of a Flash file. This amazing bit of news should make SEOs everywhere extremely happy, since they'll be able to use and optimize Flash files as the main page on a site. Just remember: as with any other SEO strategy, be above board and follow Google's Webmaster Guidelines (http://www.google.com/webmasters/guidelines.html).
Robin Nobles is the Co-Director of Training of Search Engine Workshops with John Alexander. They teach 2-day beginner, 3-day advanced, and 5-day "hands on" search engine marketing workshops in locations across the globe. She also teaches online search engine marketing courses through http://www.onlinewebtraining.com, and she’s a member of Wordtracker’s official question support team. With partner John Alexander, she's co-authored a series of e-books called "The Totally Non-Technical Guides to Having a Successful Web Site." They also opened a networking community for search engine marketers called The Workshop Resource Center for Search Engine Marketers.
Michael Marshall is CEO of Internet Marketing Analysts, LLC. He is an artificial intelligence (AI) software developer, Web programmer, and certified search engine marketing strategist, and holds degrees in Philosophy, Linguistics and Theology. He is the author of the e-book “Checkmating the Search Engines” and a contributor to “Building Your Business with Google for Dummies” by Brad Hill (Wiley Publishing).
Additional Commentary from Ralph Paglia:
Although many SEO professionals are now aware that Google does indeed index the text that appears within Flash based web sites, a large number of them still advise dealers against using Flash sites. I personally (my opinion only) believe that the root cause of this guidance is how difficult it is for most SEO service providers to manage the text content within web pages built in Flash. Usually, an SEO service provider must request that changes to a Flash web page be made by the developers of that site's pages. This can prove problematic in that many site developers either resist making the requested changes at all or place them at the bottom of the priority list. Compare this to the ease of making changes through the Content Management System (CMS) normally deployed with an HTML based web site. As a result, despite the fact that Google will index the textual content within Flash based web pages, until the developers of these pages make it easier for SEO service providers to make ongoing periodic changes to the text, I do not see these same providers recommending Flash based sites. Unfortunately, this is not always in the best interest of the dealership. When the use of highly interactive, engaging Flash based web site content results in a dramatic increase in that dealer's conversion rate of visitors into leads and phone calls, and in longer visits and a higher degree of engagement, the convenience factor to SEO service providers is overshadowed by the positive results of increased consumer engagement with the dealership.
Either way, there is no question that Flash based web sites pose unique challenges for both search engines (who want to index their content effectively) and for Search Engine Optimization Service Providers trying to work with them. However, I am one of the few car guys who have had their wishes come true with SEO, and can honestly say "Be careful what you wish for". Having experienced SEO nirvana, I can share with all those reading this blog that having targeted local traffic driven to a highly engaging site that car buyers enjoy using, and which helps people actually BUY CARS is a heck of a lot more financially and emotionally rewarding than an SEO monster that gets 100,000 unique visitors a month from everywhere except the town where you sell cars!
More content added after my friend Shaun Raines wrote to me:
I appreciate the information shared by those that advise against Flash based web sites as if they have some form of web based leprosy, and I agree with most of the advice given against Flash on a technical SEO service provider level… and I was previously aligned with this perspective until about 24 months ago. After my first few months at Courtesy in PHX, I signed up for a Reynolds Web Solutions web site because of how overwhelmingly SEO friendly they really were at that time (Steve Crim is my witness), and then I signed up for a GM PowerShift web site from The Cobalt Group... and then I developed multiple Fresh Start Studio supplied web sites after teaching Dave Jackson about the micro site concept and how to build XML-ADF based forms…
When I first got to Phoenix, I absolutely hated the BZ Sites that I was stuck with because of all I had been taught about Flash based sites while working at Reynolds and Reynolds... Then, I started noticing what was happening with my traffic to each of my various sites. Think about it… Part of what inspired me to start a network of micro sites was my dislike of the Flash based BZ web sites! Well, I soon learned that the GM PowerShift site from Cobalt was so bad it was a total waste of money for me to drive traffic into it. My conversion rates were less than 2% and the time people spent on the site was averaging less than 3 minutes. The HTML based sites that my (former) friend, Dave Jackson built to my specifications, supervision and approval, and based on what I would mock up in Word, were attracting massive amounts of organic traffic, despite our amateurish SEO management of their content.
And that is when the whole Flash based web site thing really started to come into focus…
The HTML micro sites were generating huge traffic, but very few conversions. Then I started noticing that a good chunk of the traffic coming into the BZ hosted Flash based web site at www.Chevrolet-USA.com was being referred into the site by the network of micro sites (see the links at the bottom of each page) and subsequently converting within the Flash site at 10 times the conversion rate of the HTML micro sites. I then invested months and months of time and effort into trying to better optimize the micro sites for improved conversion… Nothing worked as well as when they got to the “sizzle” of the Flash based site after clicking on one of the links in the micro sites, or even from the GM endorsed Cobalt site. This is why SEM service providers generally try to avoid having links to the dealer's primary web site within their landing pages, because THEY want the credit for the lead generated from their SEM campaigns… I was not constrained by this thinking; I just wanted to maximize total visitor conversions into phone calls, leads and showroom traffic. Heck, at one time I even convinced Stuart Lloyd, Ray Meyers and Tim Clay to put links on our ClickMotive SEM landing pages, which they later converted into their own inventory displays (see www.ChevyArizona.com).
The www.2008ChevyCamaro.com micro site was the real brain damage learning experience of all… No problem with organic traffic; nothing I had ever done had achieved the SEO results that this micro site was producing! It was getting well over 20,000 UVs month after month after month with no ad spend. The conversions were even OK at over 500 leads a month on average and a peak of 1,800 leads in one month… But, one small problem… No sales! Most of the traffic was organic (a problem) and from all over the country.
However, when I looked into the Omniture SiteCatalyst metrics software attached to my Flash based primary site from BZ Results, one of the leading referring URLs for web forms completed (leads) was the Camaro micro site! (Surprise, surprise.) Then when I tracked it another step closer, I found that these Camaro REFERRALS into the Flash based site were selecting vehicles and completing Quote Requests, Trade-In Value Requests and other forms... And these were the local buyers, many of whom actually bought a car. The leads that were coming in directly from the Camaro micro site were overwhelmingly from outside the state due to the organic ranking attracting traffic from outside our market.
As time went on, I learned that Organic optimization is nice, but comes with some drawbacks if placed at a higher priority than a site design’s level of engagement with the consumer. At one point I FORBADE BZ from doing ANY organic SEO for Courtesy, and my Flash based web sites were still getting indexed. That is when I realized that Flash sites in and of themselves are not necessarily bad; it is only when a Flash based site lacks TEXT BASED INFORMATION that can be indexed that it is not any good. The bottom line is that Google WANTS to find the right site for whoever is searching for it, and whether it is HTML, Flash or text embedded into images, Google has and will continue to find ways to index that text based information because their REAL customer is the person searching… Rich, robust and interesting information content that is relevant to what people are searching for is weighted as being of greater importance than the methods used to serve it up.
In regards to the above interview posting, it is very old news and I was simply trying to source 3rd party information supporting what I learned way before I ever saw the interview… Today being the first time I ever saw it, after doing a Google search (irony?)
One last thing to consider… More OEM sites are being rebuilt into various Macromedia (Flash) based technologies than ever before… After spending a lot of money on research and consulting services, the OEMs are realizing that CONSUMER ENGAGEMENT trumps SEO considerations, and may actually enhance SEO when Google generates Quality Scores based on how consumers behave at the sites after clicking on a search result.
Every time I visit the Googleplex we talk about indexing and quality score… Google indexes more than what most people realize.
Edited and Re-Posted by:
Director - Digital Marketing
OEM & National Accounts ADP Dealer Services
505.301.6369 Cell | Ralph_Paglia@ADP.com