Joomla: images not indexed? Does Google's crawler display your site incorrectly?

| Gianluca Gabella | Joomla!
It's probably all the fault of your... robots.txt!

Foreword: What is 'robots.txt'?

It is a small text file that Joomla places in the root of your site (the main folder, which contains the administrator folder, the modules folder, the templates folder, etc.).

Its function (explained in detail here) is to guide crawlers, i.e. the automated programmes that browse the web and index it, just like Google's famous crawler, which visits thousands of sites every day and indexes them one by one.

What exactly does it do?

Put simply, robots.txt is a list of folders that crawlers must not enter, so the contents of those folders will not be indexed.

For example, Joomla's robots.txt looks like this:

# If the Joomla site is installed within a folder such as at
# e.g. the robots.txt file MUST be
# moved to the site root at e.g.
# AND the joomla folder name MUST be prefixed to the disallowed
# path, e.g. the Disallow rule for the /administrator/ folder
# MUST be changed to read Disallow: /joomla/administrator/
# For more information about the robots.txt standard, see:
# For syntax checking, see:

User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /logs/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/

As you can easily guess, this robots.txt tells the crawler not to enter the system folders, and rightly so.

So? Where's the problem?

The problem is that there are folders in that list that MUST be indexed instead! One of them is the /images/ folder.

This robots.txt file prevents Google from indexing any of the images on your site, and having your images indexed is VERY IMPORTANT!

In fact, if we leave the robots file as it is and search the Internet for an image present in one of our galleries or in one of our articles, we would never find it, because we ourselves have prevented it from ending up in the crawlers' index.
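To see the effect for yourself, here is a minimal sketch that uses Python's standard `urllib.robotparser` module. The rules are a shortened copy of the file above, and the helper name `is_allowed` and the two sample paths are just assumptions made up for the illustration. Real crawlers may interpret the rules slightly differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the robots.txt shown above.
DEFAULT_RULES = """\
User-agent: *
Disallow: /administrator/
Disallow: /images/
Disallow: /templates/
"""


def is_allowed(robots_txt: str, path: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may crawl `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Hypothetical paths, used only for the demonstration.
print(is_allowed(DEFAULT_RULES, "/images/gallery/photo.jpg"))  # False
print(is_allowed(DEFAULT_RULES, "/index.php"))                 # True
```

The picture in the gallery is off limits to the crawler, while the normal pages are not.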

Tragedy!!! How do I solve it!?

The answer, fortunately, is extremely simple:

  1. Go to the root of your site with an FTP programme, such as FileZilla
  2. Download the robots.txt file to your desktop
  3. Open it with any text editor (even notepad)
  4. Delete the line "Disallow: /images/".
  5. Optional: if you have placed important images directly in the template, you must also delete the line "Disallow: /templates/".
  6. Save your new robots.txt
  7. Send it back to your server with the FTP programme

Et voilà! Our images can now be indexed by Google.
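If you prefer not to edit the file by hand, the deletion step can be sketched in a few lines of Python. This is only an illustration: the function name `unblock` is invented here, and downloading and uploading the file via FTP remains a manual step.

```python
def unblock(robots_txt: str, folders=("/images/",)) -> str:
    """Return the rules without the Disallow lines for the given folders."""
    kept = []
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        is_target = key.strip().lower() == "disallow" and value.strip() in folders
        if not is_target:
            kept.append(line)  # comments, blank lines and other rules stay
    return "\n".join(kept) + "\n"


original = (
    "User-agent: *\n"
    "Disallow: /administrator/\n"
    "Disallow: /images/\n"
    "Disallow: /templates/\n"
)

print(unblock(original))
```

Pass `folders=("/images/", "/templates/")` if you also want to apply the optional step for images stored in the template.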

Done! Any other advice before closing this very useful guide?

Yes, in the robots.txt you can also indicate the XML sitemap of your site! And this helps a lot in terms of SEO and indexing.

How can you create a good sitemap for your site? Simple: install the very useful Joomla component OSMap (for veterans, it is a fork of the famous XMap, which is no longer supported). It will automatically create an XML sitemap for you and tell you the web address to put in your robots.txt.
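The sitemap is declared with a `Sitemap:` line in robots.txt. Below is a rough sketch of adding one and reading it back with Python's standard parser (`site_maps()` needs Python 3.8 or later). The address `https://www.example.com/sitemap.xml` is a placeholder and not the address OSMap will give you, and `add_sitemap` is a name invented for this example.

```python
from urllib.robotparser import RobotFileParser

# Placeholder address: use the one OSMap shows you instead.
SITEMAP_URL = "https://www.example.com/sitemap.xml"


def add_sitemap(robots_txt: str, sitemap_url: str) -> str:
    """Append a Sitemap line unless the same line is already present."""
    line = f"Sitemap: {sitemap_url}"
    if line in robots_txt.splitlines():
        return robots_txt
    return robots_txt.rstrip("\n") + "\n\n" + line + "\n"


rules = "User-agent: *\nDisallow: /administrator/\n"
updated = add_sitemap(rules, SITEMAP_URL)

# Read the file back to confirm the sitemap is picked up.
parser = RobotFileParser()
parser.parse(updated.splitlines())
print(parser.site_maps())  # ['https://www.example.com/sitemap.xml']
```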

OK, sitemap added too... how can I check that my robots.txt is OK?

Two ways:

  1. Use Google's robots.txt tester, which you can find here.
  2. Use Google's mobile device compatibility test, which you can find here. After the analysis, at the bottom, you will get a 'screenshot' of what the Google crawler sees on your site. If there is only text and no images, your images are not being indexed. If it shows the images correctly, you have set up your robots.txt correctly.

Robots.txt misconfigured:

Robots.txt configured well:


