PHP Script: Sitemap with the number of backlinks
Based on a CSV file exported from Google Webmaster Tools (GWT) and on the content of your site, this script builds a hierarchical map of the site with the number of backlinks for each page.
The value taken into account is the second value in the table provided by Google Webmaster Tools: the number of links from different sites.
The script writes this table to an HTML file, which lets you judge the popularity of the different types of pages on your site according to the number of links they receive.
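As an illustration of the counting step, here is a minimal Python sketch (not the script's own PHP code). It assumes the export has the page URL in the first column followed by two numeric columns, and that the "second value" is the column at index 2; the function name `read_backlinks`, the `value_column` parameter, and the sample header names are hypothetical.

```python
import csv
import io
from urllib.parse import urlsplit

def read_backlinks(csv_file, value_column=2):
    """Return {url_path: count} from a backlink export.

    Assumption: column 0 is the page URL and value_column holds the
    number of links from different sites. Adjust it to your export.
    """
    counts = {}
    for row in csv.reader(csv_file):
        if len(row) <= value_column:
            continue  # blank or short line
        try:
            count = int(row[value_column].replace(",", ""))
        except ValueError:
            continue  # header row or non-numeric cell
        path = urlsplit(row[0]).path or "/"
        counts[path] = counts.get(path, 0) + count
    return counts

# Hypothetical sample data; real header names may differ.
sample = io.StringIO(
    "Your pages,Links,Source domains\n"
    "http://www.example.com/,120,45\n"
    "http://www.example.com/docs/page.html,30,12\n"
)
backlinks = read_backlinks(sample)
print(backlinks)
```

Keeping only the path part of each URL makes the counts easy to match later against pages found on disk or in a sitemap.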
- In the current version, the script requires a local copy of the site. If you prefer to run the script on the server, it only requires that the site is static, i.e. that the pages are stored as HTML files.
- Alternatively, a site map in standard XML format.
- The PHP interpreter must be installed.
- Download the list of backlinks from your GWT account in CSV format.
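The two ways of listing the pages mentioned above (a local copy of static HTML files, or a standard XML sitemap) can be sketched in Python as follows. This is an illustration under stated assumptions, not the script's code: the function names `local_pages` and `sitemap_paths` are hypothetical, and only `.html` and `.htm` files are treated as pages.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def local_pages(site_root):
    """List HTML files under site_root as site-relative paths."""
    pages = []
    for folder, _dirs, files in os.walk(site_root):
        for name in files:
            if name.lower().endswith((".html", ".htm")):
                rel = os.path.relpath(os.path.join(folder, name), site_root)
                pages.append("/" + rel.replace(os.sep, "/"))
    return sorted(pages)

def sitemap_paths(xml_text):
    """List the path of every <loc> entry in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [urlsplit(loc.text.strip()).path or "/"
            for loc in root.iter(SITEMAP_NS + "loc") if loc.text]

# Demo on a throwaway directory and an inline sitemap.
with tempfile.TemporaryDirectory() as root_dir:
    os.makedirs(os.path.join(root_dir, "docs"))
    for rel in ("index.html", os.path.join("docs", "page.html"), "style.css"):
        with open(os.path.join(root_dir, rel), "w") as f:
            f.write("")
    from_disk = local_pages(root_dir)

sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://www.example.com/</loc></url>"
    "<url><loc>http://www.example.com/docs/page.html</loc></url>"
    "</urlset>"
)
from_map = sitemap_paths(sitemap)
print(from_disk, from_map)
```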
The script itself requires no installation. Once it is extracted from the archive, start it with the following command to build a map from the files:
To use a preexisting map, the command is:
You must also create a backcount.ini file that holds information about your site. It must contain two lines: a site or map line, and a csv line. For example, with all three kinds of line shown:
site=c:/example.com
map=sitemap.xml
csv=www-example-com_20140530_ExternalLinks_LinkedPages.csv
The site line gives the root directory where the pages are stored. The map line gives the local path of the sitemap. These two entries are alternatives: backcount uses only site and backmap uses only map.
The csv line gives the path and name of the CSV file. You may add several csv lines to compare results over time.
If you manage multiple sites, you can create an ini file for each.
When the program starts, it prompts you for the name of the ini file. You can enter the name, or press Enter directly to use the default file backcount.ini.
You can omit the .ini extension; the program adds it for you.
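The handling of the ini file described above can be sketched in Python as follows. This is only an illustration: it assumes one key=value pair per line, and the function names `resolve_ini_name` and `parse_settings` are hypothetical, not taken from the script.

```python
def resolve_ini_name(answer, default="backcount.ini"):
    """Turn the user's answer at the prompt into an ini file name."""
    name = answer.strip()
    if not name:
        return default              # plain Enter: use the default file
    if not name.lower().endswith(".ini"):
        name += ".ini"              # extension may be omitted
    return name

def parse_settings(text):
    """Read site, map and any number of csv entries from ini text."""
    settings = {"site": None, "map": None, "csv": []}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key, value = key.strip().lower(), value.strip()
        if key == "csv":
            settings["csv"].append(value)   # several csv lines allowed
        elif key in ("site", "map"):
            settings[key] = value
    return settings

example = """site=c:/example.com
csv=www-example-com_20140530_ExternalLinks_LinkedPages.csv
csv=www-example-com_20141023_ExternalLinks_LinkedPages.csv
"""
settings = parse_settings(example)
print(resolve_ini_name("mysite"), settings)
```

A hand-written parser is used here because Python's `configparser` expects a section header and, by default, rejects repeated keys such as several csv lines.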
The program then generates an HTML table in a file whose name is made from the first part of the name of the CSV file. The directories of your site are shown in bold, each followed by the list of pages it contains. The number of backlinks appears on the right of each line.
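A rough Python sketch of such a table is shown below. The exact markup produced by the script is not documented here, so the HTML layout is an assumption, as are the function names `build_table` and `output_name` and the idea that the "first part" of the CSV name is everything before the first underscore.

```python
import html
import posixpath
from collections import defaultdict

def build_table(pages, counts):
    """Group pages by directory; directories in bold, counts on the right."""
    by_dir = defaultdict(list)
    for path in pages:
        by_dir[posixpath.dirname(path) or "/"].append(path)
    rows = []
    for directory in sorted(by_dir):
        rows.append(f"<tr><td><b>{html.escape(directory)}</b></td><td></td></tr>")
        for path in sorted(by_dir[directory]):
            name = html.escape(posixpath.basename(path))
            # Pages absent from the CSV are listed with a count of 0.
            rows.append(f'<tr><td>{name}</td><td align="right">{counts.get(path, 0)}</td></tr>')
    return "<table>\n" + "\n".join(rows) + "\n</table>"

def output_name(csv_name):
    """Assumption: keep the part of the CSV name before the first underscore."""
    return csv_name.split("_", 1)[0] + ".html"

pages = ["/index.html", "/docs/a.html", "/docs/b.html"]
counts = {"/docs/a.html": 7}
table = build_table(pages, counts)
print(output_name("www-example-com_20140530_ExternalLinks_LinkedPages.csv"))
print(table)
```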
This table shows clearly which parts of the site are the most popular and which ones nobody cares about. It does so much more readably than the original list, which is ranked by number of backlinks and does not show the pages that have no backlinks.
These pages may then be de-indexed to improve your site's ranking in search results; see why in The Panda algorithm made simple.
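The two readings of the table described above (which pages receive no links, and which parts of the site receive the most) amount to a set difference and a per-directory sum. A minimal Python sketch, with hypothetical function names and sample data:

```python
import posixpath
from collections import Counter

def pages_without_backlinks(pages, counts):
    """Pages of the site that the backlink export does not mention."""
    return sorted(p for p in pages if counts.get(p, 0) == 0)

def totals_by_directory(counts):
    """Sum backlinks per directory, most linked directory first."""
    totals = Counter()
    for path, n in counts.items():
        totals[posixpath.dirname(path) or "/"] += n
    return totals.most_common()

site_pages = ["/index.html", "/docs/a.html", "/docs/b.html", "/old/c.html"]
link_counts = {"/index.html": 5, "/docs/a.html": 7, "/docs/b.html": 2}

orphans = pages_without_backlinks(site_pages, link_counts)
ranking = totals_by_directory(link_counts)
print(orphans, ranking)
```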
Since the August 2015 version of the script, it is no longer necessary to list the CSV files in the ini file, provided they are placed in the same directory as the backcount.php script.
Just indicate the common prefix of the CSV files in the ini file. For example, if your site is www.example.com, the common prefix will be www-example-com:
Then proceed as in the previous case.
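Selecting the files by prefix can be pictured with the following Python sketch. It is an assumption about how such a lookup might work, not the script's code: the function name `find_exports` is hypothetical, and the ordering relies on an eight-digit date between underscores, as in the file name shown earlier.

```python
import glob
import os
import re
import tempfile

def find_exports(directory, prefix):
    """Return CSV files starting with prefix, oldest first.

    Assumption: names embed a date as _YYYYMMDD_; files without
    one sort before the dated files.
    """
    pattern = os.path.join(glob.escape(directory), glob.escape(prefix) + "*.csv")
    found = []
    for path in glob.glob(pattern):
        match = re.search(r"_(\d{8})_", os.path.basename(path))
        found.append((match.group(1) if match else "", path))
    return [path for _date, path in sorted(found)]

# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as folder:
    for name in (
        "www-example-com_20150821_ExternalLinks_LinkedPages.csv",
        "www-example-com_20140530_ExternalLinks_LinkedPages.csv",
        "other-site_20150101_ExternalLinks_LinkedPages.csv",
    ):
        with open(os.path.join(folder, name), "w") as f:
            f.write("")
    names = [os.path.basename(p) for p in find_exports(folder, "www-example-com")]

print(names)
```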
The archive backcount.zip contains the source code in the Scriptol language and the executable code in PHP.
- August 21, 2015. New simplified version based on the common prefix. The name of the generated HTML file is now composed of the prefix and the full date.
- May 20, 2015. Updated for newer PHP versions.
- November 9, 2014. Links to directories were not counted. This is fixed.
- October 23, 2014. Adapted the source code to Scriptol 2. Multiple CSV files are now supported. The total number of backlinks is now displayed in the HTML page and in the console.
- June 2014. First version.
The CSV format is often used to exchange lists between different programs. The script above uses simple functions because it is limited to one well-defined file, but if you want to use this format in your programs with files from different origins, a specialized tool may be useful:
- CSVfix. Sorts a list alphabetically, searches for data, converts to XML or SQL, and compares two CSV files. A binary executable is available for Windows or Linux.
- Csvkit. Performs the same type of operations. Written in Python.
- OpenRefine. Unlike the previous tools, which run at the command line, it has a graphical interface. However, it works poorly on large files.