Unredir

Script to replace redirected URLs in web pages.
Runs from the command line on a static site, preferably on a local copy that will then be put online.

Required

Requires PHP 7.
Curl must be enabled in the php.ini configuration file.

This script scans the pages of your website, tests each URL, and when redirected, replaces the URL with the new address.

This is suitable for sites that switch from HTTP to HTTPS: it updates the links to the site itself as well as the links to all other sites.

It also displays broken links, so for static sites it can replace a link testing tool such as Link Checker on this site.

The code

The program uses PHP's DOMDocument class to find links in <a> tags or images. It also uses the file_get_contents function to load the file as plain text.
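
The link discovery step could look roughly like the sketch below. The function name extract_urls and the choice to keep only absolute http/https URLs are assumptions made for illustration, not taken from the script itself. In practice the $html string would come from file_get_contents.

```php
<?php
// Hypothetical helper: collect the URLs found in <a href> and <img src> attributes.
function extract_urls(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them:
    // real-world HTML is rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach (['a' => 'href', 'img' => 'src'] as $tag => $attribute) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            $value = $node->getAttribute($attribute);
            // Assumption: only absolute HTTP(S) URLs are worth testing with Curl.
            if (preg_match('#^https?://#i', $value)) {
                $urls[] = $value;
            }
        }
    }
    // Remove duplicates and reindex from 0.
    return array_values(array_unique($urls));
}

$html = '<p><a href="http://example.com/a">A</a> '
      . '<img src="https://example.com/i.png"> '
      . '<a href="#top">Top</a></p>';
print_r(extract_urls($html));
```

Here DOMDocument is used only to read the attributes. The document object is never saved, which matches the approach described in this section.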

A routine calls Curl to test whether a link is redirected and, if so, to find the final redirect address.

Redirected URLs are replaced in the plain text with the str_replace function (not with setAttribute). The contents are then saved with file_put_contents.
Using these functions avoids the saveHTMLFile method, which tries to reconstruct the HTML content before saving the file. That method adds tags which may already exist in an included PHP file.
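
A minimal sketch of this text-level replacement follows. The helper name replace_urls is hypothetical, and matching the URL together with its surrounding quotes is an added precaution assumed here, not something the original script is known to do.

```php
<?php
// Hypothetical helper: replace URLs in the raw page text and touch nothing else.
// $replacements maps each redirected URL to its final address.
function replace_urls(string $content, array $replacements): string
{
    foreach ($replacements as $old => $new) {
        // Assumption: match the URL with its quotes so that replacing
        // "http://example.com/a" does not also alter "http://example.com/ab".
        $content = str_replace(
            ['"' . $old . '"', "'" . $old . "'"],
            ['"' . $new . '"', "'" . $new . "'"],
            $content
        );
    }
    return $content;
}

// A page fragment without <html> or <body> tags, as found in an included file.
$page = '<a href="http://example.com/a">A</a> <a href="http://example.com/ab">AB</a>';
$updated = replace_urls($page, ['http://example.com/a' => 'https://example.com/a']);
echo $updated, PHP_EOL;
// To write the result back to disk: file_put_contents($path, $updated);
```

Because the fragment is handled as a plain string, no html or body tags are added around it, which is the effect that saveHTMLFile would have.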

PHP code to check for redirection:

// Returns the final address when $url answers with a permanent redirect (HTTP 301)
// and the target answers 200. Returns "" in every other case.
function redirected($url)
{
   $hcurl = curl_init();

   curl_setopt($hcurl, CURLOPT_CONNECTTIMEOUT, 300);
   curl_setopt($hcurl, CURLOPT_RETURNTRANSFER, true);
   curl_setopt($hcurl, CURLOPT_VERBOSE, false);
   curl_setopt($hcurl, CURLOPT_URL, $url);
   curl_setopt($hcurl, CURLOPT_HEADER, true);
   curl_setopt($hcurl, CURLOPT_NOBODY, true);           // request the headers only
   curl_setopt($hcurl, CURLOPT_FOLLOWLOCATION, false);  // first pass: do not follow redirects
   curl_setopt($hcurl, CURLOPT_SSL_VERIFYPEER, false);  // certificates are not verified
   curl_exec($hcurl);
   $code = curl_getinfo($hcurl, CURLINFO_HTTP_CODE);

   // Anything other than a 301 is left unchanged.
   if ($code != 301)
   {
      curl_close($hcurl);
      return "";
   }

   // Second pass: follow the redirect chain to its end.
   curl_setopt($hcurl, CURLOPT_FOLLOWLOCATION, true);
   curl_exec($hcurl);
   $newurl = curl_getinfo($hcurl, CURLINFO_EFFECTIVE_URL);
   $code = curl_getinfo($hcurl, CURLINFO_HTTP_CODE);

   curl_close($hcurl);

   // Keep the new address only if the final page actually loads.
   if ($code != 200)
   {
      return "";
   }
   return $newurl;
}
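
As a rough illustration of how the return value of redirected() can be used, the sketch below builds a map from old to new addresses. The name build_replacements and the stub checker are hypothetical. The stub stands in for redirected() so that the example runs without network access.

```php
<?php
// Hypothetical helper: build the old => new map used for the replacement step.
// $check is any callable with the same contract as redirected():
// it returns the final address, or "" when the URL should be left alone.
function build_replacements(array $urls, callable $check): array
{
    $map = [];
    foreach ($urls as $url) {
        $newurl = $check($url);
        if ($newurl !== '' && $newurl !== $url) {
            $map[$url] = $newurl;
        }
    }
    return $map;
}

// Stub standing in for redirected(); it pretends one URL moved to HTTPS.
$stub = function (string $url): string {
    return $url === 'http://example.com/' ? 'https://example.com/' : '';
};

$map = build_replacements(['http://example.com/', 'https://example.org/'], $stub);
print_r($map);
```

With the real function, the call would be build_replacements($urls, 'redirected').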

Manual

Open the command line console and go to the directory containing the pages of the site you want to update. Type:

php c:/unredir/unredir.php [options]

Replace the directory above with the one where you installed unredir.

Two options are possible:

-t: Test the result without changing the files.

-v: Verbose, display all scanned pages.
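
For illustration only, the two flags could be read from the argument list as in the sketch below. The function parse_options and the array keys are hypothetical, and the real script may process its arguments differently.

```php
<?php
// Hypothetical option parsing; the real script may read its arguments differently.
function parse_options(array $args): array
{
    // Drop the script name and keep only the arguments that follow it.
    $flags = array_slice($args, 1);
    return [
        'test'    => in_array('-t', $flags, true),  // report only, write nothing
        'verbose' => in_array('-v', $flags, true),  // display every scanned page
    ];
}

// In a real run the argument would be $argv.
$options = parse_options(['unredir.php', '-t']);
var_dump($options);
```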

See also ...

From HTTP to HTTPS. This script replaces links from http to https for a given domain. It complements Unredir in that it also changes the links that appear in the text, but it only handles redirections for the specified domain.