What's new in this version:
- Important update for Japanese users: better detection and proper implementation of EUC-JP character encoding
- Implements Page Inspector which opens rather than the Link Inspector where appropriate (eg SEO and Sitemap tables, top level of 'by page' view, or from within Link Inspector if the link url's target is a page.)
- Adds 'detailed diagnostic window for starting url', available under View menu
- Correctly handles response header field 'Refresh:', performing the refresh if the number of seconds is < 30
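As a rough illustration of the 'Refresh:' rule above, here is a minimal sketch in Python. It is not Integrity's code; the function names and the parsing regex are assumptions, and only the 30 second threshold comes from the note.

```python
import re
from typing import Optional, Tuple

MAX_REFRESH_SECONDS = 30  # threshold taken from the note above

# Hypothetical pattern: "<seconds>" optionally followed by ";" or "," and a url.
_REFRESH = re.compile(r"\s*(\d+(?:\.\d+)?)\s*(?:[;,]\s*(?:url\s*=\s*)?(.*))?$",
                      re.IGNORECASE)

def parse_refresh(value: str) -> Optional[Tuple[float, Optional[str]]]:
    """Split a Refresh header value into (seconds, url or None)."""
    match = _REFRESH.match(value)
    if not match:
        return None
    seconds = float(match.group(1))
    # Strip whitespace and optional surrounding quotes from the url part.
    url = (match.group(2) or "").strip().strip("'\"") or None
    return seconds, url

def should_follow_refresh(value: str) -> bool:
    """True when the refresh is well-formed and fires in under 30 seconds."""
    parsed = parse_refresh(value)
    return parsed is not None and parsed[0] < MAX_REFRESH_SECONDS
```

A value such as `5; url=https://example.com/next` would be followed under this sketch, while a value of `60` would not.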
- Handles http redirect where the redirect url is empty
- Handles http redirect to a mailto: tel: etc
- Adds experimental preference for the Connection: header field (keep-alive or close).
- (Integrity Pro) Adds sitemap visualisation functionality
- Adds number of pages scanned to the main scanning status bar
- Small fix to soft 404 check: previously it was necessary to leave the 'terms' field in order to save any changes. Now simply typing is enough.
- Minor fix to the linksLimit field
- Fixes to timestamp functionality - now the request date/time shows in the Link Inspector
- If 'bad links only' was selected when application quit, on restart the application would be filtering 'bad links only' but the button would work in reverse, ie show all when depressed. Now fixed.
- Adds ability to fully edit items in rules table (double-click to edit)
- Fixes the meta data option being incorrectly disabled when the querystring option was switched on.
- Fixes problem causing some links to be reported with no status, also resulting in "X of Y links checked" where X is less than Y when finished. This could have happened if Integrity first receives a link to a page which is different to its canonical url and happens to try to crawl that page before discovering that canonical url elsewhere.
- Stops data: urls within inline styles from being reported
- (Integrity Pro) adds 'Deep content' to the list of SEO filters (Threshold can be set in Preferences, default is 6 clicks from home.)
- Fixes a couple of problems with sitemap rules for setting priority/change frequency
- Fixes a problem which may have caused extra urls to appear in the sitemap / SEO table. This happened when internal links on the site redirect to a different URL (which isn't ideal anyway): the original link url was being added to the sitemap rather than the final destination url.
- Adds context menu to SEO table view with some useful functions
- switches off some debug messages which would have affected performance of crawl when archive feature switched on
- cleans up some warnings. some data: image srcs were being incorrectly included in the warnings, marked as having no alt text. Also cleans up the formatting, a spurious number was seen following the line number.
- Fixes sorting by status in link URL, By Page and All Links views
- Fixes problem some have experienced with larger sites: "The operation couldn't be completed. (NSPOSIXErrorDomain error 24 - Too many open files)"
- Fixes 'soft 404' check
- Adds an option to soft 404 check, allows you to limit the check to internal pages only (for the best results if set up properly), external only (can produce many false positives and false negatives due to the nature of soft 404s) or both.
- Fixes bug, related to recurring redirect, that could have caused a hang or crash
- Recurring redirects now correctly show as bad links
- Fixes bug that wouldn't have caused a noticeable problem but because it caused external urls to fully load unnecessarily, the crawl may now be noticeably faster
- Defines 'recurring redirect' to be more than 12 redirects. (Greater than 3 already signalled 'redirect chain' in SEO results.)
- Fixes bug which could cause 'running on' after crawl has apparently finished. Or in some cases a crash at that point. (related to meta refresh repeatedly redirecting)
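The redirect thresholds above can be sketched as follows. This is only an illustration under assumptions: the function names, the labels returned and the dictionary standing in for real HTTP responses are all hypothetical. Only the numbers 3 and 12 come from the notes.

```python
from typing import Dict, Tuple

REDIRECT_CHAIN_THRESHOLD = 3       # more than 3 signals 'redirect chain'
RECURRING_REDIRECT_THRESHOLD = 12  # more than 12 is a 'recurring redirect'

def classify_redirects(count: int) -> str:
    """Label a link by the number of redirects that were followed."""
    if count > RECURRING_REDIRECT_THRESHOLD:
        return "recurring redirect"
    if count > REDIRECT_CHAIN_THRESHOLD:
        return "redirect chain"
    if count > 0:
        return "redirect"
    return "no redirect"

def follow_redirects(start: str, redirects: Dict[str, str],
                     limit: int = RECURRING_REDIRECT_THRESHOLD) -> Tuple[str, int]:
    """Follow a url through a redirect map, giving up once the limit is exceeded.

    `redirects` is a stand-in for real responses: url -> redirect target.
    """
    url, hops = start, 0
    while url in redirects and hops <= limit:
        url = redirects[url]
        hops += 1
    return url, hops
```

With a loop such as `{"a": "b", "b": "a"}` the walk stops after 13 hops instead of running on, and the link is classified as a recurring redirect.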
- Fixes bug that could cause pages to not be fully scanned under certain circumstances
- 'empty quote' / placeholder flagging
- Fixes problems in 'by page' view after clicking header to sort by url
- Fix that prevents a possible (but very unlikely) crash
- Better handling of scan finishing, less chance of appearing to stick near or at the end of the scan
- Handles HTTP Basic authentication protection space (as defined in RFC7617)
- Alters default setting for request header field 'Connection' from 'close' to 'keep-alive'. Examples seen where 'close' causes 'Network connection lost' status.
- Other small improvements
- Fixes crash which may have been experienced when exporting links and choosing 'by link'
- Improves certificate authentication for client-certificate protection space - previously the scan may have appeared to hang at the end because of this
- Corrects an issue with the timeout field, which may have been incorrectly set to a low value when creating a new website config, perhaps resulting in some unexpected timeouts
- Slight change in behaviour. After pausing, views are populated, so partial crawl can be examined. Or work can start on a bit of fixing after a pause/continue.
- 'Stop at X links' and 'Crawl maximum X clicks from home' are moved from Preferences to site-specific settings (Main window, Rules tab). This is also a fix as these settings weren't working properly in v12.0. This is a useful way to limit the crawl if it's not possible to limit a crawl by blacklisting.
- Change to the robots.txt policy. With 'limit crawl based on robots.txt' switched on, if there is a conflict, ie the same url is allowed and disallowed, then 'disallow' overrides.
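A minimal sketch of the 'disallow overrides' policy described above, assuming simple path-prefix rules. The function name and return shape are hypothetical. Note that this policy differs from the longest-match precedence that some crawlers use.

```python
from typing import Iterable, Tuple

def robots_decision(path: str, allow: Iterable[str],
                    disallow: Iterable[str]) -> Tuple[str, bool]:
    """Return (decision, conflict) for a url path.

    A matching disallow rule always wins, even when an allow rule
    matches the same path. `conflict` records that both matched.
    """
    allowed = any(path.startswith(rule) for rule in allow if rule)
    disallowed = any(path.startswith(rule) for rule in disallow if rule)
    conflict = allowed and disallowed
    return ("disallowed" if disallowed else "allowed"), conflict
```

For example, a path `/docs/private/a.html` with `Allow: /docs/` and `Disallow: /docs/private/` is a conflict, and under this sketch it is treated as disallowed.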
- Adds an information box which is triggered if the scan stalls at the first url
- If main window is closed, but application remains open, a click on the dock icon re-opens the main window
- Corrects small glitch with the SEO table, preventing full display of the bottom row
- Fixes 'mark as fixed' and 'recheck this url' within link inspector window. Also fixes some possible refreshing issues if those actions used from context menus
- Fixes bug causing "ignore" rules to not be saved
- Adds "File>Return settings to default" menu option. Simple but useful, particularly with the free Integrity (which doesn't allow you to create new/multiple website configs) and particularly useful in support situations (Most support issues have to do with a setting that needs to be changed, or more usually that has been changed that doesn't need to be).
- Version 12 becomes the general release. Phased, beginning with Integrity Pro, web download.
- Fixes the Locate function which appears in the Links context menus (with one instance selected). This is a powerful and useful feature, often overlooked.
- Other small updates related to Cloudflare and Incapsula blocking
- Small but important fix. If a self-closing style tag was found on the page (unlikely but valid) the parser would ignore the rest of the page.
- Enhancement to starting with a list of links. It's been possible to make a list of urls to different domains in order to scan multiple sites in one scan / one set of results. Now the 'down but not up' rule is applied to urls in that list, so it's possible to selectively crawl sections of a single site.
- (It is also possible to do this by setting up 'whitelist' rules, but this relies on there being links on your starting url to the areas that you want to scan.)
- Note that when using the list of deep links, the trailing slash is important. A url such as peacockmedia.software/mac/scrutiny will be assumed to be a page called scrutiny and the crawl will be limited to /mac/. But a url such as peacockmedia.software/mac/scrutiny/ is assumed to be a directory and the scan will be limited to /scrutiny/
- Improvements to the parsing of image srcsets
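The trailing-slash rule for a list of deep links ('down but not up') could be sketched like this. It is an illustration only: the function names are hypothetical, and assuming https for urls given without a scheme is a convenience of the sketch, not something stated in the notes.

```python
from typing import Tuple
from urllib.parse import urlsplit

def _split(url: str):
    # Assumption: urls given without a scheme are treated as https.
    return urlsplit(url if "://" in url else "https://" + url)

def crawl_scope(start_url: str) -> Tuple[str, str]:
    """Return (host, directory) that limits the crawl for one starting url."""
    parts = _split(start_url)
    path = parts.path or "/"
    if path.endswith("/"):
        directory = path                          # treated as a directory
    else:
        directory = path.rsplit("/", 1)[0] + "/"  # treated as a page; use its parent
    return parts.netloc.lower(), directory

def in_scope(candidate: str, start_url: str) -> bool:
    """True when `candidate` is at or below the starting url's directory."""
    host, directory = crawl_scope(start_url)
    parts = _split(candidate)
    return parts.netloc.lower() == host and (parts.path or "/").startswith(directory)
```

Under this sketch, starting at peacockmedia.software/mac/scrutiny keeps /mac/integrity/ in scope, while starting at peacockmedia.software/mac/scrutiny/ does not.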
- Integrity and Integrity Plus were incorrectly showing File > Export > Warnings, which would fail with a warning bell if chosen
- Integrity Pro would fail to export Warnings or Spellings if File > Export was used and the table in question had not been accessed in the UI first
- Some fixes to the config functionality, in particular, deleting the last config in the list, which would cause unexpected behaviour
- Now handles cookies by default. It's becoming more important and the original reason for having it off by default is less of an issue now.
- For each request, the request header field Cache-Control is now set to no-cache (rather than max-age=0) which may be the better way to force a fresh version of the page
- Important update for users of version 10.4.2+. If head tags exist but not body tags (which we believe is fine as both are now optional) Integrity would fail to find links on the page
- Important update for users of 10.4.1. fixes issue with the link inspector, not visiting / highlighting / locating the correct / selected page
- Fixes issue where multiple head sections would prevent proper parsing of some of the information in the head and could lead to incorrect warnings of missing title or missing description
- very minor tweaks to the server request header fields which are sent with every request
- if image url is empty, alt text warning says "empty" for image url rather than being blank
- NB Integrity and Scrutiny support pages with multiple head sections (with warning), no head tags, no head or body tags.
- Minor correction with one of the warnings - p within heading - the warning said that p can only contain inline content (which is true) but in this case it should have said heading tags can only contain inline content.
- Fixes problem - multiple instances of the same page may have appeared in the 'Appears on' list in the link inspector, if the anchor feature was turned on
- (Pro) Adds ability to see context for warnings. Sometimes it can be difficult to find the problem in a page, even given a line number. A double-click on a warning in the warnings table will open an inspector which will usually show a clip from the page source in the area of the problem.
- (Pro) Adds Headings table to SEO results (headings are still available in the main view)
- Interface changes, for user-friendliness - in Links results, 'By Link' is now 'Link URLs' and 'Flat view' is 'All Links'
- Very minor improvements to content-type / file type detection
- (Pro) A few small fixes to the default values of the SEO results tables
- (Pro) Fixes formatting of data in SEO headings columns (were unnecessarily padded with tabs)
- Now always sends the Accept-Language header with the default value of '*' for all requests
- Adds field to Preferences for user to add a custom value for the Accept-Language header, in order to control which language is selected, for websites which use this header to select which localisation of the website is served.
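As a sketch of the request header defaults mentioned in these notes (Accept-Language of '*' unless a custom value is set, Cache-Control: no-cache, Connection: keep-alive), the following builds a plain dictionary. The function name and parameters are hypothetical and the dictionary is not tied to any particular HTTP library.

```python
from typing import Dict, Optional

def default_request_headers(accept_language: Optional[str] = None,
                            connection: str = "keep-alive") -> Dict[str, str]:
    """Build the header fields sent with every request in this sketch."""
    return {
        # '*' unless the user has entered a custom value in Preferences
        "Accept-Language": accept_language or "*",
        # asks for a fresh copy of the page
        "Cache-Control": "no-cache",
        # 'keep-alive' by default, 'close' as the alternative
        "Connection": connection,
    }
```

Passing a value such as `fr-FR` would request the French localisation from sites that select content by this header.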
- Minor improvements to content-type / file type detection
- Improvements to robots.txt detection, parsing and applying
- includes bug fix, 'disallow' terms were being incorrectly applied to externally-hosted resources
- includes bug fix, disallow was working on the whole path, rather than the root, so if /reports/ in disallow list, would apply to domain.com/xxx/reports
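The difference between matching a disallow term anywhere in the path and matching it from the root can be shown with a short sketch. Both function names are hypothetical; the 'buggy' version only imitates the behaviour described in the note above.

```python
from urllib.parse import urlsplit

def matches_disallow(url: str, rule: str) -> bool:
    """Corrected behaviour: the rule must match from the root of the path."""
    path = urlsplit(url).path or "/"
    return path.startswith(rule)

def matches_disallow_anywhere(url: str, rule: str) -> bool:
    """Imitation of the old behaviour: the rule matched anywhere in the path."""
    return rule in urlsplit(url).path
```

With `/reports/` in the disallow list, the url https://domain.com/xxx/reports/ is caught by the second function but not by the first.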
- Adds HTML validation warnings "more than one title tag found" and "more than one opening html tag found"
- (Integrity Pro) Improves the html validation where 'render page (run js)' is switched on (not recommended unless absolutely necessary). Previously there would have been some false positives (eg 'no doctype' when a doctype is present), some warnings would have been masked, and line numbers would have been inaccurate, because the html would have been parsed after the page render. Now an additional pass for warnings is made over the pre-rendered source.
- Important fix for all users: fixes issue where crawl would stall if the starting url contains a meta refresh without a url, ie just a refresh which isn't a redirection to another url. This might have also prevented pages within the site from being crawled properly if they contained such a meta refresh, but this is likely to have gone unnoticed.
- Reduces the height of the preferences window, was previously too high for some screens
- Fixes false positives reported where an srcset has a hanging comma (which is failed by the w3c validator as "empty image-candidate string")
- The above situation is reported in the warnings. Integrity and Integrity Plus show these in the Link Inspector, Integrity Pro and Scrutiny show html warnings in a table.
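A hanging comma in an srcset can be detected with a very simple split, as in the sketch below. This is a naive illustration with a hypothetical function name: it splits on every comma, so it would mishandle urls that themselves contain commas (data: urls, for example).

```python
from typing import List, Tuple

def parse_srcset(srcset: str) -> Tuple[List[str], List[str]]:
    """Return (urls, warnings) from a naive comma split of an srcset value."""
    urls: List[str] = []
    warnings: List[str] = []
    for candidate in srcset.split(","):
        candidate = candidate.strip()
        if not candidate:
            # A hanging comma leaves an empty candidate: warn, don't test it as a url.
            warnings.append("empty image-candidate string")
            continue
        urls.append(candidate.split()[0])  # drop the '1x' / '480w' descriptor
    return urls, warnings
```

For `a.jpg 1x, b.jpg 2x,` this yields two urls and one warning rather than a spurious third link.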
Some improvements relating to images:
- Image urls within source srcset=.... were being collected and checked even when 'check images' was switched off
- Images with querystring after the file extension were not being recognised as images under certain circumstances (eg if they had a bad status, or if no mime type is included in the response header)
Some improvements to checking list of links / local files: (web download version. Not supported with MAS version due to Apple's sandboxing requirement):
- Enables files stored in certain locations outside the user directory
- Fixes problem with case sensitivity check when a file location involves a symlink
- Handles certain trackback links, no longer reports them as bad links
- Fixes problem where under unlikely circumstances, spurious character(s) find their way into an image or link url, causing bad link to be reported
- Important fix - link urls within html area map were incorrectly being marked as images, which in Plus and Pro could prevent the page from appearing in the sitemap if the image map is the first occurrence of that url that Integrity discovered.
- Adds 'trust invalid server certificate' (internal domain / subdomains only). Allows scanning of site while certificate is out of date or not yet installed properly.
- (Pro only) Fixes bug which caused pages to be incorrectly reported as having 'robots nofollow'. (occurred when another page which genuinely is robots nofollow links back to the first page).
- Fixes problem with 'flag blacklisted' option. Blacklisted urls (ie 'do not check links containing...') were not being flagged properly. The option is now renamed "Treat blacklisted urls as bad links" and with that option switched on, those urls now show up when filtering 'bad links only'.
- Integrity free and Plus released as v10.2.1 for consistent numbering and in order to gain some of the general fixes and enhancements that have been released in Integrity Pro since v10. They do not contain the html validation functionality which is the major part of version 10 of these applications.
- Finds image url in , when either lazyload or look in meta tags is switched on
- As a policy, now reports but doesn't test certain urls such as xmlrpc.php and about:blank. They won't appear in Warnings, but will be listed as "not checked" so that the webmaster can see that they exist on the page. Checking these urls isn't helpful. They may exist for perfectly legitimate reasons such as part of a lazyload system or pingback system.
- Now recognises and warns about unterminated or nested link tags, which are illegal in html. Previously if this problem existed on a page, it could cause some spurious minor symptoms such as a link url being incorrectly reported as an image
- (Plus and Pro) Updates the Paddle licensing framework to the latest version which is Big Sur and M1 compatible
- Extends recent quote escaping enhancement to link exports
- Fixes bug preventing the crawl from starting after a local list of links has been opened
- Adds an efficiency which helps when scanning a very long list of links (csv, xml or txt, thousands of links). Previously it might have appeared that Integrity would hang for some time before the scan started running
- Fixes possible crash which may have happened at any point in a scan for certain websites since v9.10.0
- Small fix with parsing the robots.txt file
- Always parses robots.txt if present and for each url, notes whether the url is allowed or disallowed. If disallowed, a note is made in the url's warnings (warnings are highlighted in orange in the links tables, the actual warnings can be seen in the link inspector)
- Adds 'limit crawl based on robots.txt' setting. (Plus and Pro) Whether disallowed pages are included in the sitemap is decided by Preferences>Sitemap>Observe robots.txt
- (Pro) Adds 'Disallowed by robots.txt' choice in filter button in SEO
- (Pro) Adds 'Multiple H1' and 'No H1' choices in filter button in SEO, plus they will appear if appropriate in the short summary above the SEO table.
- Fixes problem with 'warnings' filter option in links 'by status' view
- When testing linked files, now automatically ignores the wordpress rest api files which return an unauthorised status when tested, leading to unnecessary concern
- Adds support for charset=GBK, charset=koi8-r, charset=euc-kr and some other Latin and non-Latin character encodings. For certain websites using these encodings, page titles and certain other information may have been garbled before.
- Some improvements around starting your scan with a list of links. In particular, automatically differentiating between txt and csv file types (this fixes a bug where a url containing a comma within a txt file would be incorrectly split).
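To illustrate why txt and csv lists need different handling, here is a minimal sketch. Choosing the parser by file extension is an assumption of the sketch (the notes do not say how the file type is detected), and the function name is hypothetical.

```python
import csv
import io
from pathlib import Path
from typing import List

def read_url_list(filename: str, text: str) -> List[str]:
    """Read urls from list-file content, splitting on commas only for csv."""
    if Path(filename).suffix.lower() == ".csv":
        rows = csv.reader(io.StringIO(text))
        return [cell.strip() for row in rows for cell in row if cell.strip()]
    # Plain text: one url per line, so a comma inside a url is left alone.
    return [line.strip() for line in text.splitlines() if line.strip()]
```

A line such as `https://example.com/a,b` in a txt file stays as one url, whereas the same text in a csv file would be read as two cells.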
- Fixes a couple of situations that could result in incorrectly-constructed link urls and therefore false positives
- Better handling of escaped forward slashes in urls
The jump in version number is for consistency with Scrutiny, although many of the changes in Scrutiny 9.8 are Scrutiny-specific (relating to insecure content checks). Integrity benefits from the following changes:
- Adds option to search certain meta tags for urls. Those urls will be link-checked and also checked to see whether they count as insecure / mixed content. The meta tags in question are meta name=, meta itemprop= and meta property=. This includes social media tags such as meta property=og:image
- Very small fix to prevent some false positives arising from SVG masks in style sheets
- (Integrity Pro) Adds 'Manage custom dictionary' button above spell-check table. This tool provides an easy way to see your list of 'learned' words (to check that you haven't 'learned' any misspelled words) and 'unlearn' any that you learned by mistake.
- Diagnosis feature: If debug console verbosity is switched to 'ridiculous', the html received from the starting url is printed to the debug console.
- Fixes blacklist rules table sometimes not clearing when user creates a new website config
- Adds 'Links in' and 'Links out' tables to the page inspector (accessed via the 'target page' tab of the link inspector)
- Fixes relative links being constructed incorrectly where page being parsed is a directory url
- In the warnings tab of the link inspector, if there was a warning about a redirection, it may have contained the final url twice instead of the original url and final url
- Change log not available for this version
- Important release for all users. Eliminates some spurious 'bad links' by correctly ignoring which often doesn't contain a full resource url and can return a bad or unexpected status when tested.
- Adds new columns rel = sponsored and rel = ugc to 'by status' and 'by page' views
- Adds sortable columns to links views and link inspector for rel = sponsored and rel = ugc. These columns are hidden by default but can be shown using the 'columns' selector above each of those views
- With the new 'check anchors' switched on, urls with #anchor fragments were sometimes incorrectly appearing in the Sitemap and SEO tables
- Fixes urls being duplicated in Sitemap table under certain circumstances and settings
- Fixes bug causing redirect to not be reported if the reason for the redirect is only to add or remove a trailing slash, and 'ignore trailing slash' option is switched off
- Very important fix to the new anchor checkbox. If left on and greyed out by switching on the querystring checkbox, could cause infinite loop in the scan.
- Fixes issue with new anchor feature. If an external link contained an anchor and appeared multiple times, each instance was listed separately in the 'by link' view.
- Adds ability to test anchors. You can switch the option on using a new checkbox on Integrity's first tab.
- this will cause urls like /index.html#top and /index.html#bottom to be reported as separate links (resulting in more data) and tested separately. (more cpu and time for crawl)
- If a link url has a #fragment then Integrity will report the server response code as before (coloured red if status is bad). The anchor has no bearing on this. However, if the status is good, then Integrity makes a further check to see whether a name or id can be found on the target page matching the link fragment. If not, this is added to the link's warnings, and the link will be marked orange.
- You can view the details of the warning in the Link Inspector
- Note that the anchor check is case-sensitive. Officially anchors are case-sensitive. Some browsers may treat anchors as case-insensitive, but this doesn't mean that all browsers will and it doesn't mean that it's right.
- Note that you can't 'ignore querystrings' and also test the anchors, since the anchor fragment comes after the querystring.
- The filter button contains a new item 'Warnings' which shows only links with warnings, this will include links with anchors where the anchor (a name or an id) can't be found on the page
- As far as the filter button is concerned, 'Warnings' doesn't include redirects, even though they're both coloured orange in the interface and the Link Inspector Warnings tab does include redirects. The Filter button allows you to separate them
- The filter button option 'Redirects' will still show redirects, even if you've chosen 'do not report redirects' in Preferences.
- Typing a '#' into the search field will show links which contain a #fragment (Plus and Pro only)
- Warnings (which have been reported in the link inspector since v9.0) now cause the link to be coloured orange in the views. As some people like to work towards a clean set of results and may not consider the warnings important, the colouring of warnings can be switched off in Preferences > Links > Warnings. The 'Warnings' filter will still work when colouring of warnings is switched off in Preferences.
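The anchor check described above (a case-sensitive search of the target page for a matching name or id) might look something like this sketch. It is not Integrity's implementation: the class and function names are hypothetical, and accepting `name` only on `a` tags is an assumption.

```python
from html.parser import HTMLParser
from typing import Set

class AnchorCollector(HTMLParser):
    """Collects every id, and every name on an <a> tag, from a page."""

    def __init__(self) -> None:
        super().__init__()
        self.anchors: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lower-cased; values keep their case.
        for name, value in attrs:
            if value and (name == "id" or (name == "name" and tag == "a")):
                self.anchors.add(value)

def anchor_exists(html: str, fragment: str) -> bool:
    """Case-sensitive check that `fragment` matches a name or id on the page."""
    collector = AnchorCollector()
    collector.feed(html)
    return fragment in collector.anchors
```

Because the comparison is case-sensitive, a link to `#top` would receive a warning if the page only defines `id="Top"`.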
- Fixes garbage urls caused by a url containing a comma, or a data: image within an srcset
- fixes bug that's unlikely to have been noticed. If a url redirects and the redirect url has a # fragment, traditionally the rule is that those fragments are just trimmed. But they weren't being trimmed for redirect urls. That is now fixed, but of course the new preference to not ignore anchors is respected.
- Irons out problem causing links to be marked external if the case of the domain of a link doesn't match the starting domain. ie start at foo.com, a link to FOO.com would be incorrectly marked as external
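A case-insensitive domain comparison can be sketched in a few lines. The function name is hypothetical, the sketch expects absolute urls, and it ignores the subdomain options entirely.

```python
from urllib.parse import urlsplit

def is_internal(link: str, start_url: str) -> bool:
    """Compare hosts case-insensitively; both urls must be absolute."""
    link_host = urlsplit(link).hostname    # .hostname is already lower-cased
    start_host = urlsplit(start_url).hostname
    return link_host is not None and link_host == start_host
```

With a starting url of http://foo.com/, a link to http://FOO.com/page counts as internal under this sketch.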
- Fixes line number column of 'appears on' table within link inspector window
- Small fix - unquoted link hrefs with no character before the closing bracket weren't being logged properly, leading to some spurious results
- If a meta http-refresh type redirect redirects from an internal url to an external one, then the link was being left marked as an 'internal' link. It's arguable whether this type of link (which redirects from internal url to external) is an internal or external link, but it's important for certain internal processes that it's marked as external when the redirection occurs. This was happening properly for the more usual types of redirect
- Important fix for anyone who needs to export to csv, html or xml sitemap. Fixes crash which may have been experienced on recent versions of the OS after OKing file save dialog
- Better handling of situation where image urls are being checked and an image with alt text is within a regular a href link which also has some link text appearing after the image and within the link. The link is now correctly reported with the link text and the image url is correctly reported with its alt text
- Fixes a bug causing certain links in the above situation to be missed (ie where there is an image beside the link text within a link) and where the new 'lazy load' feature is switched on
- Small improvement to 'lazy loaded' image finder. Now finds video and audio urls in the source tag / data-src element
- Fixes issue that would prevent Integrity from running under certain circumstances, ie on older systems (MacOS10.13 or earlier) and where the server can serve content using Brotli compression
- Integrity users on MacOS 10.13 or earlier should download this update. It shouldn't make any difference for users 10.14 or higher
- The main tables now retain their selection when sorted, as expected
- Support button added to diagnostics window which shows if unexpectedly few results are found
- If 429 codes are encountered (too many requests) more information is given in the Link Inspector's Warnings tab. A 429 may come with a 'retry after' which Scrutiny honours. It may also provide some information in the html of the page which follows the 429 code. All of this information is sent to that link's warnings for the user to see
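Honouring a 'retry after' that accompanies a 429 means working out how long to wait. The header value may be a number of seconds or an HTTP date. A minimal sketch, with a hypothetical function name, follows.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

def retry_after_seconds(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Convert a Retry-After value into a delay in seconds, or None if unreadable."""
    value = value.strip()
    if value.isdigit():
        return float(value)            # delay given directly in seconds
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A crawler using this sketch would pause for the returned number of seconds before retrying, and fall back to its own delay when the result is None.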
- Fixes a bug causing bad links to be reported incorrectly when the link contains a fragment (#something) as well as non-ascii characters in the link
- If a mobile user-agent string for a mobile browser is being used, some sites generate an 'intent://' url. Integrity no longer reports 'unsupported url' for such links
- Disables tabbing mode (View > Tab bar) which was causing confusion if accidentally switched on. (Integrity isn't document-based)
- Improvement to 'lazy loaded' image functionality. Adds Blocs to the supported systems
- Adds .webp to the list of recognised image extensions (used in various places within Integrity)
- Adds option to look for 'lazy loaded' image urls. There are various ways to implement lazy loading but Scrutiny should find them in the case of the most common implementations
- If a meta http refresh is within comments (including ) then it's now correctly ignored
- Fixes small bug that was preventing the app from running on Catalina
- Adds 'line number' to link instances (the line number of the link within the html file) - there's now a column to show this number in the 'by link' view (when urls are expanded), by status, links flat view and the table within the link inspector
- Fixes bug that was causing broken images to not be shown in links view when Filter button was set to Images. The same bug may have had other symptoms too relating to broken images (Plus and Pro)
- Fixes possible problem of some repetition in the 'columns' selector of certain tables
- Fixes problem with 'Target Page Inspector' button within Link Inspector window when the Link inspector was opened from certain views
- Fixes bug with subdomain option which could cause certain external links to be incorrectly marked as internal
- fixes links incorrectly reported broken (link is reported with extra text or another url tacked onto the end) when the href isn't terminated by quotes or a space but the end angle bracket
- adds 're-check parent page of url' to context menu in 'links by status' view
- some fixes to the rechecking functionality when called from the By Status view
- Adds detection of unclosed comment tag and unclosed script tag, these things are included in 'Warnings'. In future the number of possible things that you can be warned about will grow
- Adds Warnings into diagnostics window
- Change to the internal flow. Previously link urls were stored 'unencoded' and 're-encoded' for testing (unicode characters and reserved / unsafe ascii characters). This is fine 99.9% of the time but sometimes this can cause a problem when this unencode/re-encode cycle produces a different result from the url as it originally appeared on the page, and the server doesn't respond to the changed version. This can cause Integrity/Scrutiny to report 404 for a link which works on the page.
- Internal note: entities (eg &quot;) are still unescaped; we consider that part of the encoding of the html page
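The unencode/re-encode problem described above is easy to reproduce with Python's standard library. The sketch below is only a demonstration of the general effect, not the encoding routine that Integrity used; the function name is hypothetical.

```python
from urllib.parse import quote, unquote

def survives_round_trip(url_path: str) -> bool:
    """True if decoding then re-encoding gives back the original path."""
    return quote(unquote(url_path)) == url_path

# An encoded space comes back unchanged.
print(survives_round_trip("/a%20b"))
# An encoded slash is decoded to '/' and is not re-encoded, so the path changes.
print(survives_round_trip("/a%2Fb"))
# Lower-case hex digits come back upper-case, so the string differs.
print(survives_round_trip("/caf%c3%a9"))
```

A server that distinguishes `/a%2Fb` from `/a/b` would answer the changed version differently, which is the kind of false 404 that storing the url as it appeared on the page avoids.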
- Link text now searched when using search box and by page view
- Redirect chains included in warnings
- Better handling of redirection from a http or https url to a tel:, mailto: etc. Does not create a warning but cancels the connection and sets the status to 'not checked'. The redirect details can be seen within the link inspector.
Redesigned Link Inspector:
- puts redirects on a separate tab rather than a pop-up window
- adds warnings tab, contains details of anything that gives this link an orange 'warning' status
- traditionally the orange 'warning' status meant redirect(s) but now can include a number of other things
- adds 'target page' tab, which shows certain target page properties and a button to access Page inspector
- adds sortable tables of inbound links and outbound links
- adds download time and mime type to page inspector
- Patches bug which could have caused the odd link url to be missed or a spurious link url reported if certain unlikely code appears in the page
- Fixes bug which was causing urls to be reported bad where they were found as the src of certain tags (iFrame, Embed, Script) and were not quoted
- Fixes some unexpected urls appearing in Link views when the search box is used
- Improvement to subdomain comparison, internal links with subdomains may have been considered external if the starting url had a non-www subdomain (This all depends on the 'consider subdomains internal' option switched on)
- Fixes fatal error if option to check linked files is switched on and if a css file doesn't answer UTF-8 encoding
- Adds context menu to table within link inspector. Contains Visit, Highlight, Locate (as per the buttons below, which work if you first select a page within the table)
- Engine now correctly ignores 'data-' elements within link tags. This was leading to some spurious results
- Further improvements to 'soft 404' functionality. If target of link returns plain text rather than formatted html, Integrity now handles this. If the target page is formatted html and has a title, this is also now searched for the list of soft 404 terms.
- Further small fix for a potential problem to pattern matching (as used in site search, blacklisting soft 404 etc)
- Fixes a bug causing the crawl to stall under obscure circumstances (starting the scan at a deep url, where the deep url contains an asterisk character)
- Fixes problem of 'soft 404' search returning 'near matches'. It now searches literally for the string(s) you enter
- Corrects odd behaviour when a canonical tag appears twice on a page. This situation is handled more gracefully
- Able to pull image urls from css style sheets and check their status (if the 'check linked js and css files' option is switched on)
- (Integrity Pro) Fixes bug causing some code to appear in stripped plain text if tags have no whitespace between - this could cause spurious words to appear in the spellcheck
- Important fix, a bug could cause crash during scan in certain circumstances (though not reported many times). This was also causing some inefficiency
- Integrity, Integrity Plus and Integrity Pro are now notarized by Apple (security checked and certified). This requires that they run under 'hardened runtime' which is also a security measure
- Search box for link results is now a literal full match
- Subtle improvement to html parsing relating to comments
- Better handling of SSI where the include happens within an html tag
- Some engine improvements re extracting canonical url
- Improvement to subdomain handling. The subdomain option 'treat subdomains of starting url as internal' may have not worked as expected if the starting url had a subdomain already, including www. This option should now work as expected for starting urls that include www
- (Integrity Plus and Pro) Fixes a bug with the sitemap csv export which could cause some unexpected urls in the results (no problem with the xml or other formats)
- Fixes a couple of problems that could cause the scan to speed up above the limit set in Settings : Timeout and Delays
- Change to the 'Limit Requests to X per minute' setting - it had originally been set to reject anything below 30. That's now reduced to 10 as some sites are getting more difficult to scan with various ways of detecting automated requests
- Fixes bug relating to the blacklist / whitelist rule table, specifically when editing a value, and removes the option for 'Only follow' which was logically flawed and should have been removed when the 'does not contain' option was added. Users should use 'do not follow urls that don't contain' instead
- Improves iFrame support
- Fixes problem with img alt text being truncated if it contains a single quote character
- Important patch, obscure problem causing incomplete scan in unlikely circumstances
- Fixes bug that may have caused crash with certain urls
- Further work around the improvement to the meta http-equiv refresh handling
- (Pro and Plus) 8.1.9 was incorrectly sandboxed, possibly resulting in website configurations not being visible for users upgrading to 8.1.9 from an earlier version and then to 8.1.10. Users should contact support for the solution in this case
- 10.14 Mojave dark-mode-ready
- Fixes 'next bad link' button in link inspector
- Fixes a bug which would have caused Integrity to stall at the first url (reporting that as a 200 but going no further) under an unlikely set of circumstances
- Different handling of a common issue: LinkedIn urls returning a 999 code (even though the link may work in a browser). This is not an Integrity issue but common to all webcrawlers / testers. LinkedIn seems to detect the rapid requests and/or non-browser querystring and returns a non-standard 999 code. Integrity used to present this as a server error and count it as a bad link. Now it labels it as a warning, and does not count it as a bad link. This is because it is not necessarily a bad link, it just hasn't been possible to test it properly.
- Fixes issue with meta http-refresh not being observed if the page contains content with links. (The content was being parsed for links, in favour of the redirection being observed.)
- (Pro) (Build 8.1.81) Fixes bug causing no data to show when 'duplicate descriptions' is selected in SEO Filter button
- Fixes bug which may have been responsible for some unexpected results for some users
- Enables dark mode when using macOS 10.14 Mojave (will respect the user's choice of dark or light mode in System Preferences)
- (Pro) Enables keyword density functionality in SEO table (keyword stuffed pages)
- Better handling of a recurring 'Refresh' header field which could have appeared to leave the scan hanging when almost 100% finished
- Some improvements to the sorting and filtering which should prevent a short hang when using the 'bad links only' checkbox in the links results. There may still be a bit of a delay with some large sites and when the 'by status' tab is selected.
- Fixes problem with 'Images' option in filter button which was showing some urls which weren't images
- Fixes problem with headings / outline in page inspector (accessed from 'by page' view and double-clicking on a page rather than a link)
- Other small fixes
- Fixes problem scanning a site locally when the directory path contains a space or certain other characters
- Adds override for the built-in behaviour which excludes pages from the sitemap if they are marked robots noindex or have a canonical pointing to another page. These options are in Preferences > Sitemap, they should be on by default and should only be switched off in rare cases where it really is necessary, such as using the sitemap for a purpose other than submission to search engines (where you do want all internal pages in the file)
- Updates links within the app and dmg (support, EULA etc) to new https equivalents
- Fix to Links/By Link table which was not remembering its column information
- Adds support for tag
- Adds detection of audio and video mime types. The filter button in Integrity Plus and Pro allows you to see audio urls / video urls
- (Pro and Plus) Adds the options to include video in the xml sitemap
- Fixes case where a set of circumstances could cause the scan to appear to finish early (and error shown for first url) while scan actually continues
- (Integrity Pro) Adds some options for spell-checking: to ignore contents of and / , to only check contents of & and to check contents of image alt text
- Note that the option to check spelling within nav, header and footer is off by default
- Fixes Preferences > Links > Do not report redirects
- Further measures to reduce 'false positives' (which is a key v8 feature). In this case, 403 (forbidden), may be returned if useragent string is Googlebot or not a browser. Where a 403 is received, and the user has useragent string set to Googlebot or Scrutiny, then the url is retried once, with cookies, GET method and useragent string set to that of a regular browser
- Doubles the alt text buffer, alt texts of more than 1,000 characters were regularly being seen
- Fixes Preferences > Links > Do not report redirects which has not been working properly in v8
- When user marks a link as fixed, the redirect information for that link is now correctly cleared
- Now correctly handles a link where href = './'
- Allows for longer srcsets (>1000 characters). Previously, truncated urls may have been reported due to a buffer limit
- Fixes sorting in Spelling / by page table
- Adds context menu to sitemap table (copy url / visit url)
- Fixes problem with context menu in SEO / meta data table, 'copy url' or 'visit url' could work on wrong url
- Adds context menu to spelling / by word table (copy url / visit url)
- Adds option to spelling / by word table to 'remove without learning'
- Adds column 'og:locality' to SEO / meta data table
- Fixes bug causing spurious results to appear in the links tables sometimes when using the search box
- (Integrity Pro) enables toolbar 'get info' button for Spelling view
- (Integrity Pro, not MAS) implements update check
- 'Don't follow nofollow links' could prevent crawl from getting off the ground
- Fixes problem in the sorting of Sitemap by 'priority' if any rules are in play
- Fixes bug preventing the sitemap's 'priority' column from being manually edited if the sitemap rules table is empty, and bug preventing the 'change frequency' column from being edited manually
- enables 'double click to preview' in SEO / Images table
- Fixes problem where an unlikely set of circumstances could cause a crash (certain unintended spurious character included in the link target url, a specific page encoding)
- Fixes bug that prevented full scanning if port number used in the starting url
- Restores ability to scan a site locally (file://)
- Adds ability to attempt to scan a Wix site. No option for user, a Wix site is autodetected using the generator meta tag
- We don't endorse or encourage the use of Wix, their dependency on ajax breaks accessibility standards and makes them difficult for machines to crawl (ie SEO tools and search engine bots) and impossible for humans to view without the necessary technologies available and enabled in the browser.
- Fixes bug in 'highlighting', if the link occurred more than once on the page, only the first would be highlighted properly
- Fixes minor bug in column selector above certain tables, for French users
- (Integrity Plus) Fixes bug preventing pages from being correctly excluded from sitemap where robots noindex is set in the page head
- (Integrity Plus) Fixes bug causing potential crash if pages are excluded from sitemap for both possible reasons and user presses the button to see the 'more info' button
- Important fix - after scan finishes, depending on certain sequence of events, Sitemap table may have appeared blank. Data should now correctly appear
- Some improvements to the site management (clicking from one website configuration to another). With certain sequences of actions, unexpected results could be seen.
- Other small improvements
Some improvements to 'rules' dialog:
- Rules dialog opens as a sheet attached to the main window, rather than randomly positioned on the screen
- Adds 'urls that contain...' and 'urls that don't contain...' options, giving much more flexibility
- (removes 'only follow'. The wording of this became confusing in certain cases (eg if you have more than one of those rules) and it's no longer required because it's the same as 'do not follow urls that don't contain' )
- Important update for French users - when using French localisation, when making a blacklist rule ('Ignore links containing...' etc) the new rule appears not to save when OK pressed
- Fixes problem with finding all frame urls within a frameset
- Adds a trim to the starting url before starting in case whitespace / return characters have been included via a copy and paste.
- Fixes problem with ftp of sitemap file, if Preferences has been set to 'ftp only' and the sitemap generation is attempted before certain other actions. (Integrity Plus only)
- Fixes odd problems with the search field
- Improves the user experience when the [+] (new site) button is pressed for the first time. If a website has already been configured without first creating a new config, the url, settings, rules etc will be saved as the first site before the new one is created (becoming the second site)
- Some fixes and improvements to the 'file size' functionality. Also adds an option to 'load all images'. With this option on, all images are loaded and the size noted, so the 'target size' column of the 'by link' and 'flat' views will show the actual size of the image. With the option off, a size may still be displayed in those columns, but it then relies on the Content-Length field of the server response header, which may be the compressed size of the image or not present. The option slows the scan and uses more data transfer, so only use it if you're interested in the size of images on your pages.
- Fixes odd results if a link is an anchor link and contains unicode characters within the anchor
- Fixes links incorrectly found within javascript
- Fixes problem causing bad link count to be a little higher than the actual number of bad links. (Caused by certain external urls responding with an error but returning OK when automatically retried; the bad link had already been counted and wasn't reset)
- Important release for users of High Sierra
- Fixes problem that could cause incorrect link text to be reported
- Where appropriate, Integrity uses the HEAD method for efficiency. However, some servers incorrectly return a 404 or 5xx in response to a HEAD request. Such urls are now automatically retried using GET.
- Adds case sensitivity when checking file:// urls. There's a new option on the 'Global' tab of Preferences; case sensitivity is on by default.
- Fixes bug which prevented some srcset (2x etc) images from being found
- Increases stability and efficiency under certain circumstances
- Fixes minor problem with the 'delay' functionality (for throttling requests). The bug caused this setting to sometimes not be observed.
- Fixes incorrect handling of base href = single forward slash, now correctly interprets as "relative to the public root"
- Fixes bug causing scan to stall if crawling locally and site is on an external volume
- Enables 'Find' (cmd-F) within debug console
- Keyboard shortcuts for main views are changed - shift added (cmd-shift L, T, S, F)
- Fixes bug causing incorrect redirect if a port number is part of the url and the same url redirects multiple times
- Adds debug console with verbosity control
- Adds French localisation to entire app
- Fixes bug causing html pages to not be added to SEO results or Sitemap if they contained no links
- Adds options to ftp dialog (sitemap export) to use TLS, and adds field for port number (defaults to the usual 21)
- Some other small improvements such as validation of the directory field
- fixes issue with links not being found after self-closing script tag in body
- fixes issue with
- Improvements to engine, may help with certain sites where timeouts are experienced, maybe randomly or maybe abruptly bringing the scan to a stop. A new 'advanced' preference added to override the new changes and force all connections to stay alive to completion of data load. Not to be used generally (especially where the site contains links to large files) but may help in some situations
- Small change that helps stagger multiple simultaneous requests
- Adds French localization to context help
- Adds support for IDNs - start with either the unicode or encoded version, the unicode version will be displayed, the http requests will be correctly handled using IDNA encoding
- NB - Integrity has long been able to handle unusual characters in the path / filename of a url using encodings such as percent-encoding. This refers to unicode characters in the domain part of the url
- Fixes possible crash on completion of scan under certain circumstances
- Allows generation of a sorted list of images by file size, and which pages they appear on (adds 'target size' column (optional) to the Links 'by link' and 'flat' views)
- Adds 'copy urls' to the context menu where multiple items are selected in all link results tables. (cmd-C also enabled where multiple items are selected). A return-separated list of the selected urls is copied to the clipboard.
- Fixes a crash when using multiple select and 're-check'
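As a rough illustration of the IDN support noted above (unicode domain shown to the user, IDNA-encoded domain used for the http request), here is a minimal Python sketch. It is not Integrity's code: the function names `to_ascii_url` and `to_display_url` and the example domain are hypothetical, and it relies on Python's built-in `idna` codec (IDNA 2003) rather than whatever the app actually uses.

```python
from urllib.parse import urlsplit, urlunsplit


def _with_host(url, convert):
    """Return url with its host replaced by convert(host); userinfo is dropped."""
    parts = urlsplit(url)
    if parts.hostname is None:
        return url  # nothing to convert (e.g. a relative url)
    host = convert(parts.hostname)
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def to_ascii_url(url):
    """Form to send in the http request: domain labels IDNA-encoded (xn--...)."""
    return _with_host(url, lambda h: h.encode("idna").decode("ascii"))


def to_display_url(url):
    """Form to show the user: domain labels decoded back to unicode."""
    # Normalise to the encoded form first so either input style is accepted.
    return _with_host(to_ascii_url(url), lambda h: h.encode("ascii").decode("idna"))
```

Either form can be passed in: `to_ascii_url("http://bücher.example/katalog")` gives the `xn--bcher-kva.example` host for the request, and `to_display_url` turns that back into the unicode version. Only the domain is touched; the path is left for percent-encoding as before.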
A number of fixes around the sitemap functionality, exclusion of pages from the sitemap and canonical URLs:
- Adds a button for viewing pages which have deliberately been excluded from the sitemap. It opens a table showing the URL, canonical URL and the reason that the page has been excluded. The table has context menu for copy URL and visit.
- Where a page has a canonical URL pointing to itself, this page may have been incorrectly excluded from the sitemap in the past if the canonical URL's capitalization is different from the page URL. This match is now checked in a case-insensitive way.
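The exclusion logic described above can be sketched roughly as follows. This is an assumption-laden illustration, not the app's implementation: the function names, the reason strings and the whitespace trimming are hypothetical, and the only behaviour taken from these notes is that a page is excluded for robots noindex or for a canonical pointing elsewhere, with the self-match checked case-insensitively.

```python
def canonical_points_to_self(page_url, canonical_url):
    """Case-insensitive comparison of a page url with its canonical url."""
    return page_url.strip().casefold() == canonical_url.strip().casefold()


def sitemap_exclusion_reason(page_url, canonical_url=None, robots_noindex=False):
    """Return a reason string if the page should be left out of the sitemap, else None."""
    if robots_noindex:
        return "robots noindex"
    if canonical_url and not canonical_points_to_self(page_url, canonical_url):
        return "canonical points to another page"
    return None  # page stays in the sitemap
```

With this check, a page at `https://example.com/About` whose canonical is `https://example.com/about` is kept, whereas an exact string comparison would have excluded it.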
Other small fixes:
- Fixes obscure problem, canonical and other links in the head truncated if url contains /head
- Fixes crash or hang if starting url is a file and that file can't be found, and dock icon is showing progress bar.
- Fixes problem with wrong starting url sometimes being used after File-Open dialog.
- Inherits a fix to the engine, not always recognising an end comment where it looks like -------------->
- Fixes a problem causing Integrity Plus to quit on startup after a certain sequence of events including starting the free Integrity beforehand
- Fixes logical error which meant that if user viewed the 'by status' or 'flat view' while the scan was running, these would not be updated properly at the end of the scan
- Adds Googlebot's user-agent string to the drop-down list of UA strings in Preferences
- Some improvements to the engine, including low disk space detection - offers to stop or continue before space (on the system disk '/' ) becomes critical
- Some fixes to the 'mark as fixed' function - fixes keyboard shortcut (enabled in by link view only), enables multiple selection & mark as fixed in the by link view, and fixes the 'follow up' for that (removing from view if 'bad links only' is in operation)
- Further improvements to the help system
- Unfortunately, OSX components that enhance the help system are available in 10.8 and above. Therefore this version requires minimum 10.8. Users of 10.6 and 10.7 should use Integrity version 6.8.15 and Integrity Plus version 6.8.17
- PeacockMedia's end user licence agreement version v1.2 (published 25 Nov 2016) applies
- Fixes bug causing links to have blank url if the found url contained a particularly unusual percent-encoded character or one that doesn't convert in the claimed encoding
- In case where a page uses the Refresh server response field, and has a large time delay, this could cause Integrity to hang at the end of the scan
- Fixes obscure problem where /head appears within the canonical url and was mistaken for the /head tag, leading to some spurious code appearing in the link results
- Adds multiple selection to by link, by status and by page tables, (these can of course be sorted and filtered in Integrity Plus) and the context menu item 'Re-check selected'. This is a replacement for the old 'Recheck bad links' menu item which was flawed in many ways
- Important fix for anyone scanning locally. Fixes bug present since 6.8.6 which could cause scanning of local files to stall
- Important fix: fixes some spurious non-existent links found when hreflang is present within or tags
- Adds much easier way to select columns for certain tables (flat view and by link) - a menu pulled down from a button just above the table. Similar menu available in export dialog too
- Fixes problem with 'exporting disabled' message appearing even after licence is activated
- Adds 'Depth' as a column in the SEO table (min number of clicks to reach from the home page). This column has already been appearing in the Links tables, but was called 'Distance', now renamed 'Depth' in those tables
- Now makes sure quotes are trimmed from meta refresh url
- Some ../ weren't being correctly resolved if they appeared within the middle of a relative link - improved now
- Adds preference to be tolerant (ie not report a problem) in cases where a ../ travels above the root domain. Although technically an error, browsers tend to tolerate this (assuming the root directory) so such links will appear to work in a browser
- Small fix to meta refresh redirects
- Adds pattern matching in blacklists / whitelists. * and $ can be used
- Link inspector now remembers the size the user has dragged the previous one to
- links limit in Preferences is capped. Previously, entering a stupidly high number could cause problems
- Fixes bug causing some spurious data to be included in the link check results, when 'check linked js and css files' is switched on
- Reduces some initial memory allocation - more memory efficient when scanning smaller sites
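The '../' handling mentioned in the notes above (resolving ../ in the middle of a relative link, and optionally tolerating a ../ that travels above the root) can be sketched like this. It is a simplified illustration under stated assumptions, not Integrity's engine: `resolve_relative` is a hypothetical name, and only plain relative paths are handled (no query strings, fragments or absolute urls).

```python
from urllib.parse import urlsplit, urlunsplit


def resolve_relative(base_url, href):
    """Resolve a relative path href against base_url.

    Returns (resolved_url, went_above_root).
    """
    base = urlsplit(base_url)
    # Directory of the base page: drop the leading '' and the file name.
    segments = base.path.split("/")[1:-1]
    went_above_root = False
    for part in href.split("/"):
        if part == "..":
            if segments:
                segments.pop()
            else:
                # Tolerant behaviour: stay at the root, as browsers tend to do,
                # but remember that the link was technically wrong.
                went_above_root = True
        elif part not in ("", "."):
            segments.append(part)
    path = "/" + "/".join(segments)
    return urlunsplit((base.scheme, base.netloc, path, "", "")), went_above_root
```

A caller could use the second return value to decide whether to report the link as a problem, depending on the tolerance preference.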