10 months ago
Use requests to fetch web pages
1 year ago
Add option for crawl rate limitation
1 year ago
Allow compressed transport of content
1 year ago
Use user agent ‘sitemap_gen’ for checking robots.txt
2 years ago
Fixed retrieval of the ‘Last-Modified’ date
2 years ago
Port to Python 3
Support robots.txt with wildcards
11 years ago
Added support for the ‘nofollow’ tag and for robots.txt.
12 years ago
Added handling of HTTP redirects (contributed by Pavel “ShadoW” Dvořák).
12 years ago
Added handling of BASE HREF tag.
13 years ago
Added missing XML entity escaping.
13 years ago
Download from http://toncar.cz/opensource/sitemap_gen.html