python, sitemap.xml, custom-error-pages

How to create a sitemap.xml?


I am not able to create a sitemap with the following code:

from usp.tree import sitemap_tree_for_homepage

tree = sitemap_tree_for_homepage('')
print(tree)

for page in tree.all_pages():
    print(page)
    

Solution

  • The sitemap layout looks like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
       <url>
          <loc>http://www.example.com/</loc>
          <lastmod>2005-01-01</lastmod>
          <changefreq>monthly</changefreq>
          <priority>0.8</priority>
       </url>
    </urlset>
    
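    Only loc is required in each url entry; lastmod, changefreq and priority are optional. As a rough sketch of the constraints the sitemaps.org 0.9 protocol puts on those fields, the hypothetical helper below (check_entry is a made-up name, not part of any library) checks one entry before it is written:

```python
from decimal import Decimal, InvalidOperation

# Allowed <changefreq> values in the sitemaps.org 0.9 protocol.
ALLOWED_CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def check_entry(loc, changefreq=None, priority=None):
    """Return a list of problems found in one <url> entry (empty list means OK).

    Hypothetical helper for illustration only.
    """
    problems = []
    # <loc> must be an absolute URL that starts with the protocol.
    if not loc.startswith(("http://", "https://")):
        problems.append("loc must be an absolute URL")
    if changefreq is not None and changefreq not in ALLOWED_CHANGEFREQ:
        problems.append(f"unknown changefreq: {changefreq}")
    if priority is not None:
        # <priority> must be a number from 0.0 to 1.0.
        try:
            in_range = Decimal("0.0") <= Decimal(priority) <= Decimal("1.0")
        except InvalidOperation:
            in_range = False
        if not in_range:
            problems.append(f"priority must be a number from 0.0 to 1.0: {priority}")
    return problems


print(check_entry("http://www.example.com/", "monthly", "0.8"))
print(check_entry("/about", "sometimes", "1.5"))
```

    The first call reports no problems; the second reports one problem per field.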

    In this thread you can read how to create an XML file:

    from usp.tree import sitemap_tree_for_homepage
    import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9
    
    tree = sitemap_tree_for_homepage('https://www.nytimes.com/')
    
    root = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    
    for page in tree.all_pages():
        urlel = ET.SubElement(root, "url")
        ET.SubElement(urlel, "loc").text = page.url
        # lastmod and changefreq are optional, so skip them when the page has none
        if page.last_modified is not None:
            # format YYYY-MM-DDThh:mmTZD see: https://www.w3.org/TR/NOTE-datetime
            lm = page.last_modified.strftime("%Y-%m-%dT%H:%M%z")
            ET.SubElement(urlel, "lastmod").text = lm
        if page.change_frequency is not None:
            ET.SubElement(urlel, "changefreq").text = page.change_frequency.value
        # priority is a Decimal; str() keeps its exact textual form
        ET.SubElement(urlel, "priority").text = str(page.priority)
    
    ET.indent(root, "  ")  # pretty print (requires Python 3.9+)
    xmltree = ET.ElementTree(root)
    xmltree.write("sitemap.xml", encoding="utf-8", xml_declaration=True)
        
    

    If you want lastmod to be today's date, import date from datetime:

    from datetime import date
    

    and replace

    page.last_modified.strftime("%Y-%m-%dT%H:%M%z")
    

    with

    # a date has no time or timezone, so only the date part is formatted
    date.today().strftime("%Y-%m-%d")
    
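    Note that the W3C datetime note linked in the code writes the timezone designator as Z or +hh:mm, while strftime's %z produces +hhmm without the colon. A minimal sketch of a formatter that covers both cases follows; w3c_lastmod is a hypothetical helper name, and treating naive datetimes as UTC is an assumption:

```python
from datetime import date, datetime, timezone


def w3c_lastmod(value=None):
    """Format a lastmod value; fall back to today's date when value is None.

    Hypothetical helper: dates become YYYY-MM-DD, datetimes become
    YYYY-MM-DDThh:mm+hh:mm.
    """
    if value is None:
        value = date.today()
    # datetime is a subclass of date, so it has to be checked first.
    if isinstance(value, datetime):
        if value.tzinfo is None:
            # Assumption: naive datetimes are treated as UTC.
            value = value.replace(tzinfo=timezone.utc)
        return value.isoformat(timespec="minutes")
    return value.isoformat()


print(w3c_lastmod(datetime(2022, 7, 19, 15, 24, tzinfo=timezone.utc)))  # 2022-07-19T15:24+00:00
print(w3c_lastmod(date(2022, 7, 19)))  # 2022-07-19
```

    In the loop, the lastmod text could then be set with w3c_lastmod(page.last_modified), which also covers pages that have no last-modified value.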

    Example sitemap.xml output:

    <?xml version='1.0' encoding='utf-8'?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://www.example.com/</loc>
        <lastmod>2022-07-19T15:24+0000</lastmod>
        <changefreq>daily</changefreq>
        <priority>0.8</priority>
      </url>
      <url>
        <loc>https://www.example.com/about</loc>
        <lastmod>2022-07-19T15:24+0000</lastmod>
        <changefreq>daily</changefreq>
        <priority>0.8</priority>
      </url>
    </urlset>
    
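    To check the result, the file can be read back with the same standard-library module. The sketch below parses an inline copy of the output above rather than the file on disk; read_locs and the sm prefix are illustrative names, and the prefix is needed because every element lives in the sitemap namespace:

```python
import xml.etree.ElementTree as ET

# Map a prefix of our choosing to the sitemap namespace for path lookups.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

SAMPLE = b"""<?xml version='1.0' encoding='utf-8'?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2022-07-19T15:24+0000</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/about</loc>
    <lastmod>2022-07-19T15:24+0000</lastmod>
  </url>
</urlset>
"""


def read_locs(xml_bytes):
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(xml_bytes)
    return [el.text for el in root.findall("sm:url/sm:loc", NS)]


print(read_locs(SAMPLE))
```

    For the file written earlier, ET.parse("sitemap.xml").getroot() gives the same kind of root element to search.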

    If you use https://www.example.com/ as your URL, you will not get the output above, because example.com does not have a sitemap.xml. Use a different URL.