XML sitemap structure for a movie website


I have a website for movies with page structure as follows:

  • movie-1
  • movie-1/comments
  • movie-1/cast

I'm trying to create a dynamic sitemap for these pages, and there seem to be two ways to do it.

Option 1

The first option is to create a sitemap that includes all movie pages and subpages.

<loc>
    https://www.example.com/movies/movie-1
</loc>
<loc>
    https://www.example.com/movies/movie-1/comments
</loc>
<loc>
    https://www.example.com/movies/movie-1/cast
</loc>
<loc>
    https://www.example.com/movies/movie-2
</loc>
<loc>
    https://www.example.com/movies/movie-2/comments
</loc>
<loc>
    https://www.example.com/movies/movie-2/cast
</loc>

Option 2

The second option is to keep separate sitemaps for movies, comments, and cast.

movies.xml:

<loc>
    https://www.example.com/movies/movie-1
</loc>
<loc>
    https://www.example.com/movies/movie-2
</loc>

movie_comments.xml:

<loc>
    https://www.example.com/movies/movie-1/comments
</loc>
<loc>
    https://www.example.com/movies/movie-2/comments
</loc>

movie_cast.xml:

<loc>
    https://www.example.com/movies/movie-1/cast
</loc>
<loc>
    https://www.example.com/movies/movie-2/cast
</loc>

Question 1

Which of these two options is better in this case?

Question 2

I'm using external TMDB images on this website. Should I also include images in the sitemaps? If so, should I include them in all of the movie-related XML files (movies.xml, movie_comments.xml, movie_cast.xml), given that all pages for a movie show the same image? Or is it enough to include them only in movies.xml?


Solution

  • Answer 1

    I would go with the single-file solution: only search engines read the sitemap (it does not need to be human-readable), and it leaves you with just one file to maintain.
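Since the sitemap is meant to be generated dynamically, the single-file approach could be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the function name `build_sitemap`, the `subpages` tuple, and the way slugs are supplied are assumptions made for the example, and the URL layout is taken from the question.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, slugs, subpages=("", "/comments", "/cast")):
    """Return one sitemap (as UTF-8 bytes) listing every movie page and subpage.

    `slugs` and `subpages` are hypothetical inputs; in a real site they would
    come from the database or the router.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for slug in slugs:
        for suffix in subpages:
            url = ET.SubElement(urlset, "url")
            loc = ET.SubElement(url, "loc")
            loc.text = f"{base_url}/movies/{slug}{suffix}"
    # xml_declaration requires Python 3.8 or newer.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap("https://www.example.com", ["movie-1", "movie-2"])
    print(xml_bytes.decode("utf-8"))
```

Note that the protocol caps a single sitemap at 50,000 URLs, so a very large catalogue would eventually have to be split across several files tied together by a sitemap index.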

    Answer 2

    According to https://www.sitemaps.org/protocol.html#xmlTagDefinitions, there is no way to specify additional media in the protocol. You can only link to things on your own domain anyway (https://www.sitemaps.org/protocol.html#location).
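The location rule could be checked before a URL is written to the sitemap. The sketch below is one possible reading of that rule (same scheme, same host, and a path under the directory that holds the sitemap); the helper name `allowed_in_sitemap` and the TMDB image path are hypothetical and only serve the example.

```python
from urllib.parse import urlsplit


def allowed_in_sitemap(sitemap_url, candidate_url):
    """Return True if candidate_url may be listed in the sitemap at sitemap_url.

    Assumed interpretation of the sitemaps.org location rule: the candidate
    must share the sitemap's scheme and host and live under its directory.
    """
    s, c = urlsplit(sitemap_url), urlsplit(candidate_url)
    # Directory of the sitemap, e.g. "/catalog/" for "/catalog/sitemap.xml".
    directory = s.path.rsplit("/", 1)[0] + "/"
    return (
        s.scheme == c.scheme
        and s.netloc.lower() == c.netloc.lower()
        and c.path.startswith(directory)
    )


SITEMAP_URL = "https://www.example.com/sitemap.xml"
candidates = [
    "https://www.example.com/movies/movie-1",
    "https://image.tmdb.org/t/p/w500/poster.jpg",  # hypothetical external image URL
]
# Keep only the URLs that belong in this sitemap.
allowed = [u for u in candidates if allowed_in_sitemap(SITEMAP_URL, u)]
```

With this check, the externally hosted image is filtered out and only the movie page remains in `allowed`.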

    https://www.sitemaps.org/ is, in my opinion, a good resource on this topic.