ruby-on-rails dreamhost xml-sitemap

Sitemap generators unable to generate sitemap


I have this website, https://shopus.pk, and I am unable to generate a sitemap for it using sitemap generator tools. They either fail with an error such as "Error: 422 Unprocessable Entity" or return a sitemap with only 1 URL, like the following:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
</urlset>

I understand the problem probably relates to the security settings of the website or the server, but I need help identifying it. Thanks.

BTW, my website is hosted by DreamHost, but I don't think DreamHost has anything to do with this.

I have tried https://www.xml-sitemaps.com/, http://www.web-site-map.com/, http://www.check-domains.com/sitemap/index.php, https://websiteseochecker.com/html-sitemap-generator/ and many more.

I also downloaded and tried A1 Sitemap Generator, gnucrawlandmap, SiteMapBuilder, HSEO Sitemap Generator and a few other free sitemap generators.

All of the above websites and tools either give an access error or return just 1 or 2 URLs.
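
To see the raw response these tools get, a small script can request the home page with a chosen Accept header and return the status code. This is only a sketch: build_probe and probe_status are made-up helper names, and it assumes the generators send something like "Accept: */*", which I have not verified.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: build a GET request with an explicit Accept header.
def build_probe(url, accept)
  uri = URI(url)
  request = Net::HTTP::Get.new(uri)
  request["Accept"] = accept
  [uri, request]
end

# Hypothetical helper: send the probe and return the HTTP status code as a string.
def probe_status(url, accept)
  uri, request = build_probe(url, accept)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request).code
  end
end
```

Comparing probe_status("https://shopus.pk", "*/*") with probe_status("https://shopus.pk", "text/html") would show whether the response depends on the Accept header rather than on the hosting or server setup.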

Since my website is built on Ruby on Rails, here is my config file for the production environment:

Rails.application.configure do
  # Settings specified here will take precedence over those in config/application.rb.

  # Code is not reloaded between requests.
  config.cache_classes = true

  # Eager load code on boot. This eager loads most of Rails and
  # your application in memory, allowing both threaded web servers
  # and those relying on copy on write to perform better.
  # Rake tasks automatically ignore this option for performance.
  config.eager_load = true

  # Full error reports are disabled and caching is turned on.
  config.consider_all_requests_local       = false
  config.action_controller.perform_caching = true

  # Enable Rack::Cache to put a simple HTTP cache in front of your application
  # Add `rack-cache` to your Gemfile before enabling this.
  # For large-scale production use, consider using a caching reverse proxy like
  # NGINX, varnish or squid.
  # config.action_dispatch.rack_cache = true

  # Disable serving static files from the `/public` folder by default since
  # Apache or NGINX already handles this.
  config.serve_static_files = ENV['RAILS_SERVE_STATIC_FILES'].present?

  # Compress JavaScripts and CSS.
  config.assets.js_compressor = :uglifier
  # config.assets.css_compressor = :sass

  # Fall back to the assets pipeline if a precompiled asset is missed.
  config.assets.compile = true

  # Asset digests allow you to set far-future HTTP expiration dates on all assets,
  # yet still be able to expire them through the digest params.
  config.assets.digest = true

  # `config.assets.precompile` and `config.assets.version` have moved to config/initializers/assets.rb

  # Specifies the header that your server uses for sending files.
  # config.action_dispatch.x_sendfile_header = 'X-Sendfile' # for Apache
  # config.action_dispatch.x_sendfile_header = 'X-Accel-Redirect' # for NGINX

  # Force all access to the app over SSL, use Strict-Transport-Security, and use secure cookies.
  config.force_ssl = true

  # Log only errors and above; lower-severity diagnostic messages
  # are not written to the log.
  config.log_level = :error

  # Prepend all log lines with the following tags.
  # config.log_tags = [ :subdomain, :uuid ]

  # Use a different logger for distributed setups.
  # config.logger = ActiveSupport::TaggedLogging.new(SyslogLogger.new)

  # Use a different cache store in production.
  # config.cache_store = :mem_cache_store

  # Enable serving of images, stylesheets, and JavaScripts from an asset server.
  # config.action_controller.asset_host = 'http://assets.example.com'

  # Ignore bad email addresses and do not raise email delivery errors.
  # Set this to true and configure the email server for immediate delivery to raise delivery errors.
  # config.action_mailer.raise_delivery_errors = false

  config.action_mailer.default_url_options = { host: "https://shopus.pk" }
  # configure action_mailer
  config.action_mailer.delivery_method = :smtp
  config.action_mailer.smtp_settings = {}
  config.action_mailer.raise_delivery_errors = true
  config.action_mailer.perform_deliveries = true
  config.action_mailer.asset_host = 'https://shopus.pk'
  # Enable locale fallbacks for I18n (makes lookups for any locale fall back to
  # the I18n.default_locale when a translation cannot be found).
  config.i18n.fallbacks = true

  # Send deprecation notices to registered listeners.
  config.active_support.deprecation = :notify

  # Use default logging formatter so that PID and timestamp are not suppressed.
  config.log_formatter = ::Logger::Formatter.new

  # Do not dump schema after migrations.
  config.active_record.dump_schema_after_migration = false
end

And this is what my application controller looks like:

class ApplicationController < ActionController::Base
  # Prevent CSRF attacks by raising an exception.
  # For APIs, you may want to use :null_session instead.
  protect_from_forgery with: :exception
  include SessionsHelper
  include ApplicationHelper
  private
  # Confirms a logged-in user.
  def logged_in_customer
    unless logged_in?
      store_location
      redirect_to login_url
    end
  end
end

Let me know if you require anything else to solve this issue.


Solution

  • OK, it looks like I have figured out the problem, but I am still not sure about it.

    After failing miserably with almost every sitemap generator, I decided to write my own using the Ruby gems Nokogiri and Mechanize. To my surprise, whenever I tried to fetch the HTML from my website, the same error showed up: "422 Unprocessable Entity". This was the exact error message I had been getting from several sitemap generators.

    I removed "protect_from_forgery with: :exception" from the application controller, and the sitemap generators started working on my website.

    But this wasn't right, because "protect_from_forgery with: :exception" should be there. I also have 2 other websites with "protect_from_forgery with: :exception" in their application controllers, and sitemap generators have no problem working with them.

    The only difference between my first website and the other 2 was that the first website uses Ajax and the other 2 are simple. I finally figured out that when I removed the format.js line from this

    respond_to do |format|
       format.js
       format.html
    end
    

    code block in the index action of my main controller, things started working. Later I realized that I should have written the respond_to block with format.js below format.html, like this:

    respond_to do |format|
       format.html
       format.js
    end
    

    After this, I changed the respond_to block in every action of all controllers so that format.html comes before format.js.

    Everything is working fine now.

    However, I am still confused and not sure whether I have identified the cause correctly, as I am still a novice programmer. I also fail to understand why the order of format.html and format.js matters in this case.

    I am open to all suggestions and a little more insight into the problem.
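
On why the order matters, a likely explanation (my understanding of Rails' behavior, not something confirmed in this thread): many crawlers, and Mechanize too as far as I know, send "Accept: */*", and for such a request respond_to picks the first format declared in the block. With format.js first, a plain GET gets a JavaScript response, which the cross-origin check that comes with protect_from_forgery rejects with 422. Browsers send a longer Accept header and are served HTML either way, which would explain why the site looked fine in a browser. The sketch below is a simplified plain-Ruby model of that selection, not Rails' actual code; the method names requested_types and negotiate are made up for illustration, and q-values are ignored.

```ruby
# Simplified model of how Rails chooses a respond_to format.
# Not Rails' real implementation: names are hypothetical, q-values are ignored.

# Accept headers that list several types and include */* are treated as
# "browser-like"; for those, a non-Ajax request is simply served HTML.
BROWSER_LIKE_ACCEPT = %r{,\s*\*/\*|\*/\*\s*,}

# Returns the list of MIME types the request is considered to ask for.
def requested_types(accept, xhr: false)
  if accept && !accept.empty? && (xhr || accept !~ BROWSER_LIKE_ACCEPT)
    accept.split(",").map { |type| type.split(";").first.to_s.strip }
  elsif xhr
    ["text/javascript"]
  else
    ["text/html"]
  end
end

# Picks the response type given the formats declared in respond_to, in order.
def negotiate(accept, declared, xhr: false)
  requested_types(accept, xhr: xhr).each do |type|
    return declared.first if type == "*/*"    # "anything": first declared wins
    return type if declared.include?(type)
  end
  nil
end

js_first   = ["text/javascript", "text/html"]
html_first = ["text/html", "text/javascript"]
browser    = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"

puts negotiate("*/*", js_first)     # crawler, format.js declared first
puts negotiate("*/*", html_first)   # crawler, format.html declared first
puts negotiate(browser, js_first)   # typical browser page load
```

In this model, only the crawler-style request against the block with format.js first ends up with a JavaScript response, which matches what I observed: the 422 went away once format.html was listed first.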