# Automating Indexing

# Quick Start Guides

Tutorial: Core Index - Manual & Cron job

XTDir makes it very easy to index SobiPro sections in your Joomla!™ site. However, the default Page Load mode still depends on visits to the site's front-end; otherwise, you have to log in and click on the Index button to complete the indexing operation.

Our job is to automate your life, making repeated and time-consuming procedures a breeze. To this end, we offer different indexing automation possibilities for XTDir:

  1. Page Load mode
  2. Cron Job mode
  3. Manually (click on the Index button)

Cron Job mode is only available in XTDir for SobiPro.

# Overview of CRON jobs

To run smoothly, XTDir needs a CRON job that executes periodically. By offloading this processing task to a scheduled job, the extension is able to run faster and in a predictable way.

TIP

A cron job task can be executed every minute. XTDir only performs a full index update when there is new information.

A CRON job is recommended in order to:

  • Avoid excessive processing during a normal page load
  • Process new and updated entries
  • Regenerate XTDir's Tree Index
  • Update Promoted Entries positions
  • Update Primary and Second orders

There are two ways to execute the CRON job:

  • Web CRON job script
  • Command Line Interface (CLI) - Native CRON script (recommended)

# Front-end processing, for use with CRON

The front-end processing feature is intended to provide the capability to perform unattended, scheduled indexing of the SobiPro sections on your site.

The front-end indexing URL performs a single indexing step. You will only see a message upon completion, whether it was successful or not. There are a few limitations, though:

  • It is not designed to be run from a web browser, but from an unattended CRON script, using a tool such as wget to access the function.
  • The script is not capable of showing progress messages.

Before beginning to use this feature, you must set up XTDir to support the front-end indexing option. First, go to XTDir's main page and click on the Component Configuration / Core Index of SobiPro Entries menu item. Find the option titled Core Index Mode and select Manual - Cron job Task. Below it, you will find the option named Secret word. In that box, enter a password which will allow your CRON job to convince XTDir that it has the right to run the indexing. Think of it as the password required to enter the VIP area of a night club. After you are done, click the Save button on top to save the settings and close the dialog.

Enable Cron mode

TIP

Use only lower- and upper-case alphanumeric characters (0-9, a-z, A-Z) in your secret key. Other characters may need to be manually URL-encoded in the CRON job's command-line. This is error-prone and can cause the indexing operation to never start even though you'll be quite sure that you have done everything correctly.
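A quick way to verify this rule before saving the setting is a small shell check. The sketch below is only an illustration: the function name is_safe_secret and the sample value ak33b4s3cRet are placeholders, not part of XTDir.

```shell
#!/bin/sh
# Succeeds only when the argument is non-empty and consists solely of
# the characters 0-9, a-z and A-Z (no URL-encoding needed).
is_safe_secret() {
  case "$1" in
    ""|*[!0-9A-Za-z]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Placeholder value; replace it with your own Secret word.
if is_safe_secret "ak33b4s3cRet"; then
  echo "Secret word can be used in a URL as-is"
else
  echo "Secret word contains characters that would need URL-encoding" >&2
fi
```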

Most hosts offer a control panel of some kind. It usually has a section named something like "CRON Jobs" or "Scheduled tasks", and the help screen there describes how to set up a scheduled job.

TIP

If your host only supports entering a URL in their "CRON" feature, this will most likely not work with XTDir. There is no workaround. It is a hard limitation imposed by your host.

If you are on a UNIX-style OS host (usually, a Linux host), you most probably have access to a command-line utility called wget.
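A minimal sketch of such a wget call follows. It assumes the URL format used in the EasyCron example later in this guide, with www.example.com and ak33b4s3cRet as placeholder site and Secret word. The function name run_indexing_step is hypothetical, and the -q and -O options (quiet mode, discard the downloaded page) are a common choice rather than an XTDir requirement.

```shell
#!/bin/sh
# Placeholder values; replace them with your own site and Secret word.
SITE_URL="https://www.example.com"
SECRET_WORD="ak33b4s3cRet"

# Front-end indexing URL. The double quotes keep the ampersands literal.
INDEX_URL="${SITE_URL}/index.php?option=com_xtdir&view=cron&task=run&key=${SECRET_WORD}"

# Performs one indexing step: -q silences wget, -O /dev/null discards the page.
run_indexing_step() {
  wget -q -O /dev/null "$INDEX_URL"
}

# Uncomment to perform the request:
# run_indexing_step
```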

TIP

Do not forget to surround the URL in double-quotes. If you don't, the command will fail, and it will be your fault! The reason is that the shell treats the ampersand as a control character which ends one command and starts the next. If you don't use the double-quotes at the start and end of the indexing URL, your host will think that you tried to run multiple commands and will request a truncated URL instead of the full front-end indexing URL.

If you are unsure, check with your host. Sometimes you have to get from them the full path to wget in order for CRON to work, thus turning the above command-line to something like:
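As an illustration, the sketch below prints a crontab entry that uses a full path. /usr/bin/wget is an assumed location (your host may use a different one), the 30-minute schedule is only an example, and the site and Secret word are the placeholder values used elsewhere in this guide.

```shell
#!/bin/sh
# Assumed path; ask your host for the real location of wget.
WGET_BIN="/usr/bin/wget"
INDEX_URL="https://www.example.com/index.php?option=com_xtdir&view=cron&task=run&key=ak33b4s3cRet"

# Five schedule fields (every 30 minutes), then the command with a quoted URL.
CRON_LINE="*/30 * * * * ${WGET_BIN} -q -O /dev/null \"${INDEX_URL}\""

printf '%s\n' "$CRON_LINE"
```

The printed line can be pasted into the crontab editor (crontab -e) or into the command field of your host's CRON interface.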

Contact your host; they usually have a nifty help page for all this stuff. Read also the section on CRON jobs below.

wget is a multi-platform command-line utility. It is frequently included in operating systems. If your system does not include the wget command, it can be downloaded at this address: https://wget.addictivecode.org/FrequentlyAskedQuestions#download. The wget homepage is here: https://www.gnu.org/software/wget/wget.html.

TIP

The ampersands above should be written as a single ampersand, not as an HTML entity (&amp;). Failure to do so will result in a 403: Forbidden error message, and no indexing will occur. This is not a bug. It is the way wget works.

# Using EasyCron to automate your indexing

TIP

EasyCron is an online Cron Service.

  • Trusted by the Pros.
  • Save your time.
  • Easy to reason about.
  • Alternative to Linux Cron.

EasyCron offers a powerful set of functions:

  • Standard Cron expression
  • 3 ways to specify execution time
  • Email Notification
  • Cron Job Execution Logs
  • Execution Time Prediction
  • Separate Failure Logs
  • Customize HTTP Method
  • Customize HTTP Headers
  • Powerful API
  • Output Regexp Matching
  • Webhook
  • Configure Timeout
  • Timezone Adaptable
  • User-Friendly Interface
  • No Need to Install

Assuming that you have already bought some credits on EasyCron, here is how to automate your indexing using their service.

As with any front-end indexing, XTDir must first be set up to support it: in Component Configuration / Core Index of SobiPro Entries, set Core Index Mode to Manual - Cron job Task, enter a password in the Secret word field, and click Save.

Enable Cron mode

We strongly recommend using only alphanumeric characters, i.e. 0-9, a-z and A-Z. For the sake of this example, we will assume that you have entered ak33b4s3cRet in that field. We will also assume that your site is accessible through the URL https://www.example.com.

Log in to EasyCron. In the CRON area, click on the New Cron button. Here's what you have to enter in the EasyCron interface:

Name of Cron job: anything you like, e.g., "XTDir www.example.com"

Timeout: 180 sec; if the indexing doesn't complete, increase it. Most sites will work with a setting of 180 or 600 here. If you have many sections and entries which take more than 5 minutes to process, you might consider using the XTDir native CRON script (xtdir_indexer.php) instead, as it is much more cost-effective.

Url you want to execute: https://www.example.com/index.php?option=com_xtdir&view=cron&task=run&key=ak33b4s3cRet

Login and Password: Leave them blank

Execution time (the grid below the other settings): Select when you want your CRON job to run

Alerts: If you have already set up alert methods in the EasyCron interface, we recommend choosing an alert method here and leaving "Only on error" unchecked, so that you always get a notification when the indexing CRON job runs.

Now click on Submit and you are all set up!

# Free Online CRON Schedulers

If you are still having trouble with the above options, and your host does not provide any Cron job support, use any free online Cron job scheduler service available on the web. Do not worry about security, because these services merely access the URL at a certain interval, and that URL is open to anyone.


# Using the front-end indexing in SiteGround and other hosts using cURL instead of wget

Finding the correct command to issue for the CRON job is tricky. This recipe applies not only to SiteGround, but to many other commercial hosts as well.

In the cPanel for SiteGround there is a Cron job option. Create a Cron job there and use a cURL command that requests the front-end indexing URL as your command.
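A cURL command for this purpose could look like the one printed by the sketch below. It is not the exact command from SiteGround's documentation; the URL reuses the placeholder site and Secret word from the EasyCron example, and the 180-second limit mirrors the timeout suggested there.

```shell
#!/bin/sh
# Placeholder URL; replace the domain and key with your own values.
INDEX_URL="https://www.example.com/index.php?option=com_xtdir&view=cron&task=run&key=ak33b4s3cRet"

# --silent hides the progress meter, --show-error still reports failures,
# --max-time caps the request at 180 seconds, --output discards the page.
CURL_CMD="curl --silent --show-error --max-time 180 --output /dev/null \"${INDEX_URL}\""

printf '%s\n' "$CURL_CMD"
```

Some hosts require the full path to curl; if the job never runs, ask your host where the binary is located.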

# Using the front-end indexing in SiteGround and other hosts using lynx instead of wget

Lynx is a text-based browser that is installed in most hosting environments.

On most Linux systems, you can simply call the indexing URL with Lynx from a CRON job. We would recommend running the CRON job every thirty (30) minutes or more often. On a busy site, you might want to run it every ten (10) minutes. The more frequently you run it, the less load there will be on the server.
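The following sketch prints crontab entries for both intervals mentioned above. The helper name lynx_cron_line is hypothetical, the URL uses the placeholder site and Secret word from the EasyCron example, and lynx -dump is used here simply because it fetches a page without starting the interactive browser.

```shell
#!/bin/sh
# Placeholder URL; replace the domain and key with your own values.
INDEX_URL="https://www.example.com/index.php?option=com_xtdir&view=cron&task=run&key=ak33b4s3cRet"

# Prints a crontab entry that calls the URL every $1 minutes.
# The page text and any error output are discarded.
lynx_cron_line() {
  printf '*/%s * * * * lynx -dump "%s" > /dev/null 2>&1\n' "$1" "$INDEX_URL"
}

lynx_cron_line 30   # regular site
lynx_cron_line 10   # busy site
```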

If you do not have Lynx installed, you can use other alternatives such as wget, detailed above.

Don't worry, this operation actually runs very fast and has very little impact on the server, equivalent to a normal single page load.

# Setting Up CRON Job in cPanel

To add a new Cron job in cPanel, log in to your cPanel and click Cron jobs under the Advanced section.

After clicking Cron jobs, you will be directed to a new page. In this example, click Standard to proceed.

In the Command to run field, enter the command that calls the front-end indexing URL, for example the wget command described above.

Select Every Five Minutes, Every Hour, Every Day, Every Month, and Every Week Day so that the action above will be executed every five (5) minutes perpetually.
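For reference, these selections correspond to the standard five-field crontab schedule. The sketch below assembles it field by field; the variable names are only illustrative.

```shell
#!/bin/sh
# One variable per crontab field, matching the cPanel selections.
MINUTE="*/5"   # Every Five Minutes
HOUR="*"       # Every Hour
DAY="*"        # Every Day
MONTH="*"      # Every Month
WEEKDAY="*"    # Every Week Day

SCHEDULE="${MINUTE} ${HOUR} ${DAY} ${MONTH} ${WEEKDAY}"

# Quoted so that the shell does not expand the asterisks.
printf '%s\n' "$SCHEDULE"
```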


TIP

Users report that this script does not work on GoDaddy hosting.

If you have access to the command-line version of PHP, XTDir for SobiPro includes an even better, and faster, way of indexing your entries. All XTDir for SobiPro releases include the file cli/xtdir_indexer.php, which can be run from the command-line PHP interface (PHP CLI). In contrast with previous releases, it doesn't require the front-end indexing in order to work; it is self-contained, native indexing for your Joomla!™ site, even if your web server is down!

In order to schedule an indexing run, you will have to add the following command line to your host's CRON interface:

where /usr/local/bin/php is the path to your PHP CLI executable and /home/USER/webroot is the absolute path to your web site's root. You can get this information from your host.

In order to give some examples, we will assume that your PHP CLI binary is located in /usr/local/bin/php, a common setting among hosts, and that your web site's root is located at /home/johndoe/httpdocs.
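With these example paths, a small wrapper script could look like the sketch below. It is an illustration only: it simply joins the PHP CLI path and cli/xtdir_indexer.php under the site root, and does not show any arguments the script may accept.

```shell
#!/bin/sh
# Example paths from the text above; replace them with your host's values.
PHP_BIN="/usr/local/bin/php"
SITE_ROOT="/home/johndoe/httpdocs"
INDEXER="${SITE_ROOT}/cli/xtdir_indexer.php"

# Run the indexer only when both the PHP binary and the script are present.
if [ -x "$PHP_BIN" ] && [ -f "$INDEXER" ]; then
  "$PHP_BIN" "$INDEXER"
else
  echo "Check the PHP CLI path and the site root before scheduling." >&2
fi
```

The same two paths, separated by a space, form the command you would enter in a CRON interface.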

Special considerations:

  • Most hosts do not impose a time limit on scripts running from the command-line. If your host does and the limit is less than the time required to index your site, the job will fail.
  • This script is not meant to run from a web interface. If your host only provides access to the CGI or FastCGI PHP binaries, xtdir_indexer.php will not work with them. The solution to this issue is tied to the time constraint detailed above.
  • Some servers do not fully support this indexing method. The usual symptoms will be a job which starts but is intermittently or consistently aborted in mid-process without any further error messages and no indication of something going wrong. In such a case, running the indexing from the back-end of your site should work properly.
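To find out whether a given PHP binary is the CLI version, you can look at the first line printed by php -v, which names the interface in parentheses. The sketch below assumes the example path /usr/local/bin/php; the function name is_php_cli is hypothetical.

```shell
#!/bin/sh
# Succeeds when a "php -v" banner line identifies the CLI interface.
is_php_cli() {
  case "$1" in
    *"(cli)"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumed path; replace it with the binary your host gave you.
PHP_BIN="/usr/local/bin/php"

if [ -x "$PHP_BIN" ]; then
  banner="$("$PHP_BIN" -v | head -n 1)"
  if is_php_cli "$banner"; then
    echo "PHP CLI binary found"
  else
    echo "This is not a PHP CLI binary" >&2
  fi
fi
```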

# Setting up a CRON job on cPanel

Go to your cPanel main page and choose the CRON Jobs icon from the Advanced pane. In the Add New CRON Job box on the page which loads, enter the following information:

  • Common Settings: choose the frequency of your indexing, for example once per day.
  • Command: enter your indexing command. Usually, you have to use something like:

where myusername is your account's user name (most probably the same you use to login to cPanel). Do note the path for the PHP command-line executable: /usr/bin/php-cli. This is the default location of the correct executable file for cPanel and later. Your host may use a different path to the executable. If the command never runs, ask them. We can't help you with that; only those who have set up the server know the changes they have made to the default setup.
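As a hedged illustration, the sketch below prints a once-per-day entry for such a setup. The public_html directory is an assumption (it is the usual cPanel web root, but your account may differ), and the 03:00 run time is only an example.

```shell
#!/bin/sh
PHP_BIN="/usr/bin/php-cli"                    # path named in the text above
SITE_ROOT="/home/myusername/public_html"      # assumed cPanel web root

# Once per day at 03:00 (minute 0, hour 3), an arbitrary example time.
CRON_LINE="0 3 * * * ${PHP_BIN} ${SITE_ROOT}/cli/xtdir_indexer.php"

printf '%s\n' "$CRON_LINE"
```

In the cPanel form, the five schedule fields are set through the Common Settings controls, and only the part after them goes into the Command box.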

Finally, click the Add New Cron Job button to activate the CRON job.

# Special considerations for HostGator

The location of the PHP CLI binary is /usr/bin/php-cli. This means that your CRON command-line should look like:

Finally, it should be noted that you can use the command-line override feature to do more tricky configuration overrides, for example turning off the archive splitting or using a different indexing output directory to enhance your security. If it is something you can do in the Configuration page of the component, you can also do it using command-line overrides.

# A PHP alternative to wget

There is an alternative to wget, as long as your PHP installation has the cURL extension installed and enabled. For starters, you need to save the following PHP script as xtdir.php somewhere your host's CRON feature can find it. Please note that this is a command-line script and does not need to be located in your site's root; it should be preferably located above your site's root, in a non web-accessible directory.

In order to configure it for your server, you only have to change the first three lines.

Where www.yoursite.com and YourSecretKey should be set up as discussed in the previous section.

TIP

The ampersands above should be written as a single ampersand, not as an HTML entity (&amp;). Failure to do so will result in a 403: Forbidden error message, and no indexing will occur. This is not a bug. It is the way wget and PHP work.

In order to call this script on a schedule, you need to add something like this to your crontab (or use your host's CRON feature to set it up):

Where /usr/local/bin/php is the absolute path to your PHP command-line executable and /home/USER/xtdir_indexing/xtdir.php is the absolute path to the script above.
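Using those two paths, a crontab entry could look like the line printed by this sketch. The 30-minute schedule is an assumption chosen for illustration, and USER remains a placeholder for your account name.

```shell
#!/bin/sh
PHP_BIN="/usr/local/bin/php"
SCRIPT="/home/USER/xtdir_indexing/xtdir.php"   # USER is a placeholder

# Schedule fields first (every 30 minutes), then the command itself.
CRON_LINE="*/30 * * * * ${PHP_BIN} ${SCRIPT}"

printf '%s\n' "$CRON_LINE"
```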

If you set up your CRON schedule with a visual tool (for example, a web interface), the command to execute is /usr/local/bin/php /home/USER/xtdir_indexing/xtdir.php, with both paths adjusted as described above.

Last Updated: 3/20/2024, 3:32:54 PM