Tutorials for Joomla

Basic Configuration

Video Tutorial: Basic Configuration for Joomla

STEP 1: Create an Algolia account

Go to Algolia and create your account.

In the Algolia Dashboard you will find your API Keys:

  • Application ID: This is your unique application identifier. It's used to identify you when using Algolia's API.
  • Search-Only API Key: This is the public API key to use in your frontend code. This key is only usable for search queries.
  • Admin API Key: This is the ADMIN API key. Please keep it secret and use it ONLY from your backend: this key is used to create, update and DELETE your index. You can also use it to manage your API keys.

Algolia Keys

You also need to create your index: go to Indices and create a new one.

  • Index: Where the data is indexed and the search parameters configured.

Algolia Indices

STEP 2: Configure the XT Search for Algolia

Go to XT Search for Algolia, click Options, and fill in the Application ID, Search-Only API Key, Admin API Key, and Index Name:

XT Search Configuration

  • Connectors Configuration: Defines which extension XT Search indexes in Algolia, along with the associated parameters that control the connector.

Connectors Configuration

STEP 2.1: Configure the content connectors

XT Search for Algolia comes with a growing set of content connectors. They look up and retrieve the available content items. Select the connectors you need and, if necessary, configure the optional filters.

Configure the content connectors

STEP 3: Index Generation

When the Connector and Algolia are configured, the extension is ready to be indexed. There are two ways to do it:

  • Manually in the Control Panel
  • Automatically with a Cron Job

STEP 3.1 Manual Indexing

This method is the simplest and can be executed directly on the backend.

To run it, click the button and wait until the process completes.

Index

Index Progress

At this point, the index records have been generated, and you can browse them in the Algolia Dashboard.
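Besides browsing the Dashboard, you can check from a terminal that the index answers queries. The following is a minimal sketch, assuming Algolia's REST search endpoint and curl; the function name algolia_query_url and the environment variable names ALGOLIA_APP_ID, ALGOLIA_SEARCH_KEY, and ALGOLIA_INDEX are hypothetical and are not part of XT Search. It uses the Search-Only API Key, never the Admin API Key.

```shell
# Build the search endpoint for an application and index.
# $1 = Application ID, $2 = index name (both come from your Algolia Dashboard).
algolia_query_url() {
  printf 'https://%s-dsn.algolia.net/1/indexes/%s/query\n' "$1" "$2"
}

# Run one empty query, but only when all three (hypothetical) variables are set.
if [ -n "${ALGOLIA_APP_ID:-}" ] && [ -n "${ALGOLIA_SEARCH_KEY:-}" ] && [ -n "${ALGOLIA_INDEX:-}" ]; then
  curl -s -X POST \
    -H "X-Algolia-Application-Id: $ALGOLIA_APP_ID" \
    -H "X-Algolia-API-Key: $ALGOLIA_SEARCH_KEY" \
    --data-binary '{"params":"query=&hitsPerPage=1"}' \
    "$(algolia_query_url "$ALGOLIA_APP_ID" "$ALGOLIA_INDEX")"
fi
```

If the index was generated, the JSON response should include an nbHits value greater than zero.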

Full, Incremental, or Batch Indexing

There are three alternatives to perform the indexing of records:

  • Full Indexing: If you execute the Full Indexing, all records will be indexed. Additionally, all previous records will be cleared each time it is called. This is the best and the most straightforward option to index all records.
/usr/local/bin/php /home/johndoe/httpdocs/cli/xtdir4alg_cli.php XTSearchForAlgolia:index
  • Incremental Indexing: This option only indexes new records, following the Options/"Incremental Indexing (Number of Days)" parameter. If you start running out of Algolia ops, you can begin using Incremental Indexing to optimize the process.
/usr/local/bin/php /home/johndoe/httpdocs/cli/xtdir4alg_cli.php XTSearchForAlgolia:index --incremental=1
  • Batch Indexing: The last alternative is Batch Indexing, aimed at large sets of records. It works like Full Indexing, but it processes the tasks in batches to avoid overloading the server.
/usr/local/bin/php /home/johndoe/httpdocs/cli/xtdir4alg_cli.php XTSearchForAlgolia:batch

The batch processing also supports the incremental variation.
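The three command lines above can be wrapped in a small helper so that the paths are set in one place. This is only a sketch: the function name index_command and the variables PHP_BIN and SITE_ROOT are hypothetical, and the default paths are the example values from this tutorial.

```shell
# Example paths from this tutorial; override them for your own host.
PHP_BIN="${PHP_BIN:-/usr/local/bin/php}"
SITE_ROOT="${SITE_ROOT:-/home/johndoe/httpdocs}"

# Print the command line for the requested indexing mode.
index_command() {
  case "$1" in
    full)        args="XTSearchForAlgolia:index" ;;
    incremental) args="XTSearchForAlgolia:index --incremental=1" ;;
    batch)       args="XTSearchForAlgolia:batch" ;;
    *) echo "usage: index_command full|incremental|batch" >&2; return 1 ;;
  esac
  printf '%s %s/cli/xtdir4alg_cli.php %s\n' "$PHP_BIN" "$SITE_ROOT" "$args"
}

# Show the command that would run an incremental indexing.
index_command incremental
```

Printing the command rather than executing it makes it easy to review before pasting it into a terminal or a CRON entry.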

STEP 3.2 Cron Automation

You can configure a Cron job to automate the indexing.

Cron

Overview of CRON jobs

For indexing to run smoothly, set up a CRON job that executes periodically.

A cron job task can be executed every day or at any given frequency.

There are two ways to execute the CRON job:

  • Command Line Interface (CLI) - Native CRON script (recommended)
  • Web CRON job script

Users report that they get no joy using this script on GoDaddy hosting.

If you have access to the command-line version of PHP, XT Search for Algolia includes an even better - and faster - way of indexing your records.

To schedule an indexing run, enter the following command line in your host's CRON interface:

/usr/local/bin/php /home/USER/webroot/cli/xtdir4alg_cli.php XTSearchForAlgolia:index

where /usr/local/bin/php is the path to your PHP CLI executable and /home/USER/webroot is the absolute path to your web site's root. You can get this information from your host.

In order to give some examples, I will assume that your PHP CLI binary is located in /usr/local/bin/php - a common setting among hosts - and that your web site's root is located at /home/johndoe/httpdocs.

/usr/local/bin/php /home/johndoe/httpdocs/cli/xtdir4alg_cli.php XTSearchForAlgolia:index

Special considerations:

  • Most hosts do not impose a time limit on scripts running from the command line. If your host does, and the limit is shorter than the time needed to index your site, the job will fail.
  • This script is not meant to run from a web interface. If your host only provides access to the CGI or FastCGI PHP binaries, xtdir4alg_cli.php XTSearchForAlgolia:index will not work with them. The solution to this issue is tied to the time constraint detailed above.
  • Some servers do not fully support this indexing method. The usual symptom is a job that starts but is intermittently or consistently aborted mid-process, without any error message or other indication that something went wrong. In such a case, running the indexing from the back-end of your site should work properly.
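To find out whether a given PHP binary is the CLI version or a CGI/FastCGI one, you can inspect the first line printed by php -v, which names the SAPI in parentheses. A minimal sketch follows; the function names php_sapi_from_banner and check_php_cli are hypothetical helpers, not part of XT Search.

```shell
# Extract the SAPI name from the first line of `php -v` output,
# for example "PHP 8.1.2 (cli) (built: ...)" gives "cli".
php_sapi_from_banner() {
  printf '%s\n' "$1" | sed -n '1s/^PHP [^ ]* (\([^)]*\)).*/\1/p'
}

# Report whether the binary given as $1 is a CLI build.
check_php_cli() {
  banner="$("$1" -v 2>/dev/null | head -n 1)"
  sapi="$(php_sapi_from_banner "$banner")"
  if [ "$sapi" = "cli" ]; then
    echo "$1 is a CLI binary"
  else
    echo "$1 is not a CLI binary (reported SAPI: ${sapi:-unknown})"
    return 1
  fi
}
```

For example, check_php_cli /usr/local/bin/php reports whether that path is suitable for the CRON command line.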

Setting up a CRON job on cPanel

Go to your cPanel main page and choose the CRON Jobs icon from the Advanced pane. In the Add New CRON Job box on the page which loads, enter the following information:

  • Common Settings: Choose the frequency of your indexing, for example once per day.

  • Command: Enter your indexing command. Usually, you have to use something like:

    /usr/bin/php5-cli /home/myusername/public_html/cli/xtdir4alg_cli.php XTSearchForAlgolia:index

where myusername is your account's user name (most probably the same one you use to log in to cPanel). Note the path to the PHP command-line executable: /usr/bin/php5-cli. This is the default location of the correct executable file for cPanel 11 and later. Your host may use a different path to the executable. If the command never runs, ask them. We can't help you with that; only those who have set up the server know the changes they have made to the default setup.

Finally, click the Add New Cron Job button to activate the CRON job.

Special considerations for HostGator

The location of the PHP CLI binary is /usr/bin/php-cli. This means that your CRON command line should look like:

/usr/bin/php-cli /home/myusername/public_html/cli/xtdir4alg_cli.php XTSearchForAlgolia:index

Finally, it should be noted that you can use the command-line override feature to do more tricky configuration overrides, for example turning off the archive splitting or using a different indexing output directory to enhance your security. If it is something you can do in the Configuration page of the component, you can also do it using command line overrides.

To call this script on a schedule, put something like this in your crontab (or use your host's CRON feature to set it up):

0 3 * * 6 /usr/local/bin/php /home/USER/public_html/cli/xtdir4alg_cli.php XTSearchForAlgolia:index

Where /usr/local/bin/php is the absolute path to your PHP command-line executable, /home/USER/public_html/cli/xtdir4alg_cli.php is the absolute path to the script, and XTSearchForAlgolia:index is the command passed to it.

If you set up your CRON schedule with a visual tool (for example, a web interface), the command to execute part is "/usr/local/bin/php /home/USER/public_html/cli/xtdir4alg_cli.php XTSearchForAlgolia:index".
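The five values in front of the command in the crontab line are the schedule: minute, hour, day of month, month, and day of week. In the example, 0 3 * * 6 means 03:00 every Saturday. The sketch below splits a schedule into its named fields so you can double-check an entry before saving it; describe_schedule is a hypothetical helper name.

```shell
# Split a five-field cron schedule into named fields.
# `read` splits on whitespace without expanding "*" as a file glob.
describe_schedule() {
  read -r minute hour dom month dow extra <<EOF
$1
EOF
  if [ -z "$dow" ] || [ -n "$extra" ]; then
    echo "expected exactly five fields" >&2
    return 1
  fi
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
    "$minute" "$hour" "$dom" "$month" "$dow"
}

describe_schedule '0 3 * * 6'
```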

Web Front-end processing, for use with CRON

The front-end processing feature is intended to provide the capability to perform an unattended, scheduled indexing in your site.

The front-end indexing URL performs a single indexing step. You will only see a message upon completion, whether it succeeded or not. There are a few limitations, though:

It is not designed to be run from a normal web browser, but from an unattended CRON script, utilizing wget or CRON as a means of accessing the function.

The script is not capable of showing progress messages.

Do you want to automate your indexing despite your host not supporting CRON? Webcron.org fully supports XT Search's front-end indexing feature and is dirt cheap - you need to spend about 1 Euro for 1000 indexing runs.

Before you begin using this feature, you must set up XT Search to support the front-end indexing option. First, go to XT Search's main page and click on the Component Configuration / Cron Job item. There you will find the options named Base URL Override and Secret word. In the Secret word box, enter a password that your CRON job will use to prove to XT Search that it has the right to start the indexing. After you are done, click the Save button at the top to save the settings and close the dialog.

Use only lower- and upper-case alphanumeric characters (0-9, a-z, A-Z) in your secret key. Other characters may need to be manually URL-encoded in the CRON job's command line. This is error-prone and can cause the indexing operation to never start even though you'll be quite sure that you have done everything correctly.
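If you do end up with other characters in the secret, the encoding can be done by a script rather than by hand. The following is a sketch of percent-encoding in plain POSIX shell; urlencode is a hypothetical helper name, and it is only meant for ASCII input.

```shell
# Percent-encode every character except the unreserved set (letters, digits, . ~ _ -).
urlencode() {
  rest="$1"
  out=""
  while [ -n "$rest" ]; do
    tail="${rest#?}"           # everything after the first character
    ch="${rest%"$tail"}"       # the first character itself
    rest="$tail"
    case "$ch" in
      [a-zA-Z0-9.~_-]) out="$out$ch" ;;
      *) out="$out$(printf '%%%02X' "'$ch")" ;;   # "'c" makes printf use the character code
    esac
  done
  printf '%s\n' "$out"
}

urlencode 'My Secret&1'
```

An alphanumeric secret passes through unchanged, which is why the advice above avoids the problem altogether.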

Most hosts offer a CPanel of some kind. There has to be a section for something like "CRON Jobs", "scheduled tasks" and the like. The help screen in there describes how to set up a scheduled job. One missing part for you would be the command to issue. Simply putting the URL in there is not going to work.

If your host only supports entering a URL in their "CRON" feature, this will most likely not work with XT Search. There is no workaround. It is a hard limitation imposed by your host. We would like to help you, but we can't. As always, the only barrier to the different ways we can help you is server configuration.

If you are on a UNIX-style OS host (usually, a Linux host) you most probably have access to a command line utility called wget. It is almost trivial to use:

wget "https://your.domain.com/index.php?option=com_xtdir4alg&task=cron&key=YourSecretKey"

Do not forget to surround the URL in double quotes. If you don't, the command will fail. The reason is that the shell treats an unquoted ampersand as the end of a command (and runs that command in the background). Without the double quotes at the start and end of the indexing URL, the shell cuts the URL at the first ampersand, so wget requests only the part before it and loads your site's homepage instead of the front-end indexing URL.

If you are unsure, check with your host. Sometimes you have to get from them the full path to wget in order for CRON to work, thus turning the above command line to something like:

/usr/bin/wget "https://your.domain.com/index.php?option=com_xtdir4alg&task=cron&key=YourSecretKey"

Contact your host; they usually have a nifty help page for all this stuff. Read also the section on CRON jobs below.

wget is a multi-platform command-line utility that is not included with all operating systems. If your system does not include the wget command, it can be downloaded from this address: https://wget.addictivecode.org/FrequentlyAskedQuestions#download. The wget homepage is here: https://www.gnu.org/software/wget/wget.html.

The ampersands above should be written as a single ampersand (&), not as an HTML entity (&amp;). Failure to do so will result in a 403: Forbidden error message and no indexing will occur. This is not a bug, it is the way wget works.
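The quoting rules above are easier to follow when the URL is built once and then always used inside double quotes. A minimal sketch, assuming an alphanumeric secret word; frontend_index_url is a hypothetical helper name. The -q and -O /dev/null options are standard wget options that silence the output and discard the downloaded page, so that repeated runs do not leave files behind.

```shell
# Build the front-end indexing URL.
# $1 = site base URL without a trailing slash, $2 = secret word.
frontend_index_url() {
  printf '%s/index.php?option=com_xtdir4alg&task=cron&key=%s\n' "$1" "$2"
}

url="$(frontend_index_url "https://your.domain.com" "YourSecretKey")"

# Print the command line to paste into the CRON interface.
# The URL is wrapped in double quotes so "&" stays part of it.
printf 'wget -q -O /dev/null "%s"\n' "$url"
```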

Online CRON Schedulers

If you are still having trouble with the above options, and your host does not provide any Cron job support, you can use an online Cron job scheduler service. Do not worry about security: these services merely request the URL at a set interval, and that URL is open to anyone anyway.

Popular services include:

Using the front-end indexing in SiteGround and other hosts using cURL instead of wget

Finding the correct command to issue for the CRON job is tricky. This recipe applies not only to SiteGround, but many other commercial hosts as well.

The cPanel for SiteGround has a Cron job option. Create a Cron job there and use:

curl -b /tmp/cookies.txt -c /tmp/cookies.txt -L -v "<url>"

as your command.

Using the front-end indexing in SiteGround and other hosts using lynx instead of wget

Lynx is a text-based browser that is installed in most hosting environments.

On most Linux systems, you can simply run the commands below. We would recommend running the CRON every thirty (30) minutes or less. On a busy site, you might want to run it every ten (10) minutes. The more frequently you run it, the less load there will be on the server.

lynx -source "https://your.domain.com/index.php?option=com_xtdir4alg&task=cron&key=My-Secret" > /dev/null

If you do not have Lynx installed, you can use other alternatives such as wget, detailed above.
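To see which of these tools your hosting account actually provides, you can probe for them from a shell. This is a small sketch using the standard command -v lookup; first_available is a hypothetical helper name.

```shell
# Print the first tool from the argument list that exists on this system.
first_available() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
      return 0
    fi
  done
  return 1
}

fetcher="$(first_available wget curl lynx)" || fetcher="none"
echo "Fetch tool found: $fetcher"
```

If the result is none, the front-end indexing URL cannot be called from this host's CRON, and an online scheduler is the remaining option.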

Don't worry, this operation runs very fast and has very little impact on the server, equivalent to a normal single page load.

Setting Up CRON Job in cPanel 11

To add a new CronJob in cPanel 11, log in to your cPanel and click Cronjobs under the Advanced section, as in the screenshot below.

To add a new CronJob in cPanel 11

After clicking Cronjobs, you will be directed to a page similar to the one below. In this example, click Standard to proceed.

After clicking Cronjobs, click Standard to proceed

Enter the following command in the Command to run field on the screen shown below:

lynx -source "https://your.domain.com/index.php?option=com_xtdir4alg&task=cron&key=My-Secret" > /dev/null

Select Every Five Minutes, Every Hour, Every Day, Every Month, and Every Week Day so that the action above will be executed every five (5) minutes perpetually.

Select Every Five Minutes, Every Hour, Every Day, Every Month, and Every Week Day

Advanced Algolia Configuration

Algolia provides an extensive configuration to define and optimize the search. The following screenshots are included as a reference of how the Demo site is configured.

Demo Searchable Fields

Algolia Searchable Fields

Demo Ranking

Algolia Ranking

Demo Facets

Algolia Facets

Instant Search

Demo Autocomplete

Autocomplete