Enable Jekyll's LSI for related posts with fast build speed

Boost build speed with GSL

By default, Jekyll’s site.related_posts just lists the 10 most recent posts. If you set lsi to true in _config.yml, related_posts works the way its name suggests. However, enabling lsi slows the build down considerably. This is especially true for posts written in Chinese: the latent semantic indexing (LSI) step seems never to finish, and the Jekyll build can run for hours.

So I ended up hand-rolling related posts by matching posts that share a tag or category, using pure Liquid in Jekyll’s templates. Details can be found in my old post: Related posts in Jekyll.

What is LSI?

Why does enabling lsi make the build so slow? It helps to have a general feel for what LSI (latent semantic indexing) is.

LSI, sometimes referred to as latent semantic analysis, is a mathematical method developed in the late 1980s to improve the accuracy of information retrieval. It uses a technique called singular value decomposition to scan unstructured data within documents and identify relationships between the concepts contained therein.

In essence, it finds the latent relationships between words (semantics) in order to improve information understanding (indexing). It provided a significant step forward for the field of text comprehension as it accounted for the contextual nature of language.

So finding related posts requires extra computation across all the posts on the site, not just the one being rendered.
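To make that extra computation concrete, here is a minimal sketch of the idea in plain Ruby, using only the standard `matrix` library. It is not how Jekyll or classifier-reborn implement LSI: the toy corpus, the method names (`term_document_matrix`, `concept_vectors`, `cosine`) and the choice of k = 2 are all assumptions made for illustration, and real implementations also apply steps such as stemming and term weighting.

```ruby
require "matrix"

# Build the term-document matrix: one row per term, one column per post.
def term_document_matrix(docs)
  tokens = docs.map(&:split)
  vocab = tokens.flatten.uniq.sort
  Matrix.rows(vocab.map { |term| tokens.map { |t| t.count(term).to_f } })
end

# Project every post into a k-dimensional "concept" space.
# The eigenvectors of A^T * A are the right singular vectors of A, and its
# eigenvalues are the squared singular values, so keeping the top k pairs
# amounts to a rank-k truncated SVD.
def concept_vectors(a, k)
  eig = (a.transpose * a).eigen
  pairs = eig.eigenvalues.zip(eig.eigenvectors)
  top = pairs.sort_by { |value, _| -value }.first(k)
  (0...a.column_count).map do |doc|
    # Clamp tiny negative eigenvalues caused by floating-point rounding.
    Vector.elements(top.map { |value, vec| Math.sqrt([value, 0.0].max) * vec[doc] })
  end
end

# Cosine similarity between two concept vectors (0.0 if either is zero).
def cosine(u, v)
  denom = u.norm * v.norm
  denom.zero? ? 0.0 : u.inner_product(v) / denom
end

# Hypothetical toy corpus: two Jekyll posts and one unrelated VPN post.
docs = ["jekyll liquid build", "jekyll build", "vpn server"]
vectors = concept_vectors(term_document_matrix(docs), 2)

puts "post 0 vs post 1: #{cosine(vectors[0], vectors[1]).round(3)}"
puts "post 0 vs post 2: #{cosine(vectors[0], vectors[2]).round(3)}"
```

On this toy corpus the two Jekyll posts score close to 1 against each other, while the VPN post scores close to 0 against both. The decomposition step is the expensive part, and its cost grows with the number of posts and distinct terms, which is why the build slows down so much on a real site.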

Speed up LSI

It’s great to have lsi enabled for accurate related posts, and things get easier with rb-gsl, which speeds up LSI immensely.

However, rb-gsl requires gsl (GNU Scientific Library) as a runtime dependency, so you need to install gsl in your build environment.

On macOS, that’s easy with Homebrew:

brew install gsl

On Ubuntu/Debian:

sudo apt-get -y install libgsl-dev

Then install these two gems directly, or add them to your Gemfile and install them with Bundler:

gem install classifier-reborn
gem install gsl
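If you go the Gemfile route, the entries might look like the sketch below. Only classifier-reborn and gsl come from the steps above. The source line and the jekyll entry are assumptions about a typical setup, so keep whatever your site's Gemfile already declares.

```ruby
# Gemfile (sketch; merge into your site's existing Gemfile)
source "https://rubygems.org" # assumed default gem source

gem "jekyll"            # assumed to be declared already
gem "classifier-reborn" # provides the LSI used for related posts
gem "gsl"               # Ruby bindings; needs the system gsl library installed first
```

After editing the Gemfile, run bundle install so both gems are available to bundle exec jekyll build.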

Now you can safely enable lsi and build related posts at a much faster speed.

If you’re testing your site and don’t care about related posts, you can set lsi to false in _config.yml and build related posts only in the production environment with bundle exec jekyll build --lsi.

Note that GitHub Pages doesn’t support lsi. Netlify, however, already includes gsl in its build image 👍.

林宏

Frank Lin

Hey, there! This is Frank Lin (@flinhong), one of the 1.41 billion 🇨🇳. This 'inDev. Journal' site holds the exploration of my quirky thoughts and random adventures through life. Hope you enjoy reading and perusing my posts.
