Build your own search engine

Google is evil!

Now we have searx. It’s open source software written in Python that lets you build your own “search engine” without worrying about being spied on. Well, technically it’s not a real search engine. It’s basically a metasearch proxy: it sends your search keywords to multiple search engines, then filters, sorts and presents the results to you on a single page. You never access those search engines directly, which means your IP, cookies, private data etc. are protected from the evil of capitalism. It’s highly customizable and, most importantly, it has a lot of themes XD
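
To make the “filters, sorts and presents” part concrete, here is a toy sketch of the merging step. It is not searx’s actual ranking code; the function name merge_results and the tab-separated input format are made up for illustration. It takes “engine<TAB>url” lines, drops duplicates, and lists each URL with the number of engines that returned it:

```shell
# Toy merge step (hypothetical, not searx's real algorithm).
# stdin:  one "engine<TAB>url" line per raw result
# stdout: "count<TAB>url", URLs reported by the most engines first
merge_results() {
    sort -u |                 # an engine returning the same URL twice counts once
        cut -f2 |             # keep only the URL column
        sort | uniq -c |      # count how many engines returned each URL
        sort -k1,1nr -k2,2 |  # highest count first, ties ordered by URL
        awk '{ print $1 "\t" $2 }'
}
```

A URL that two engines agree on ends up above one that only a single engine returned, which is roughly the idea behind combining several engines in the first place.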

You can find available public searx instances here.

This is a guide to installing, configuring and deploying your own searx instance. Follow it and you should have a search engine at the end. It:

  • runs in a Python virtualenv
  • uses https (certificate provided by Let’s Encrypt)
  • uses nginx as the web server

This is my own instance: searx.shrik3.com

PRE-INSTALL

There are two ways to install searx:

  • Install searx with docker. It’s very easy to install, but further configuration is a pain in the ass.
  • Install it manually with python-virtualenv. It’s a little more complex to install but easier to configure.

Note: there is an unofficial docker image of searx that lets the user pass a config file into the container.

For both methods, instructions can be found in the official documentation.

I’m using method II.

Python 2.x is on its way out and will no longer be supported. If your OS ships Python 2.x as the default, installing Python 3 is highly recommended.

INSTALLATION

Use whatever package manager your OS provides to install the following packages, e.g. apt-get on Ubuntu or yum on CentOS.

Install dependencies

$ sudo apt-get install \
       git build-essential libxslt-dev \
       python-dev python-virtualenv python-babel \
       zlib1g-dev libffi-dev libssl-dev

Note: if you have both Python 2.x and 3.x installed on your system and Python 2 is the default (check with ‘python -V’), install the package ‘python3-virtualenv’ instead of ‘python-virtualenv’.
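
The check from the note can be scripted. This is a small sketch; the function name venv_package_for is made up here, and the package names are the Debian/Ubuntu ones used above:

```shell
# Pick the virtualenv package to install from the output of "python -V".
# (venv_package_for is a hypothetical helper, not part of searx.)
venv_package_for() {
    case "$1" in
        "Python 2."*) echo python3-virtualenv ;;  # Python 2 is the default
        *)            echo python-virtualenv ;;
    esac
}

# Python 2 prints its version to stderr, hence the 2>&1.
venv_package_for "$(python -V 2>&1)"
```

The last line prints the package name to use in the apt-get command above.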

Install Searx

$ cd /usr/local
$ sudo  git clone https://github.com/asciimoo/searx.git
$ sudo  useradd searx -d /usr/local/searx
$ sudo  chown searx:searx -R /usr/local/searx
$ cd /usr/local/searx
$ sudo -H -u searx -i
(searx)$ virtualenv searx-ve
(searx)$ . ./searx-ve/bin/activate
(searx)$ ./manage.sh update_packages

Note: if Python 2 is your default python but you want to use Python 3 here, create the virtualenv with ‘virtualenv --python=/usr/bin/python3 searx-ve’ to specify the Python version (use ‘which python3’ to find out where python3 lives).

Then use ‘exit’ to switch back to your regular user.

Configure nginx (without ssl)

Make sure nginx is installed and the service is started. Make sure port 80 is allowed through the firewall. Point your (sub)domain, e.g. searx.example.com, at your host’s IP.

Edit the nginx config file:

server {
    listen 80;
    server_name searx.example.com;

    location /static {
        alias /usr/local/searx/searx/static;
    }

    location / {
        proxy_pass http://127.0.0.1:8888;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Scheme $scheme;
        proxy_set_header X-Script-Name /searx;
        proxy_buffering off;
    }
}

Then run ‘service nginx reload’ to load the new config.

Note: I suggest you read THIS if you have no idea how to configure nginx.
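
If you set up more than one instance, the server block above can be generated instead of typed. A sketch, assuming your nginx reads extra configs from /etc/nginx/conf.d/ (check your distribution); render_searx_vhost is a made-up helper name:

```shell
# Print the plain-http server block from above for a given domain.
# $1 = domain, $2 = local port searx listens on (default 8888)
render_searx_vhost() {
    domain=$1
    port=${2:-8888}
    # "\$" keeps nginx variables literal; ${domain} and ${port} are filled in.
    cat <<EOF
server {
    listen 80;
    server_name ${domain};

    location /static {
        alias /usr/local/searx/searx/static;
    }

    location / {
        proxy_pass http://127.0.0.1:${port};
        proxy_set_header Host \$host;
        proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for;
        proxy_set_header X-Scheme \$scheme;
        proxy_set_header X-Script-Name /searx;
        proxy_buffering off;
    }
}
EOF
}

# Example (needs root, so it is left commented out):
# render_searx_vhost searx.example.com | sudo tee /etc/nginx/conf.d/searx.conf
# sudo nginx -t && sudo service nginx reload   # reload only if the syntax check passes
```

Running ‘nginx -t’ before the reload catches typos in the config before they reach the running server.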

Test Run

Go into your searx install directory (‘/usr/local/searx’ in this case) and edit the file searx/settings.yml. Set ‘base_url’ to http://searx.example.com. Change other options if you want.
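
The base_url change can also be made from the shell. This sketch assumes that settings.yml contains a single line starting with ‘base_url’, such as ‘base_url : False’ (check your file first), and that GNU sed is available. set_base_url is a made-up helper name:

```shell
# Set base_url in a searx settings file (hypothetical helper).
# $1 = path to settings.yml, $2 = public URL of the instance
set_base_url() {
    # Keep the indentation and the key, replace whatever follows the colon.
    # "|" is the sed delimiter so the slashes in the URL need no escaping.
    sed -i -E "s|^([[:space:]]*base_url[[:space:]]*:).*|\1 \"$2\"|" "$1"
}

# Example:
# set_base_url /usr/local/searx/searx/settings.yml http://searx.example.com
```

Note that anything after the colon on that line, including a trailing comment, is replaced.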

Now run the following commands to run searx:

$ cd /usr/local/searx
$ virtualenv --python=/usr/bin/python3 searx-ve
$ . ./searx-ve/bin/activate
$ python3 searx/webapp.py

If nothing went wrong, you should be able to see your searx page in the browser (searx.example.com).

UPGRADE TO HTTPS!

Issue a cert with acme.sh

Privacy is always the No. 1 priority! Now we upgrade the site to https. Luckily there is a very handy tool:

https://github.com/acmesh-official/acme.sh

Follow its instructions to issue a certificate for your (sub)domain, or follow the steps below (ONLY FOR NGINX).

curl  https://get.acme.sh | sh
acme.sh --issue  -d searx.example.com   --nginx

sudo mkdir -p /etc/nginx/certs/searx

sudo acme.sh --installcert -d searx.example.com \
--key-file       /etc/nginx/certs/searx/key.pem  \
--fullchain-file /etc/nginx/certs/searx/cert.pem \
--reloadcmd     "service nginx force-reload"

Modify Nginx config

Now change your nginx config file to this:

server {
    listen 80;
    server_name searx.example.com;
    return 302 https://$server_name$request_uri;
}

server {
    listen 443 ssl;
    server_name searx.example.com;
    ssl_certificate /etc/nginx/certs/searx/cert.pem;
    ssl_certificate_key /etc/nginx/certs/searx/key.pem;
    ssl_protocols TLSv1.2;  # TLS 1.0 and 1.1 are deprecated (RFC 8996)
    ssl_ciphers 'AES128+EECDH:AES128+EDH';
    ssl_prefer_server_ciphers on;

    access_log  off;
    error_log   /dev/null crit;  # "error_log off" would log to a file named "off"

    location /static {
        alias /usr/local/searx/searx/static;
    }

    location / {
        proxy_pass http://127.0.0.1:8888;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Scheme $scheme;
        proxy_set_header X-Script-Name /searx;
        proxy_buffering off;
    }
}

Then use ‘service nginx reload’ to reload nginx, and run searx with the commands above again. Now your searx site should be encrypted with https!

NOTES:

Searx can search for files, and the results are mostly magnet links. However, downloading copyrighted material via torrents is illegal in some countries, such as Germany. You may want to disable file search to avoid getting yourself in trouble.

You can use nohup to run searx in the background, e.g.

nohup python3 searx/webapp.py >/dev/null 2>&1 &
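
One drawback of a bare nohup is that you have to hunt for the process when you want to stop it again. A small sketch that records the PID in a file; start_bg and stop_bg are made-up helper names and the pid file path is just an example:

```shell
# Start a command in the background with nohup and remember its PID.
# $1 = pid file, remaining arguments = command to run
start_bg() {
    pidfile=$1
    shift
    nohup "$@" >/dev/null 2>&1 &
    echo $! > "$pidfile"
}

# Stop the command started by start_bg and remove the pid file.
stop_bg() {
    [ -f "$1" ] || return 1
    kill "$(cat "$1")"
    rm -f "$1"
}

# Example, run from /usr/local/searx with the virtualenv activated:
# start_bg /tmp/searx.pid python3 searx/webapp.py
# stop_bg /tmp/searx.pid
```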

Don’t forget to switch to the virtual environment each time before you start up searx’s webapp.py:

$ virtualenv --python=/usr/bin/python3 searx-ve
$ . ./searx-ve/bin/activate

Alternatively, you may want to run it as a system service with systemctl. See This
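
A unit file for that could look roughly like the one this sketch prints. The unit name searx.service and the /etc/systemd/system/ location are assumptions, and the paths match the install above. Calling the virtualenv’s own python directly means no activation step is needed:

```shell
# Print a minimal systemd unit for searx (a sketch, adjust paths to your install).
searx_unit() {
    cat <<'EOF'
[Unit]
Description=searx metasearch instance
After=network.target

[Service]
User=searx
WorkingDirectory=/usr/local/searx
ExecStart=/usr/local/searx/searx-ve/bin/python searx/webapp.py
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Example (needs root, so it is left commented out):
# searx_unit | sudo tee /etc/systemd/system/searx.service
# sudo systemctl daemon-reload
# sudo systemctl enable --now searx
```

With ‘Restart=on-failure’ systemd brings searx back up if it crashes, which nohup will not do for you.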