Hi everyone,
I'm working on ahmia.fi, the hidden service search engine, and you're
reading status update #3.
Since last time, I've been moving forward with the Django "rewrite".
A little context: the current main goal of the project is to regroup
all data related to search into our Elasticsearch index (site content,
tor2web stats) so we can use it to give better search results.
Since I'm focusing on the Django part, I'm removing a lot of search
logic which is now done by Elasticsearch. For instance, banned websites
will be stored in our main index and Elasticsearch will filter them out
before returning search results. This differs from the current
behaviour, where banned websites are stored in a PostgreSQL database and
filtered at the Django level.
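To illustrate the idea, here is a minimal sketch of a query body that
lets Elasticsearch exclude banned sites itself. The field names
("content", "is_banned") and the function name are assumptions made for
this example, not the actual Ahmia mapping or code.

```python
def build_search_query(user_query, page_size=10):
    """Return an Elasticsearch query body that skips banned sites.

    Field names are hypothetical; adapt them to the real index mapping.
    """
    return {
        "size": page_size,
        "query": {
            "bool": {
                # Full-text match on the page content (assumed field).
                "must": [{"match": {"content": user_query}}],
                # Drop documents flagged as banned (assumed field).
                "must_not": [{"term": {"is_banned": True}}],
            }
        },
    }


body = build_search_query("wiki", page_size=5)
```

The resulting dict would be passed as the request body of a search
call, so Django no longer needs to post-filter the results.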
I'm also taking advantage of this rewrite to accomplish the following
goals:
- Compartmentalize what should be in apps: I plan to create three apps:
search, stats and api
- Remove deprecated code/files
- Remove every linter error/warning: I worked on this one, but some
warnings could not be removed without a rewrite.
- Update the Django version: we are using Django 1.7, which is no
longer supported.
- Use a more coherent URL scheme (while keeping backward compatibility)
- Use a more coherent API behaviour (also while keeping backward
compatibility)
- Write more tests to verify correctness
After this rewrite, we will have three apps in our Django website:
Search will focus on pages related to search results, querying our
index to get results.
It will later be improved with !bang syntax or some other keywords to
specify a date range, a specific .onion website, etc.
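As a rough sketch of what keyword parsing could look like: the
"key:value" syntax and the keyword names (site, after, before) below
are purely hypothetical, since the final syntax has not been decided.

```python
from datetime import date

# Hypothetical keywords; the real syntax is still to be defined.
KNOWN_KEYWORDS = {"site", "after", "before"}


def parse_query(raw):
    """Split a raw query into free-text terms and keyword filters."""
    terms = []
    filters = {}
    for token in raw.split():
        key, sep, value = token.partition(":")
        if sep and key in KNOWN_KEYWORDS and value:
            if key in ("after", "before"):
                # Expects YYYY-MM-DD; raises ValueError otherwise.
                value = date.fromisoformat(value)
            filters[key] = value
        else:
            # Anything else is treated as a plain search term.
            terms.append(token)
    return " ".join(terms), filters


text, filters = parse_query("bitcoin site:example.onion after:2016-01-01")
```

Here text holds the plain search terms and filters holds the parsed
keywords, which could then be turned into index filters.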
Stats will focus on statistics visualization (most popular .onion
sites, etc.). It could later be improved with an "Ahmia trend"
interface to view searches over time.
Api will focus on the API part of the website. I have a couple of ideas
but will discuss them in a later report.
During the next two weeks, I will continue my work on this rewrite.
See you in two weeks :)
Ismael