Technology Watcher (Internship side quest)
Why
During my end-of-studies internship, I realized my technology watch was not efficient enough, so I decided to build a bot that sends me updates on my research field.
User POV
To aggregate all the information, I put everything in my Glance app. It’s beautiful, easy to use and set up, and works with RSS, YouTube, Reddit, and more.
I chose it so that I can plug in more bots for other subjects one day, but for now I haven’t made any others.
System overview
My system is a very basic script that can be summarized like this:
result = []
result += search_on_google()
result += search_on_researchgate()
result = remove_known_page(result)  # drop pages already reported
build_rss_feed(result)
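To make the last two steps more concrete, here is a minimal sketch of what they could look like using only the Python standard library. It is not my actual script: the JSON file of seen URLs, the `{"title", "url"}` result format, the feed title and the placeholder channel link are all assumptions made for the example.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path


def remove_known_pages(results, seen_path):
    """Drop results whose URL was already reported, then remember the new ones."""
    path = Path(seen_path)
    seen = set(json.loads(path.read_text())) if path.exists() else set()
    fresh = []
    for item in results:
        if item["url"] not in seen:
            seen.add(item["url"])  # also removes duplicates within one run
            fresh.append(item)
    path.write_text(json.dumps(sorted(seen)))
    return fresh


def build_rss_feed(results, title="Technology Watcher"):
    """Serialize results as a minimal RSS 2.0 document, returned as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = "http://localhost/"  # placeholder link
    ET.SubElement(channel, "description").text = "New pages found by the watcher"
    for item in results:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["url"]
        ET.SubElement(node, "guid").text = item["url"]
    return ET.tostring(rss, encoding="unicode")
```

Keeping the seen URLs in a file means each run only publishes pages that were never in the feed before, which is what keeps the feed readable in an aggregator.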
The trickiest part is that I need to build a custom scraper for each website I use.
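As an illustration of what "one scraper per site" can mean, here is a small sketch built on the standard `html.parser` module. The site names and CSS class names (`result-title`, `publication-link`) are made up for the example, and a real site usually needs more than a class name to be scraped correctly.

```python
from html.parser import HTMLParser


class LinkScraper(HTMLParser):
    """Collect {"title", "url"} dicts from <a> tags carrying a given CSS class."""

    def __init__(self, link_class):
        super().__init__()
        self.link_class = link_class
        self.results = []
        self._href = None  # href of the <a> currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and self.link_class in classes and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = " ".join("".join(self._text).split())  # normalize whitespace
            self.results.append({"title": title, "url": self._href})
            self._href = None


def scrape(html, link_class):
    """Run a LinkScraper over an HTML string and return what it found."""
    parser = LinkScraper(link_class)
    parser.feed(html)
    parser.close()
    return parser.results


# Hypothetical registry: one entry per site, each with its own extraction rule.
SCRAPERS = {
    "site_a": lambda html: scrape(html, "result-title"),
    "site_b": lambda html: scrape(html, "publication-link"),
}
```

With a registry like this, the main script only has to loop over `SCRAPERS` and does not care how each site is parsed.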
Upgrades
I tried to work on a filtering and fine-tuning system. The goal was for my search dork to adapt to new trends, and for the algorithm to stop showing me totally irrelevant stuff.
A basic TF-IDF with basic filtering performed well on this task, but sometimes I had to add ban words: terms that are very frequent but not relevant.
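For reference, the idea can be sketched in a few lines of pure Python. This is a simplified illustration and not my exact code: the smoothed IDF formula, the cosine similarity against a "profile" text describing the field, and the rule dropping tokens of one or two letters are assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text, ban_words):
    """Lowercase, keep alphabetic tokens, drop banned or very short ones."""
    return [t for t in re.findall(r"[a-z]+", text.lower())
            if len(t) > 2 and t not in ban_words]


def tfidf_vectors(docs, ban_words=frozenset()):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(d, ban_words) for d in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({
            t: (c / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank(profile, candidates, ban_words=frozenset()):
    """Sort candidate texts by similarity to the profile, best first."""
    vectors = tfidf_vectors([profile] + list(candidates), ban_words)
    scores = [cosine(vectors[0], v) for v in vectors[1:]]
    return sorted(zip(scores, candidates), reverse=True)
```

The ban words are removed before any weight is computed, so a frequent but meaningless word can no longer make an unrelated page look similar to the profile.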
And because I wanted an easy-to-use system, I needed a web view to select bad and good words. But I hate web development, I’m bad at it, and I need more time to work on my real research, so I just left the idea in the back of my mind.
Conclusion
As is, the system works, but there are too many useless results. I may need to add automated filtering if I want to publish it on GitHub, but just for me it’s good enough.
Also, sometimes I find very interesting stuff even if it’s not in my field.