KWOC is a 5-week online program for students who are new to open source software development. The program serves as a platform that helps students take their first steps in open source and start contributing to projects. It also prepares students for other open source programs such as Google Summer of Code.

Getting started

After completing my course on machine learning in November, I was looking to contribute to the open source community. My seniors had earlier suggested KWOC to me, so I made a quick Google search and found that it was going to start soon. I was pretty excited to take part in KWOC, as it was like a practice session for Google Summer of Code.

Choosing the Projects

I wanted to work mainly on projects that used Python and machine learning or deep learning algorithms and frameworks, so I stuck to those. I went through projects like Stock Market Forecasting, Benji, Sangita, Merkalysis, Ball Sacker, Artemis Arrow and Imagery, and started working on them. In the long run I decided to continue with Stock Market Forecasting.

Work/Challenges

1. STOCK MARKET FORECASTING

The main reason I stuck with this project was that I liked my mentor’s approach. He had introduced some bugs (partial updates) in the codebase, so that anyone trying to understand the code would also have to update and debug it with their own logic. This made learning the codebase a lot of fun.

At the very beginning, I updated the project’s web crawlers to use Python’s urllib3 library and fixed some minor bugs. I also learned how easily web scraping can be done with Beautiful Soup (a Python library).
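
To give an idea of what such a crawler looks like, here is a minimal sketch, not the project’s actual code. The function names, the choice of the h2 tag and the example URL are all assumptions made for illustration; a real crawler would use whatever selectors match the news site being scraped.

```python
import urllib3
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10.0):
    """Download a page with urllib3 and return its body as text."""
    http = urllib3.PoolManager()
    response = http.request("GET", url, timeout=timeout)
    # response.data holds the raw bytes of the body.
    return response.data.decode("utf-8", errors="replace")


def extract_headlines(html, tag="h2"):
    """Return the stripped text of every `tag` element in the page.

    The tag name is a hypothetical choice; adjust it to the target site.
    """
    soup = BeautifulSoup(html, "html.parser")
    return [node.get_text(strip=True) for node in soup.find_all(tag)]


if __name__ == "__main__":
    # Hypothetical URL, used only to show how the two functions fit together.
    page = fetch_html("https://example.com/")
    for headline in extract_headlines(page):
        print(headline)
```

Keeping the download and the parsing in separate functions makes the parsing part easy to check against a saved HTML string without touching the network.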

Afterwards, around the mid evaluation, I worked on enhancing the preprocessing steps of the training pipeline. I added a module that lets us use pretrained word vectors to build word embeddings for our vocabulary. Using it, I generated word vectors for the project vocabulary from GloVe and stored them separately.
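
The idea can be sketched roughly as follows. This is not the module from the pull request: the function names, the tiny three-dimensional vectors and the convention of reserving row 0 for padding are assumptions for illustration. The only thing taken as given is the GloVe text format, where each line is a word followed by its space-separated vector components.

```python
import numpy as np


def load_glove(lines):
    """Parse GloVe-style text lines into a {word: vector} dictionary."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return vectors


def build_embedding_matrix(vocab, vectors, dim):
    """Build a (len(vocab) + 1, dim) matrix of pretrained vectors.

    `vocab` maps each word to an index starting at 1; row 0 is left as
    zeros for padding (an assumed convention). Words missing from the
    pretrained vectors also keep an all-zero row.
    """
    matrix = np.zeros((len(vocab) + 1, dim), dtype="float32")
    for word, index in vocab.items():
        vector = vectors.get(word)
        if vector is not None:
            matrix[index] = vector
    return matrix


if __name__ == "__main__":
    # Made-up 3-dimensional vectors standing in for a real GloVe file.
    sample = ["stock 0.1 0.2 0.3", "market 0.4 0.5 0.6"]
    vocab = {"stock": 1, "crash": 2}
    matrix = build_embedding_matrix(vocab, load_glove(sample), dim=3)
    print(matrix.shape)
```

With a real GloVe file, the lines would come from an open file handle instead of a list, and the resulting matrix could be stored separately with np.save so that it does not have to be rebuilt on every training run.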

Finally, I started working on enhancing the deep learning model. I trained roughly 50-80 different models with tensorflow-gpu on a training set containing 100k news items, and learned various techniques for improving a model’s accuracy along the way. After a lot of experiments, I found that a stacked Bi-GRU model worked best for this task. I achieved an accuracy of 96.50% with my model.
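
For readers who have not seen one, a stacked Bi-GRU can be sketched in Keras roughly like this. It is only an illustration under assumptions: the layer sizes, sequence length, number of classes and the choice of a softmax classifier are all made up here and are not the configuration of the final model in the project.

```python
import tensorflow as tf


def build_stacked_bigru(vocab_size, embed_dim=100, seq_len=50,
                        units=64, num_classes=2):
    """Build a two-layer bidirectional GRU text classifier.

    All hyperparameters are illustrative assumptions.
    """
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(seq_len,), dtype="int32"),
        # Maps word indices to dense vectors; pretrained vectors
        # could be loaded into this layer instead of learning them.
        tf.keras.layers.Embedding(vocab_size, embed_dim),
        # First Bi-GRU returns the full sequence so that a second
        # recurrent layer can be stacked on top of it.
        tf.keras.layers.Bidirectional(
            tf.keras.layers.GRU(units, return_sequences=True)),
        # Second Bi-GRU returns only its final state.
        tf.keras.layers.Bidirectional(tf.keras.layers.GRU(units)),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


if __name__ == "__main__":
    model = build_stacked_bigru(vocab_size=1000)
    model.summary()
```

The "stacked" part is the first layer’s return_sequences=True: it passes one output per time step to the second Bi-GRU, rather than a single summary vector.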

Throughout the journey, I was learning something new every day.

List of Pull Requests
#2 : Updated one web crawler
#5 : Updated all web crawlers
#7 : Made word embeddings from the vocabulary
#10 : Updated readme.md and added the final model

Verdict

I learned a lot of new things while contributing to these projects and got to work with new libraries and technologies in machine learning. I was learning at a rapid pace. As a newbie, I also became well acquainted with version control tools like Git and GitHub. I am highly grateful to my mentors for guiding me through the problems I faced. I would also like to thank Kharagpur Open Source Society for hosting this program, which gives students across various colleges the opportunity to work on such fascinating projects.

Toshal Agrawal

https://github.com/walragatver