Divyaksh Shukla Blog

Posts

Showing posts from 2018

Divyaksh Expense Manager

May 27, 2018

The expense manager apps on the Play Store are either paid, rely on in-app purchases, or carry so many advertisements that they get a little annoying. I wanted to make an app that is as simple as the other apps on the Play Store but customised to my own usage. Many of the apps on the Play Store are either very limited or too bloated (they could have just made tabbed screens 😠). Anyway, I wanted an app that could note down the expenses I made and also separate them by mode of payment (cards, cash or online wallets), a feature that is missing in the apps available today. Storing a snapshot of the bill is an added bonus. But in my house the whole month's groceries are bought at once from the supermarket, so the bill is quite huge (it's an extremely long piece of paper) and cramming it into one picture would be difficult. So how about a video of the bill, or a panoramic shot of it (that would be a lot of work, even if it is possible)? I could also incorporate a message filter tha...

Raspberry Pi stock data collection

May 08, 2018

I made a stock data collection program on my Raspberry Pi that scrapes a website to give me stock updates every 5 minutes. All the details are stored in the file system, and everything is managed solely by a Python program and crontab. The script is written in Python 3 and tracks prices on the BSE (Bombay Stock Exchange) in India. The whole script is available on my GitHub repository. Link: To run it, just place the script in a folder and add the cron entry as stated in the README.md:
python3 stock.py ^BSESN
The data is first extracted via BeautifulSoup, and the following tags are used:
title = ['h1', {'data-reactid': '7'}]
quote_header = ['div', {'id': 'quote-header-info'}]
present_price = ['span', {'class': 'Trsdu(0.3s)'}]
opening_price = ['td', {'data-test': 'OPEN-value'}]
previous_close_price = ['td', {'data-test': 'PREV_CLOSE-value'}]
day_range_pric...
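To make the tag list above concrete, here is a minimal sketch of how a [tag, attributes] pair can be matched against a page. It is not the script from the repository. It uses the standard library's html.parser in place of BeautifulSoup, purely so that it runs without extra packages; with BeautifulSoup each lookup would be a single soup.find(tag, attrs) call. The names SELECTORS, FirstMatch, extract and SAMPLE are hypothetical, and the sample HTML and its numbers are invented for illustration.

```python
from html.parser import HTMLParser

# [tag, attributes] pairs copied from the post; the dict itself is hypothetical.
SELECTORS = {
    "present_price": ["span", {"class": "Trsdu(0.3s)"}],
    "opening_price": ["td", {"data-test": "OPEN-value"}],
    "previous_close_price": ["td", {"data-test": "PREV_CLOSE-value"}],
}


class FirstMatch(HTMLParser):
    """Collect the text of the first element matching a tag and attributes."""

    def __init__(self, tag, attrs):
        super().__init__()
        self.tag = tag
        self.wanted = attrs
        self.depth = 0      # greater than 0 while inside the matched element
        self.done = False   # True once the first match has been closed
        self.parts = []

    def _matches(self, found):
        for key, value in self.wanted.items():
            actual = found.get(key) or ""
            if key == "class":
                # class can hold several space-separated names
                if value not in actual.split():
                    return False
            elif actual != value:
                return False
        return True

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            # count nested elements of the same tag so the right end tag closes us
            if tag == self.tag:
                self.depth += 1
        elif tag == self.tag and self._matches(dict(attrs)):
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract(html, selectors=SELECTORS):
    """Return {field name: text or None} for every selector."""
    result = {}
    for name, (tag, attrs) in selectors.items():
        parser = FirstMatch(tag, attrs)
        parser.feed(html)
        result[name] = "".join(parser.parts).strip() or None
    return result


# Invented sample markup; real pages are larger and change over time.
SAMPLE = """
<div id="quote-header-info">
  <span class="Trsdu(0.3s) Fw(b)">100.50</span>
</div>
<table>
  <tr><td data-test="OPEN-value">99.00</td></tr>
  <tr><td data-test="PREV_CLOSE-value">98.75</td></tr>
</table>
"""

if __name__ == "__main__":
    print(extract(SAMPLE))
```

Selectors tied to generated class names such as Trsdu(0.3s) tend to break when the site changes its markup, so returning None for a missing field (rather than raising) lets a cron job keep running and makes the gap visible in the stored data.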

NAS (National Achievement Survey) data extraction

May 08, 2018

I recently attended a datathon (a hackathon focused on data science) at PES University, Bangalore. There my team was given the task of extracting data from the National Achievement Survey 2017 conducted by NCERT. NAS covers CBSE schools across the states and districts of India and gathers data on student achievement and their overall reports. This data is available as PDF files, and we were tasked with extracting it from the PDFs and tabulating it. pdftotext is a Linux utility that converts PDF to text. With the -layout option, the original layout of the data is mostly preserved. I wrote a Python script (pdf_convert.py) to convert the PDFs to text files one after another. Next I wrote a script to convert the text files to CSV, so that each text file became one record (row) in the CSV file. Here is a snapshot of the directory structure of the PDF files that we got:
.
├── Andaman & Nicobar Islands
│   ├── Andaman
│   │   ├── Andamans Class - ...
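The two steps described above (PDF to text with pdftotext -layout, then one CSV row per text file) can be sketched roughly as follows. This is not the original pdf_convert.py. It assumes the pdftotext command is installed and that files sit under State/District/ as in the tree above. The function names and the CSV columns are hypothetical, and line_count is only a placeholder for whatever fields the real script parsed out of each report.

```python
import csv
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path, txt_path):
    """Build the pdftotext call; -layout keeps the page's column layout."""
    return ["pdftotext", "-layout", str(pdf_path), str(txt_path)]


def convert_tree(root):
    """Convert every PDF under root, one after another, to a .txt beside it."""
    converted = []
    for pdf_path in sorted(Path(root).rglob("*.pdf")):
        txt_path = pdf_path.with_suffix(".txt")
        # check=True raises if pdftotext fails on a file
        subprocess.run(pdftotext_command(pdf_path, txt_path), check=True)
        converted.append(txt_path)
    return converted


def text_files_to_csv(root, csv_path):
    """Write one CSV row per text file; return the number of rows written."""
    root = Path(root)
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        # Hypothetical columns: the real fields depend on the report layout.
        writer.writerow(["state", "district", "file", "line_count"])
        for txt_path in sorted(root.rglob("*.txt")):
            parts = txt_path.relative_to(root).parts
            if len(parts) < 3:
                continue  # expect State/District/file.txt
            lines = txt_path.read_text(encoding="utf-8").splitlines()
            writer.writerow([parts[0], parts[1], txt_path.name, len(lines)])
            count += 1
    return count
```

Taking the state and district from the folder names rather than from the extracted text keeps those two columns reliable even when the layout of an individual PDF confuses the text parsing.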