Project Description
During the training period I was involved with 2 live projects, One project named ‘Stocktwits Data Structurization’ in which I have to process huge JSON Data which was already obtained the size of data was nearly 5 GB need to process the data by chunking with chunk size = 20000 rows at a time. The file has nested JSON data within it’s attributes so abstracts data from the nested columns into a new dataframe. Completed handling complex nested json formed columns abstracted from nested json. Then need to Handle the missing data by mapping it with another index dataset further missing values for certain attributes were handled by mean value and 0 substitution. This task involves numerous pandas operations along with multiple python functions. Further done Exploratory Data Analysis on the cleaned dataset finding correlation matrix and plotting certain seaborn graphs between strong correlated attributes.
Ещё видео!